

Array 18 (2023) 100291

Contents lists available at ScienceDirect

Array
journal homepage: www.elsevier.com/locate/array

AIDA: Artificial intelligence based depression assessment applied to Bangladeshi students
Rokeya Siddiqua, Nusrat Islam, Jarba Farnaz Bolaka, Riasat Khan, Sifat Momen ∗
Department of Electrical and Computer Engineering, North South University, Dhaka, 1229, Bangladesh

ARTICLE INFO

Keywords: Depression assessment, Machine learning, Voting algorithm, Explainable AI

ABSTRACT

Depression is a common psychiatric disorder that is becoming more prevalent in developing countries like Bangladesh. Depression has been found to be prevalent among youths and influences a person’s lifestyle and thought process. Unfortunately, due to the public and social stigma attached to this disease, the mental health issues of individuals are often overlooked. Early diagnosis of patients who may have depression often helps to provide effective treatment. This research aims to develop mechanisms to detect and predict depression levels and was applied to university students in Bangladesh. In this work, a questionnaire containing 106 questions has been constructed. The questions in the questionnaire are primarily of two kinds: (i) personal, and (ii) clinical. The questionnaire was distributed amongst Bangladeshi students and a total of 684 responses were obtained from participants aged between 19 and 35. Participants took the survey after giving appropriate consent. After carefully scrutinizing the responses, 520 samples were taken into final consideration. A hybrid depression assessment scale was developed using a voting algorithm that employs eight well-known existing scales to assess the depression level of an individual. This hybrid scale was then applied to the collected samples that comprise personal information and questions from various familiar depression measuring scales. In addition, ten machine learning and two deep learning models were applied to predict the three classes of depression (normal, moderate and extreme). Five hyperparameter optimizers and nine feature selection methods were employed to improve the predictability. Accuracies of 98.08%, 94.23%, and 92.31% were obtained using Random Forest, Gradient Boosting, and CNN models, respectively. Random Forest achieved the fewest false negatives and the highest F-measure with its optimized hyperparameters. Finally, LIME, an explainable AI framework, was applied to interpret and retrace the prediction output of the machine learning models.

1. Introduction

Diagnosing depression is an extremely tricky and sensitive business in psychiatry. Depression, frequently referred to as the "common cold" of psychopathology, is the most well-known psychiatric condition, with a history that predates the study of psychiatry itself [1]. It is a complicated medical illness that affects patients to varying degrees. Depressed patients exhibit a variety of symptoms that may impair their ability to think, sleep, socialize, and even carry out routine tasks. According to the World Health Organization (WHO), one in every four individuals suffers from mental health disorders worldwide [2], with over 280 million people of different ages suffering from depression [3]. In [4], it has been reported that the prevalence of mental diseases in Bangladesh ranges from 6.5 to 31.0% among adults and from 13.4 to 22.9% among children. A recent study claims that more than 7 million individuals in Bangladesh, predominantly in urban and metropolitan areas, suffer from depression and anxiety [5]. Depression has been found to be highly prevalent among college and university students [6,7]. One survey reported that 24% of students suffer from both anxiety and depression [8]. The COVID-19 pandemic contributed to a further rise in the number of depression cases. According to one study, the COVID-19 pandemic has exacerbated depression among Bangladeshi students and has affected them in various ways, including increased sleep disorders and poor dietary habits [9]. The findings of these studies on Bangladeshi university students are concerning and require appropriate action. As a first step in that direction, this study attempts to understand the primary causes of depression among students at the tertiary level in Bangladesh and to build a mechanism to predict the severity of depression. A new scale for measuring depression levels has been developed, and machine learning techniques have been utilized to tackle this issue. Identifying depression at an early stage can help prevent the situation from worsening further.

∗ Corresponding author.
E-mail address: [email protected] (S. Momen).

https://doi.org/10.1016/j.array.2023.100291
Received 27 February 2023; Received in revised form 25 April 2023; Accepted 3 May 2023
Available online 10 May 2023
2590-0056/© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

This study followed various measures to assess and predict depression, including establishing an integrated set of questionnaires and a new scale for measuring depression, collecting a private dataset, and analyzing and evaluating data. Following the creation of the questionnaires, data were collected in two ways: (1) via a Google form and (2) via a printed form. In both situations, a consent form was provided on the first page, and individuals were required to sign it to participate in this work. Major contributions of this work are illustrated below:

• In order to investigate depression levels amongst Bangladeshi university students, a dataset has been collected using a questionnaire in both online and offline format.
• A new scale combining eight well-known existing scales has been created to measure depression as either normal, moderate or extreme. A voting technique has been proposed to combine the eight existing scales.
• Machine learning and deep learning algorithms have been applied to classify three categories of depression. Hyperparameter optimizers and feature selection methods are used to improve the performance of these models.
• LIME, an explainable AI tool, has been used to understand the final predictions provided by machine learning models.
• The created models were cross-validated using an independently collected dataset from a different time period. The results demonstrate that the performance on the cross-validated dataset is comparable to the performance on the test dataset, demonstrating the robustness of the models.

To the best of our knowledge, this is the first time a new integrated scale based on a voting algorithm has been introduced, and a dataset of Bangladeshi participants has been used to investigate and comprehend the level of depression among university students.

This article is organized as follows. Related works are addressed in Section 2. Section 3 discusses the methodology adopted in this work. Section 4 examines the acquired outcomes. Section 5 discusses how the outcomes achieved our research objectives. Section 6 discusses the limitations of this work and, finally, Section 7 includes a conclusion with remarks on our future work.

2. Related work

Psychiatric disorders, especially depression, are highly prevalent among young people, with potentially severe complications. Appropriate and early care for this disease is of paramount importance in dealing with its consequent impairing problems. However, this is not always possible because of the social stigma and misconceptions associated with mental health treatment [10]. Considerable work has been performed to detect depression using measuring scaling systems and artificial intelligence techniques. This section briefly discusses some of the notable works in this area.

2.1. Related work on depression prediction using machine learning

In recent years, with the immense growth of artificial intelligence and machine learning techniques, many studies have been performed to detect depression and anxiety-related mental disorders employing a wide range of private and public datasets. Nemesure and his team utilized ensemble-based machine learning approaches to predict the depression and anxiety of undergraduate students [11]. More than four thousand students from the University of Nice Sophia-Antipolis participated in this study. The XGBoost model attained an AUC of 0.67 for the validation set. Lee and Kim applied machine learning frameworks to predict the depressive behavior of American adults [12]. They used the United States National Health and Nutrition Examination Survey (NHANES) dataset. NHANES conducted a PHQ-9 survey of approximately 8600 people for ten years. Boruta and LASSO feature selection approaches have been used in this work to select the dominant attributes. SVM, ANN and a wide range of ensemble models have been used to classify the depressed and non-depressed samples. SVM and ANN techniques achieved the highest accuracy and AUC of 77.1% and 0.813, respectively. Su et al. [13] proposed an automatic system to forecast the depression of the Chinese elderly population using the CLHLS survey dataset. A KNN-based imputation framework has been used to substitute the missing data samples in the preprocessing step. Gradient-Boosted Decision Tree achieved the overall best performance with 75.9% accuracy and 0.63 AUC.

Ryu and colleagues developed a machine learning-based depression forecasting system for stroke survivors [14]. The NIHSS survey and the Hamilton depression prediction index were performed on 623 individuals from a medical center in Korea. SVM and KNN reported accuracies of 77.5% and 73.3%, respectively. Haque et al. [15] developed an automated depression detection scheme for children using the Young Minds Matter dataset. The Boruta approach has been used to select the important features. The Decision Tree classifier produced 95% accuracy and 0.99 precision.

There has been considerable work on predicting depression among Bangladeshi individuals using locally collected datasets. For instance, Choudhury and his team used deep learning and five machine learning algorithms to predict depression among Bangladeshi undergraduate students based on some basic questionnaires [7]. Recursive Feature Elimination with Cross-Validation and Random Forest Classification was used for selecting salient features. A total of 65 questions were included in their dataset, including 7 basic information, 16 depression-related queries, 21 BDI and 21 DASS 21-BV (Bangla version) questions. After pre-processing, 577 participants were considered in this study.

Zulfiker and his colleagues [16] used six different machine learning classifiers and three distinct feature selection methods to predict depression and extract relevant features. Synthetic Minority Oversampling Technique (SMOTE) [17] and Burns Depression Checklist (BDC) were also used in this work. The AdaBoost technique with the SelectKBest feature selection approach provided the best performance with 0.9256 classification accuracy.

Ahmed and teammates used five machine learning classifiers on two datasets to predict depression and anxiety [18]. In addition, two well-known depression and anxiety measuring scales were used in this work. The CNN model attained the highest accuracies of 0.96 for anxiety and 0.968 for depression detection employing 45 epochs. In [5], the authors utilized a binary logistic model to predict depression for Bangladeshi university students. The authors have collected a private dataset from an online survey of 210 participants using the DASS-21 scale. According to this study, relationships with parents and friends, bedtime and dietary patterns, and family socioeconomic status are the primary factors of depression and anxiety. Moon et al. employed various machine learning and ensemble approaches to predict the depression of employed Bangladeshis [19].

2.2. Related work on depression measuring scaling system

In order to measure depression levels, clinicians develop a set of questions that are given to an individual under assessment. The questions typically contain options that the individual needs to select. Based on the responses provided, a score is given per question. The total score obtained is then used to assess the level of depression of an individual. Different scaling systems exist, which are used by psychiatrists to diagnose the level of depression one is suffering [20].

A new integrated scale has been developed in this work using a majority voting technique on eight existing depression assessment scales, i.e., BDI, HDRS, MADRS, EQ-5, PHQ-9, QIDS-SR, DASS-21 and K-10. These assessment scales have been successfully utilized in many works to detect depression and anxiety-associated mental disorders. Some of these recent articles have been briefly presented below.


Table 1
A comparative analysis of related works.

| Ref. | Subject | Sample size | Age range (years) | Data collection means | Advantage | Limitation |
|---|---|---|---|---|---|---|
| [11] | UG students | 4184 | 18–20 mostly | Electronic health records | Investigates depression and anxiety disorder | Low AUC score |
| [15] | Children and adolescents | 6310 | 4–17 | Australian children and adolescent survey | The dataset contains rich feature set | The samples are in the age range 4–17 |
| [13] | Adult Chinese | 1538 | 35–64 | Chinese Longitudinal Healthy Longevity Study | The survey data was collected over a longer time period | Low accuracy |
| [14] | Stroke survivors | 65 | 47–79 | Comparable-aged stroke patients with 4-weeks screening were polled | Various cognitive and functional analysis were performed | Dataset with low sample size |
| [12] | Adults with hypertension | 8628 | ≥40 | PHQ-9 survey | The survey data was collected over a comparatively longer time period | Low accuracy |
| [7] | UG students | 577 | – | Beck Depression Scale and DASS-21 | Multiple depression scales were used | Low accuracy |
| [16] | Bangladeshi participants | 604 | 16 and above | Burns Depression Checklist (BDC) | Age distribution was wide | Depression level has not been predicted. |
| [18] | Bangladeshi women | – | 15–35 | Lifestyle related questions. | Multiple levels of depression and anxiety were measured. | Limited details on dataset distribution. |

Ustun [21] used the Beck Depression Inventory scale to determine the depression level among the citizens of Turkey during COVID-19. A custom dataset with 1115 samples was collected through Google Forms over a span of ten days. Among the survey participants, 47%, 25.7%, 22.3%, and 5% were found to have minimal, mild, moderate, and severe depression, respectively. Saha and his colleagues [22] applied the Kessler K-10 metric to determine the level of psychological distress among Bangladeshi undergraduate students, particularly from Dhaka city. The authors collected 180 records using an online survey over a span of two weeks. Their dataset contained 28 features comprising coronavirus-related stress factors. This study discovered that almost 40% and 30.56% of students struggle with mild and moderate psychological distress, respectively. The Hamilton Scale for Anxiety (HAMA), Hamilton depression rating scale (HDRS), and Beck’s Suicide Intent Scale were used in an observational study [23] with a set of predetermined questionnaires for Indian students. This study predicted anxiety, despair, and suicide intent among undergraduate dental students, as well as identified various stressors. Extended study hours, exhaustive workload, frequency of tests, competition among peers, and fear of failing were revealed to be statistically significant stressors.

Ibrahim et al. used Zagazig Depression Scale (ZDS) to measure the existence of psychological illness in a group of Egyptian undergraduate students [24]. In this work, ZDS, an individually-assessed Arabic language interpretation of the Hamilton Rating Scale, was used to determine the pervasiveness of mental disorder symptoms. Participants revealed an average ZDS coefficient of approximately 18 and a maximum of 20. These ZDS scores indicate that more than 71% of the survey participants suffer from mild depression. Guo [25] used HAMD-17, CES-D, and WHOQOL-BREF evaluation metrics on undergraduate students with SSD to assess the impacts of electroacupuncture and cognitive behavioral therapy, or their combined effects, on mental disorders.

Ozawa conducted cross-sectional research on Japanese outpatients with ICD-10-defined depressive disorder [26]. Montgomery-Asberg Depression Rating Scale (MADRS) was used in this study with 100 depressed outpatients and 36 healthy family members. In [27], the authors studied a significant number of people with abrupt mood swings and subsyndromal depression symptoms. The integrated characteristics specifier was described as a score of 1 to 3 on selected items of the Montgomery Asberg Depression Rating Scale (MADRS) or Hamilton Depression Rating Scale (HAMD-17). According to Burchert and his colleagues [28], users of smartphone health applications are eager to analyze everyday depression symptoms. The Patient Health Questionnaire (PHQ-9) was used for analyzing short-term mood dynamics.

Wang and teammates [29] aimed to create a mapping framework that connects the acromegaly quality of life (AcroQoL) assessment survey to the EQ-5D-5L survey to provide a preference-based score that could be applied to study the socioeconomic assessment. For this study, 424 adult individuals had an average EQ-5D-5L coefficient score of approximately 0.80 with a 0.15 standard deviation. Mergen and colleagues evaluated the validity and accuracy of the Quick Inventory of Depressive Symptomatology (QIDS-SR16) and its American lexicon version using the Turkish student participants [30]. Lu et al. evaluated gender-based assessment invariance of the DASS-21 scale [31]. Among 13,208 students from five different cities, 4985 men and 8223 women participated in this study. The average indices for the DASS-21 depression, anxiety, and stress sub-scales were 2.17, 2.48, and 3.69, respectively.

Table 1 provides a summary comparison of related works. The following observations have been made after careful inspection of the related work: (i) there is an absence of using a hybrid scale to measure the level of depression, (ii) predictive models were typically applied on single and not on multi-scales, and (iii) most of the works have not produced an explanation of the prediction.

3. Methodology and model architecture

Fig. 1 shows the primary steps taken in this research. Following the creation of a questionnaire, it is distributed amongst Bangladeshi students and subsequently, a dataset is collected. The dataset is preprocessed so that it is in a format conducive to machine learning algorithms to operate on. The dataset is then organized into two forms: (i) dataset 1, which comprises 70 personal/basic questions of the respondents and (ii) dataset 2, which contains all the questions (106 questions), i.e., the personal questions as well as the clinical questions.

Fig. 1. Methodology of the work.

3.1. Questionnaires creation

One of the significant contributions of this work is to create a dataset of integrated depression assessment scales. The dataset is composed of two kinds of questions: (i) Personal questions and (ii) Clinical questions. After careful inspection of the literature, a total of nine important areas were identified, which subsequently led to 70 personal/basic questions. Fig. 2 shows a mind map of the nine major areas and relevant parameters. An additional 36 questions have been created from various well-established depression assessment scales: BDI, HDRS, MADRS, EQ-5, PHQ-9, QIDS-SR, DASS-21 and K-10. Three random Bangladeshi undergraduate students (who had not seen those questionnaires before) were asked to take an initial survey to determine whether the selected questions were understandable to them. The initial survey took 20–25 min to answer all questions. The volunteers found several HDRS depression assessment scale questions confusing, as well as three questions from the personal questionnaire. Consequently, these HDRS questions were translated into the Bangla native language. Finally, the evaluation questionnaire was established.

Fig. 2. Mind map of the personal questionnaires. The blue boxes show the nine major areas that were identified. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3.2. Data collection

The survey was conducted online using Google Forms and offline (printed copy) over about nine and a half weeks, i.e., from October 23, 2021, to December 28, 2021. A consent agreement document to participate in the survey was attached on the first page. Participants were given a brief idea about the study and their voluntary participation in the consent form. It was clearly stated that no traceable personal information would be collected and participants could stop taking the survey and withdraw at any time. A total of 684 records were collected; among those, 312 were offline and 372 were online submissions. 335 males and 349 females aged between 19 and 35 participated in the study.

3.3. Dataset preprocessing

This section describes the data cleaning, organization, visualization and, finally, the handling of the questions.

3.3.1. Data cleaning

Some of the 684 collected records were incomplete, so they were removed from the final dataset. Only 148 of 312 offline submissions were considered after this removal process. Duplicate data checking and feature renaming were then conducted. Label and one-hot encodings were applied to provide a numerical representation of the categorical features. For example, ‘Undergraduate’ and ‘Postgraduate’ were set to 1 and 2, respectively, in the feature named ‘degree.’ Null-valued entries were replaced with their corresponding mean values. A min–max scaler was used to normalize features so that all numerical features have an acceptable range.

3.3.2. Selecting questions from familiar depression assessment tools

This work used eight depression measurement scales, i.e., BDI, HDRS, QIDS, MADRS, EQ5D, DASS21, PHQ-9 and K10. The selected questions from 5 scales (i.e., BDI, HDRS, QIDS, EQ5D, and DASS21) were taken verbatim. For the other three scales, if the questions overlap with the selected questions from the five scales, then they were not included; thus, we avoid duplication of questions.

To create each category, values were assigned to each question for that particular scale and scores were calculated by summing up those values. The scales have a unique range for measuring depression levels, primarily containing 4 to 6 classes (except EQ5D, which contains 2 classes). Finally, the range of the scales was normalized and expressed using three common depression classification levels, i.e., Normal, Moderate and Extreme, to maintain consistency and achieve better results.
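
The cleaning steps named in Section 3.3.1 (label encoding, mean imputation of null entries, and min–max scaling) can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names, the sample values, and the choice of [0, 1] as the target range are assumptions made for illustration; only the mapping of ‘Undergraduate’ and ‘Postgraduate’ to 1 and 2 comes from the text.

```python
from statistics import mean

# Mapping stated in the paper for the 'degree' feature.
DEGREE_CODES = {"Undergraduate": 1, "Postgraduate": 2}


def label_encode(values, codes):
    """Map category strings to integers; missing entries (None) stay None."""
    return [codes[v] if v is not None else None for v in values]


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale linearly to [0, 1] (assumed range); a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical records, used only to exercise the helpers.
degree = label_encode(["Undergraduate", "Postgraduate", "Undergraduate"], DEGREE_CODES)
age = impute_mean([19, None, 35, 24])  # one null entry
age_scaled = min_max_scale(age)
```

In this toy column the missing age is filled with the mean of the three observed values, after which the smallest and largest ages map to 0 and 1.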


Table 2
Synopsis of the eight depression measuring scales used in this study.

| Scale | Levels of depression based on total score | No. of questions | Depression categories of new scale |
|---|---|---|---|
| Beck Depression Inventory (BDI) | Normal (1–10); Mild mood disturbance (11–16); Borderline clinical depression (17–20); Moderate depression (21–30); Severe depression (31–40); Extreme depression (over 40) | 21 | Normal (0–13); Moderate (14–27); Extreme (over 27) |
| Hamilton Depression Rating Scale (HDRS) | Normal (0–7); Mild depression (8–16); Moderate depression (17–23); Severe depression (over 23) | 17 | Normal (0–10); Moderate (11–22); Extreme (over 22) |
| The Quick Inventory of Depressive Symptomatology (QIDS-SR16) | Normal (0–5); Mild depression (6–10); Moderate depression (11–15); Severe depression (16–20); Very severe depression (21–27) | 16 | Normal (0–8); Moderate (9–17); Extreme (18–27) |
| Montgomery and Asberg Depression Rating Scale (MADRS) | Depressive symptoms absent (0–8); Mild (9–17); Moderate (18–34); Severe (35–60) | 10 | Normal (0–15); Moderate (16–30); Extreme (over 30) |
| EQ-5D | Worst (0); Best (100) | 5 | Normal (0–8); Extreme (over 8) |
| Depression, Anxiety and Stress Scale (DASS21) | Normal (0–9); Mild (10–13); Moderate (14–20); Severe (21–27); Extremely Severe (over 27) | 21 | Normal (0–13); Moderate (14–28); Extreme (over 28) |
| Patient Health Questionnaire (PHQ-9) | Minimal depression (1–4); Mild depression (5–9); Moderate depression (10–14); Moderately severe depression (15–19); Severe depression (20–27) | 9 | Normal (0–7); Moderate (8–16); Extreme (17–27) |
| Kessler Psychological Distress Scale (K10) | Normal (10–19); Mild disorder (20–24); Moderate disorder (25–29); Severe disorder (30–50) | 10 | Normal (0–22); Moderate (23–30); Extreme (31–50) |
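
As a reading aid for Table 2 and the voting procedure of Section 3.4, the following Python sketch maps a total score on each scale to one of the three new-scale categories and then takes a majority vote, resolving ties in favor of the lower depression level. The cut-offs are transcribed from the last column of Table 2. The function names and the example scores are hypothetical, and the sketch assumes that the total score for each scale has already been computed.

```python
from collections import Counter

# Ordered from lowest to highest severity; the order drives tie-breaking.
LEVELS = ["normal", "moderate", "extreme"]

# Inclusive upper bounds of the "normal" and "moderate" bands of the new scale
# (Table 2). EQ-5D has no moderate band, so both bounds coincide.
NEW_SCALE_BOUNDS = {
    "BDI": (13, 27),
    "HDRS": (10, 22),
    "QIDS": (8, 17),
    "MADRS": (15, 30),
    "EQ5D": (8, 8),
    "DASS21": (13, 28),
    "PHQ-9": (7, 16),
    "K10": (22, 30),
}


def level_for_scale(scale, total_score):
    """Return the new-scale category of one scale's total score."""
    normal_max, moderate_max = NEW_SCALE_BOUNDS[scale]
    if total_score <= normal_max:
        return "normal"
    if total_score <= moderate_max:
        return "moderate"
    return "extreme"


def vote(scores):
    """scores: dict of scale name -> total score. Returns the majority level."""
    counts = Counter(level_for_scale(s, v) for s, v in scores.items())
    best = max(counts.values())
    # The first level (in severity order) that reaches the top count is the
    # lowest tied level, which is the tie rule stated in Section 3.4.
    return next(level for level in LEVELS if counts[level] == best)


# Hypothetical record: four scales vote "normal" and four vote "moderate".
example = {"BDI": 10, "HDRS": 12, "QIDS": 5, "MADRS": 20,
           "EQ5D": 3, "DASS21": 15, "PHQ-9": 6, "K10": 25}
label = vote(example)  # the tie resolves to "normal"
```

Scanning the levels in severity order is one compact way to express the tie rule; it yields the same outcome as comparing the three frequency counters explicitly.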

Fig. 3. Proposed depression assessment questionnaire and scale development procedure.

3.4. New depression assessment scale creation

In this research, a new scale was created using the voting technique on eight different depression measuring scales, discussed in Table 2. The eight scales do not use the same set of questions. One of the critical motivations for combining the eight scales is that it increases the feature space, which facilitates analyzing depression on a much wider number of factors. This also improves the predictability of the machine learning algorithms. To determine the level of depression for a new record, the record is assessed against the eight depression scales. The most frequent label (normal, moderate, extreme) serves as the final depression level for this record. However, to handle equal votes, the lower depression level was taken into account. For example, if both ‘normal’ and ‘moderate’ levels obtain the maximum votes and are equal, then ‘normal’ was taken as the level of depression. Later, this new scale was used as a truth value while applying supervised machine learning techniques. The pseudocode of the voting algorithm is shown in Algorithm 1. Finally, according to the new scale, 197 students were labeled normal, 191 students were marked moderately depressed, and 132 students were found extremely depressed. The depression detection questionnaire and the scale creation procedures are illustrated in Fig. 3.

Algorithm 1 Voting algorithm to measure depression
procedure VotingAlgorithm(Record r)
    scales ← [BDI, HDRS, QIDS, MADRS, EQ5D, DASS21, PHQ-9, K10]
    normalFreq ← 0
    moderateFreq ← 0
    extremeFreq ← 0
    for scale in scales do
        depLevel ← depression level of Record r according to scale
        if depLevel == normal then
            normalFreq ← normalFreq + 1
        else if depLevel == moderate then
            moderateFreq ← moderateFreq + 1
        else
            extremeFreq ← extremeFreq + 1
        end if
    end for
    if normalFreq ≥ moderateFreq and normalFreq ≥ extremeFreq then
        depressionLevel ← normal
    else if moderateFreq > normalFreq and moderateFreq ≥ extremeFreq then
        depressionLevel ← moderate
    else
        depressionLevel ← extreme
    end if
    return depressionLevel
end procedure

3.5. Data visualization

Fig. 4 provides information on the percentage of depression levels and suicidal thoughts among male and female students of the collected depression dataset. According to this figure, women are more vulnerable to both depression and suicidal thoughts than men. Depression was found in about one-third (33.27%) of the total respondents, whereas the percentage for males was 28.84%. Fig. 4(b) shows the distribution of suicidal thoughts across gender. Of the respondents who have suicidal thoughts, 56.6% (120 out of 212) were females and 43.4% (92 out of 212) were males.

Fig. 4. Percentage distribution of respondent (a) depression level, (b) suicidal thoughts by gender.

Fig. 5 demonstrates the percentage of depression levels in those who have experienced emotional and sexual violence. According to Fig. 5, 71.23% and 37.73% of total participants admitted that they had faced emotional and sexual violence, respectively, which provokes depression in their lives.

• Fig. 5(a) shows that both the "Strongly Agree" and "Agree" cases contain a large number of extremely depressed students due to emotional violence, whereas the normal percentage (1.42%) is very low. Moreover, among the "Strongly Agree" instances, the extreme percentage (27.36%) is almost twice the moderate percentage (14.62%).
• Fig. 5(b) shows that a small portion of respondents have experienced sexual violence, but those who have been victims of sexual violence suffer from depression. According to Fig. 5(b), in both the "Strongly Agree" and "Agree" instances, no student was found without depression. On top of that, the extreme depression class percentage is higher than the moderate percentage for sexual violence.

Fig. 6 shows the proportion of depression levels of survey respondents who have financial hardship. According to Fig. 6, more than half (57.41%) of students were found to have families financially dependent on them due to their current financial difficulties and also to have a moderate to extreme level of depression.

According to Fig. 7(a), 323 out of 520 (62.11%) of regular smokers have moderate to severe depression. It is also worth noting that there is a correlation between a lack of physical activity and depression. According to Fig. 7(b), 67.76% of participants who have never or infrequently engaged in physical exercise have moderate to severe depression.

Empirical findings also show that families play a critical role in people’s lives, to the point where they feel pressured by family members. Participants were provided with six statements and asked to rate how much pressure they feel from their family members.

Fig. 8 indicates that the pressure towards career choice, higher studies and the lifestyle one adopts is slightly higher than the pressure one feels towards marriage. However, a significant number of respondents feel extremely stressed with their role as a sibling or a child.


Fig. 5. Percentage of depression of survey respondents who have experienced (a) emotional violence and (b) sexual violence.

Since most of the participants are students, one source of depression originates from academic performance. Therefore, in order to understand the perception of the respondents in terms of their academic performances, seven statements were provided. The respondents used a 5-point Likert scale to specify their agreement level with these statements.

Fig. 9 shows that academic workload plays a critical role in the amount of stress they feel.

3.6. Algorithms application

After preprocessing the data, 520 records were obtained in the final dataset. The stratified approach was used to divide the dataset into two parts, thus ensuring that the two datasets contain equal representations from all classes. Dataset 1 contained only personal questions while dataset 2 included all questions, i.e., personal questions and questions obtained from the depression assessment scales. Both datasets were split for training (90%) and testing (10%) purposes. Therefore, 468 and 52 instances were assigned to the training and test sets, respectively. Ten machine learning techniques (Logistic Regression, Gradient Boosting, K-Nearest Neighbor, Random Forest, Decision Tree, Support Vector Machine, Perceptron, Naive Bayes (Gaussian), Naive Bayes (Multinomial), ZeroR Classifier) and two deep learning techniques (Artificial Neural Network (ANN) and Convolutional Neural Network (CNN)) were applied.
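The 90/10 stratified split described in this subsection can be sketched in a few lines of Python. This is only an illustration and not the authors' code: the function name `stratified_split`, the random seed, and the per-class counts (chosen solely so that they sum to 520) are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Split indices so that every class keeps the same train/test proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical class counts; only their total (520) comes from the paper.
labels = ["normal"] * 150 + ["moderate"] * 190 + ["extreme"] * 180
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 468 52
```

Splitting each class separately is what keeps the class proportions equal in the training and test sets, which a plain random split does not guarantee for small classes.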


Fig. 6. Percentage distribution of depression level with financial hardship.

Logistic Regression uses the Linear Regression concept to solve classification problems. To do so, it applies the sigmoid function (also known as the logistic function) to the output of linear regression. Gradient Boosting employs an ensemble approach to combine many weak learners into a powerful predictive model. In contrast, K-Nearest Neighbor is an instance-based learning technique in which the prediction of a query record is based on the votes provided by the closest $k$ training records. Random Forest uses an ensemble of Decision Trees to make predictions. The Naïve Bayes algorithm uses a probabilistic approach to make the best prediction. Support Vector Machines are well known for their ability to solve classification and regression problems on both linear and non-linear data.

The Perceptron is a two-class (binary) classification algorithm. The output is predicted by calculating a weighted sum of the inputs, i.e., activation, and a bias (set to 1), expressed as:

$$f(x) = \begin{cases} 1, & \text{if } w \cdot x + b > 0 \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

In Eq. (1), $w$, $x$ and $b$ denote weights, inputs and bias, respectively. $f(x)$ symbolizes the activation of the Perceptron model.

Gaussian Naive Bayes classifiers utilize Bayes' Theorem and the Gaussian normal distribution to deduce the output. Notably, this model assumes that all pairs of features are independent of each other. Consequently, the likelihood of the independent features is measured as follows:

$$P(x_j \mid y) = \frac{1}{\sqrt{2\pi s_y^2}} \exp\left(-\frac{(x_j - m_y)^2}{2 s_y^2}\right) \tag{2}$$

where $m$ and $s$ denote the mean and standard deviation of $X$. It is considered that the variance is independent of $Y$ (i.e., $s_i$), independent of $X_i$ (i.e., $s_k$), or both (i.e., $s$). Multinomial Naive Bayes also follows essential steps, including transforming the samples of data into a set of frequencies, generating likelihood coefficients by obtaining the respective probabilities, and calculating the posterior probability using the Naive Bayesian equation as:

$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)} \tag{3}$$

where $A$ and $B$ denote discrete events, $P(A \mid B)$ is the probability of $A$ given $B$ is true and $P(B \mid A)$ signifies the probability of event $B$ given $A$ is true. $P(A)$ and $P(B)$ are the independent probabilities of $A$ and $B$, respectively. Finally, the output class is selected with the highest posterior probability. ZeroR is a useful classifier that always predicts the majority class and thus provides a baseline benchmark for the other classification methods.

Each layer of the Artificial Neural Network (ANN) is made up of multiple perceptron neurons. Since inputs are only processed forward, ANN is also known as a feed-forward neural network. In general, it has three distinct layers (input, hidden, and output), which are responsible for accepting inputs, processing them, and producing output, respectively. In this study, the Adam optimizer, mean squared error (as loss function), and the ReLU and sigmoid activation functions were used to train over 15 epochs with a batch size of 5. Convolutional neural networks (CNN) are used in a variety of applications and domains, particularly in image and video processing. In this study, the softmax activation function, Adam optimizer, categorical cross-entropy loss function, and two dense layers were used with a batch size of 5 to train over 15 epochs. Please see Fig. 10 for the architecture of the CNN model and Table 5 for its hyperparameters.

3.7. Hyperparameters optimization

In this work, five different hyperparameter optimization techniques (randomized search CV, grid search CV, hyperopt, TPOT classifier and Optuna) were used to find the best values for the hyperparameters of the proposed machine learning models. RandomizedSearchCV selects the optimal values by randomizing the search operation. In GridSearchCV, all values of hyperparameters are searched. The Bayes Theorem-based technique and a Python library, Hyperopt, are used in Bayesian Optimization. Natural selection, genetics concepts and a Python tool, the Tree-based Pipeline Optimization Tool (TPOT), are used in Genetic Algorithms. Optuna uses various optimization methods for finding the optimal hyperparameter values automatically.

3.8. Feature selection methods

In order to find the salient features from both datasets, nine feature selection techniques were utilized: 1. Recursive Feature Elimination (RFE), 2. Univariate Selection (SelectKBest), 3. Fisher Score/Chi-squared Test, 4. Feature Importance (ExtraTreesClassifier), 5. Pearson Correlation, 6. Mutual Information, 7. Mutual Information Regression, 8. Manual Approach (Uniqueness) and 9. Variance Threshold. Recursive Feature Elimination selects the important features by removing the weakest feature individually until a specific number of features is obtained. The SelectKBest class is used in the Univariate Selection method, where the best features are selected based on the $k$ highest scores. The Chi-squared Test approach provides highly dependent features, which have a higher chi-square coefficient. The chi-square statistic is obtained as follows:

$$\chi_c^2 = \sum \frac{(O_i - E_i)^2}{E_i} \tag{4}$$

where $O$ and $E$ denote the observed and expected count, respectively. The degrees of freedom are expressed by $c$.

Gini Importance is used to select the features in the ExtraTreesClassifier method, and it selects the most significant $k$ features. In the Pearson Correlation method, one of each pair of highly correlated features is dropped. The Pearson Correlation coefficient is given by:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{5}$$

where $r$ denotes the correlation coefficient. Values of the $x$ and $y$ variables in a sample are given by $x_i$ and $y_i$, respectively. $\bar{x}$ and $\bar{y}$ indicate the mean of the $x$ and $y$ variables, respectively.

Mutual Information measures the mutual dependency between two random features. Mutual Info Regression also estimates mutual dependency for a continuous target variable. In the Uniqueness feature selection process, features containing a very small number of unique values were selected manually. In the Variance Threshold process, all features whose variance did not meet the threshold were removed.
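As a worked illustration of the chi-squared statistic in Eq. (4) and the Pearson coefficient in Eq. (5), the following standard-library sketch evaluates both formulas directly. The function names and the toy numbers are assumptions made for this example; they are not taken from the study's data or code.

```python
import math


def chi_square(observed, expected):
    """Eq. (4): sum over cells of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def pearson_r(xs, ys):
    """Eq. (5): covariance term divided by the product of the two spread terms."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return numerator / (spread_x * spread_y)


# Toy counts: observed answers per category vs. counts expected under independence.
print(chi_square([10, 20, 30], [20, 20, 20]))  # 10.0

# Two perfectly linearly related toy features give r close to 1.
print(round(pearson_r([1, 2, 3], [2, 4, 6]), 6))  # 1.0
```

A feature with a larger chi-squared value depends more strongly on the target, while a pair of features with $|r|$ close to 1 is redundant, so one of the two can be dropped.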


Fig. 7. Distribution of (a) smoking habit and (b) physical exercises of the respondents.

Fig. 8. Pressure from families.

Fig. 9. Pressure from studies.

4. Result analysis

This section discusses the results of the proposed automatic depression assessment system.

Accuracy, precision, recall and F1 score are used to measure the performance of various machine learning and deep learning models. Eqs. (6) to (9) were used to determine the performance metrics.

$$Precision = \frac{TP}{TP + FP} \tag{6}$$

$$Recall = \frac{TP}{TP + FN} \tag{7}$$

$$F1\ score = 2 \times \frac{precision \times recall}{precision + recall} \tag{8}$$
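The metrics of Eqs. (6) to (9) can be computed from confusion-matrix counts as in the minimal sketch below. It covers the binary case only; the function name and the example counts are hypothetical, and the averaging the authors used across the three depression classes is not stated here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, F1 score and accuracy from confusion counts."""
    precision = tp / (tp + fp)                           # Eq. (6)
    recall = tp / (tp + fn)                              # Eq. (7)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (8)
    accuracy = (tp + tn) / (tp + fp + tn + fn)           # Eq. (9)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for a single class treated as "positive".
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})
```

With these counts, recall is 0.8 and accuracy is 0.85; the F1 score lies between precision and recall because it is their harmonic mean.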


Table 3
Performance metrics for different algorithms of dataset 1 and dataset 2.

| Algorithms | Datasets | Accuracy | Precision | Recall | F1 score |
| --- | --- | --- | --- | --- | --- |
| Logistic Regression | Dataset-1 | 50.00% | 48.79% | 50.00% | 48.97% |
| Logistic Regression | Dataset-2 | 88.46% | 88.50% | 88.46% | 88.42% |
| Gradient Boosted Algorithm | Dataset-1 | 50.00% | 52.83% | 50.00% | 51.12% |
| Gradient Boosted Algorithm | Dataset-2 | 82.69% | 84.63% | 82.69% | 83.04% |
| K-Nearest Neighbor Algorithm | Dataset-1 | 63.46% | 60.77% | 61.54% | 59.72% |
| K-Nearest Neighbor Algorithm | Dataset-2 | 78.85% | 79.62% | 79.62% | 75.60% |
| Random Forest | Dataset-1 | 63.46% | 62.67% | 63.46% | 62.83% |
| Random Forest | Dataset-2 | 90.38% | 89.17% | 88.46% | 88.67% |
| Decision Tree | Dataset-1 | 59.62% | 63.14% | 61.54% | 61.44% |
| Decision Tree | Dataset-2 | 76.92% | 77.54% | 75.00% | 75.45% |
| Support Vector Machine | Dataset-1 | 57.69% | 58.17% | 57.69% | 57.82% |
| Support Vector Machine | Dataset-2 | 86.54% | 91.03% | 86.54% | 86.53% |
| Perceptron | Dataset-1 | 46.15% | 65.21% | 32.69% | 25.27% |
| Perceptron | Dataset-2 | 76.92% | 83.02% | 75.00% | 76.18% |
| Naive Bayes (Gaussian) | Dataset-1 | 55.77% | 41.07% | 55.77% | 47.23% |
| Naive Bayes (Gaussian) | Dataset-2 | 73.08% | 73.46% | 73.08% | 70.47% |
| Naive Bayes (Multinomial) | Dataset-1 | 61.54% | 62.45% | 61.54% | 61.70% |
| Naive Bayes (Multinomial) | Dataset-2 | 86.54% | 87.98% | 86.54% | 86.94% |
| ZeroR | Dataset-1 | 26.92% | 07.24% | 26.92% | 11.42% |
| ZeroR | Dataset-2 | 26.92% | 07.24% | 26.92% | 11.42% |
| ANN | Dataset-1 | 48.08% | 50.50% | 38.46% | 30.69% |
| ANN | Dataset-2 | 69.23% | 54.08% | 65.38% | 56.68% |
| CNN | Dataset-1 | 55.77% | 58.08% | 56.63% | 57.72% |
| CNN | Dataset-2 | 92.31% | 88.83% | 87.86% | 88.63% |

Table 4
Accuracy for different algorithms on dataset 1 and dataset 2 after using various hyperparameter optimizers.

| Algorithms | Datasets | RandomizedSearchCV (%) | GridSearchCV (%) | Bayesian Optimization (%) | Genetic Algorithms (%) | Optuna (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Logistic Regression | Dataset-1 | 48.08 | 48.08 | 48.08 | 51.92 | 50.00 |
| Logistic Regression | Dataset-2 | 90.38 | 92.31 | 90.38 | 84.61 | 88.69 |
| Gradient Boosting | Dataset-1 | 63.36 | 51.92 | 55.77 | 54.64 | 53.85 |
| Gradient Boosting | Dataset-2 | 94.23 | 79.90 | 94.23 | 89.20 | 78.84 |
| K-Nearest Neighbor | Dataset-1 | 55.77 | 55.77 | 61.54 | 55.77 | 63.46 |
| K-Nearest Neighbor | Dataset-2 | 73.08 | 75.00 | 76.92 | 73.08 | 73.08 |
| Random Forest | Dataset-1 | 59.61 | 61.54 | 63.46 | 57.69 | 61.53 |
| Random Forest | Dataset-2 | 92.31 | 92.31 | 90.38 | 92.31 | 88.46 |
| Decision Tree | Dataset-1 | 55.77 | 53.85 | 53.85 | 57.69 | 61.54 |
| Decision Tree | Dataset-2 | 92.31 | 69.23 | 67.31 | 69.23 | 71.15 |
| Support Vector Machine | Dataset-1 | 57.69 | 64.75 | 58.90 | 59.94 | 57.90 |
| Support Vector Machine | Dataset-2 | 86.54 | 86.54 | 94.23 | 86.54 | 82.48 |
| Perceptron | Dataset-1 | 48.08 | 48.64 | 47.62 | 49.64 | 53.94 |
| Perceptron | Dataset-2 | 75.00 | 84.62 | 83.61 | 76.32 | 80.69 |
| Gaussian Naive Bayes | Dataset-1 | 59.62 | 59.62 | 57.69 | 55.77 | 61.54 |
| Gaussian Naive Bayes | Dataset-2 | 78.85 | 78.85 | 80.77 | 78.85 | 78.85 |
| Multinomial Naive Bayes | Dataset-1 | 51.92 | 53.85 | 55.77 | 57.69 | 53.85 |
| Multinomial Naive Bayes | Dataset-2 | 86.54 | 86.54 | 84.61 | 82.69 | 84.61 |
| ZeroR | Dataset-1 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 |
| ZeroR | Dataset-2 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 |

Table 5
Hyperparameters of the CNN model.

| Parameters | Values/Types |
| --- | --- |
| Epoch | 15 |
| Initial learning rate | 0.001 |
| Batch size | 5 |
| Optimizer | Adam |
| Loss function | Categorical cross-entropy |

$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN} \tag{9}$$

where $TP$, $TN$, $FP$ and $FN$ are true positives, true negatives, false positives and false negatives, respectively.

Table 3 shows the performance of ML and DL algorithms when applied to datasets 1 and 2. From Table 3 it is noticeable that the KNN and Random Forest algorithms provide the highest accuracy of 63.46% when applied to dataset 1. Convolutional Neural Network (CNN) and Random Forest achieved accuracies of 92.31% and 90.38%, respectively, on dataset 2. Precisions of 65.21% and 91.03% on datasets 1 and 2, respectively, were achieved by Perceptron and Support Vector Machine. Recalls of 63.46% and 88.46% were achieved on datasets 1 and 2, respectively, using Random Forest. Logistic Regression also has 88.46% recall on dataset 2. As expected, the baseline classifier, ZeroR, achieved the lowest performance on datasets 1 and 2. The proposed ML models were found to provide better results on dataset 2.

Fig. 10. Architecture of the CNN model.

Fig. 11. Accuracy vs. epochs graph of CNN for dataset 2.

Fig. 11 illustrates the accuracy with the change of epochs of the CNN model for dataset 2. As expected, the accuracy gradually improves with the increment of the epochs.

4.1. Performance of different algorithms with hyperparameter optimization

This section discusses the performance of various classifiers with hyperparameter optimization techniques applied to them. From Table 4, we can notice that after using RandomizedSearchCV, GridSearchCV and Bayesian Optimization on dataset 1, accuracies of 63.36%, 64.75% and 63.46% were obtained by the Gradient Boosting, SVM and Random Forest frameworks, respectively. In the case of dataset 2, Gradient Boosting Algorithm and Random Forest achieved 94.23% and 92.31% accuracy with the RandomizedSearchCV optimizer.

As demonstrated in Tables 3 and 4, the performance of most of the depression assessment models improves significantly with the tuning of the hyperparameters. In dataset 2, the accuracy of the Gradient Boosting and Perceptron models increased from 82.69% to 94.23% and from 76.92% to 84.62%, respectively, after optimizing hyperparameters. Similarly, in dataset 1, the accuracy of the Support Vector Machine increased from 57.69% to 64.75% after optimization. However, for the Random Forest Classifier, the accuracy remains the same, which is 63.46% for both cases. The SVM model with GridSearchCV and Gradient Boosting with RandomizedSearchCV attained the best accuracies for dataset 1 and dataset 2, respectively.

4.2. Performance analysis of different algorithms using various feature selection methods

The results obtained after taking the top 20 features and dropping the lowest 20 features from both datasets are shown in Tables 6 and 7, respectively. From Table 6, it can be seen that, on dataset 1, Gradient Boosting Algorithm and KNN provided 67.31% and 65.38% accuracies after using the Pearson Correlation and SelectKBest methods, respectively. On dataset 2, 92.31% and 96.15% accuracies were found from Logistic Regression and Random Forest using the Pearson Correlation and Mutual Information methods, respectively. Table 7 shows that on dataset 1, Gradient Boosting Algorithm provided 67.31% accuracy using the Pearson Correlation method. The Random Forest model with the Pearson's Correlation coefficient-based feature selection technique attained the highest accuracy, i.e., 98.08%, for dataset 2.

Fig. 12 illustrates the comparison of accuracies between various ML techniques obtained before and after feature selection on both datasets with the most important 20 and 50 features. In dataset 1, the accuracy of Logistic Regression without feature selection is 50.00%, which increased to 63.46% and 65.38% after selecting the top 20 and 50 features, respectively. For Gradient Boosting Algorithm, the accuracy without feature selection is 50.00%. However, the accuracy improved to 67.31% considering both the best 20 and 50 features. Similarly, in dataset 2, Gradient Boosting Algorithm attained 82.69% accuracy, which increased to 90.38% and 88.54% after taking the dominant 20 and 86 features, respectively. The accuracy of Random Forest was 90.38%, which increased to 96.15% and 98.08% after considering the most important 20 and 86 features, respectively.


Table 6
Accuracy for different algorithms on dataset 1 and dataset 2 after using various feature selection methods of significant 20 features.

| Algorithms | Datasets | Recursive Feature Elimination (%) | Select K Best (%) | Fisher Score Chi-square Test (%) | Extra Trees Classifier (%) | Pearson Correlation (%) | Mutual Information (%) | Mutual Info Regression (%) | Manual Uniqueness (%) | Variance Threshold (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Logistic Regression | Dataset-1 | 61.54 | 63.46 | 61.54 | 57.69 | 48.08 | 59.62 | 57.69 | 61.54 | 50.00 |
| Logistic Regression | Dataset-2 | 88.46 | 88.46 | 84.62 | 88.46 | 92.31 | 90.38 | 80.77 | 61.54 | 88.46 |
| Gradient Boosting | Dataset-1 | 46.15 | 53.85 | 55.77 | 57.69 | 67.31 | 57.69 | 59.62 | 51.92 | 50.00 |
| Gradient Boosting | Dataset-2 | 82.69 | 90.38 | 80.77 | 84.62 | 82.69 | 86.54 | 84.62 | 51.92 | 82.69 |
| K-Nearest Neighbor | Dataset-1 | 51.92 | 65.38 | 65.38 | 57.69 | 57.69 | 59.62 | 61.54 | 55.77 | 63.46 |
| K-Nearest Neighbor | Dataset-2 | 88.46 | 82.69 | 82.69 | 76.92 | 76.92 | 76.92 | 78.85 | 55.77 | 78.85 |
| Random Forest | Dataset-1 | 61.54 | 59.62 | 61.54 | 53.85 | 61.54 | 65.38 | 63.46 | 51.92 | 63.46 |
| Random Forest | Dataset-2 | 82.69 | 94.23 | 90.38 | 92.31 | 92.31 | 96.15 | 90.38 | 59.62 | 90.38 |
| Decision Tree | Dataset-1 | 44.23 | 53.85 | 55.77 | 55.77 | 55.77 | 63.46 | 55.77 | 61.54 | 59.62 |
| Decision Tree | Dataset-2 | 78.85 | 73.08 | 90.08 | 78.85 | 75.00 | 75.00 | 76.92 | 59.62 | 76.92 |
| Support Vector Machine | Dataset-1 | 63.46 | 61.54 | 61.54 | 61.54 | 61.54 | 55.77 | 61.54 | 53.85 | 57.69 |
| Support Vector Machine | Dataset-2 | 82.69 | 88.46 | 86.54 | 86.54 | 90.38 | 84.62 | 84.62 | 53.85 | 86.54 |
| Perceptron | Dataset-1 | 46.15 | 53.85 | 34.62 | 48.08 | 34.62 | 36.54 | 51.92 | 44.23 | 46.15 |
| Perceptron | Dataset-2 | 76.92 | 53.85 | 76.92 | 76.92 | 65.38 | 67.31 | 71.15 | 44.23 | 76.92 |
| Gaussian Naive Bayes | Dataset-1 | 53.85 | 61.54 | 59.62 | 59.62 | 55.77 | 53.85 | 55.77 | 61.54 | 55.77 |
| Gaussian Naive Bayes | Dataset-2 | 78.85 | 86.54 | 80.77 | 88.46 | 73.08 | 82.69 | 82.69 | 61.54 | 73.08 |
| Multinomial Naive Bayes | Dataset-1 | 67.31 | 57.69 | 61.54 | 55.77 | 61.54 | 61.54 | 51.92 | 59.62 | 61.54 |
| Multinomial Naive Bayes | Dataset-2 | 80.77 | 82.69 | 82.69 | 84.62 | 86.54 | 78.85 | 86.54 | 59.62 | 86.54 |
| ZeroR | Dataset-1 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 |
| ZeroR | Dataset-2 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 |

Table 7
Accuracy for different algorithms on dataset 1 (best 50 features) and dataset 2 (top 86 features) after using feature selection methods.

| Algorithms | Datasets | Recursive Feature Elimination (%) | Select K Best (%) | Fisher Score Chi-square Test (%) | Extra Trees Classifier (%) | Pearson Correlation (%) | Mutual Information (%) | Mutual Info Regression (%) | Manual Uniqueness (%) | Variance Threshold (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Logistic Regression | Dataset-1 | 55.77 | 48.08 | 50.00 | 44.23 | 48.08 | 53.85 | 57.69 | 65.38 | 50.00 |
| Logistic Regression | Dataset-2 | 92.31 | 90.38 | 90.38 | 88.46 | 92.31 | 88.46 | 88.46 | 88.46 | 88.46 |
| Gradient Boosting | Dataset-1 | 55.77 | 50.00 | 50.00 | 50.00 | 67.31 | 55.77 | 53.85 | 65.38 | 50.00 |
| Gradient Boosting | Dataset-2 | 82.69 | 78.85 | 78.85 | 78.85 | 82.69 | 82.69 | 84.62 | 88.54 | 82.69 |
| K-Nearest Neighbor | Dataset-1 | 63.46 | 50.00 | 50.00 | 48.08 | 57.69 | 55.77 | 51.92 | 57.69 | 63.46 |
| K-Nearest Neighbor | Dataset-2 | 80.77 | 75.00 | 76.92 | 78.85 | 76.92 | 75.00 | 78.85 | 80.77 | 78.85 |
| Random Forest | Dataset-1 | 61.54 | 65.38 | 57.69 | 65.38 | 59.62 | 57.69 | 55.77 | 63.46 | 63.46 |
| Random Forest | Dataset-2 | 92.31 | 86.54 | 88.46 | 88.46 | 98.08 | 86.54 | 88.46 | 86.54 | 90.38 |
| Decision Tree | Dataset-1 | 59.62 | 57.69 | 61.54 | 61.54 | 53.85 | 67.31 | 57.69 | 57.69 | 59.62 |
| Decision Tree | Dataset-2 | 67.31 | 69.23 | 73.08 | 73.08 | 73.08 | 73.08 | 75.00 | 67.31 | 76.92 |
| SVM | Dataset-1 | 63.46 | 57.69 | 59.62 | 59.62 | 61.54 | 63.46 | 61.54 | 59.62 | 57.69 |
| SVM | Dataset-2 | 88.46 | 84.62 | 84.62 | 88.46 | 90.38 | 88.46 | 88.46 | 88.46 | 86.54 |
| Perceptron | Dataset-1 | 48.08 | 55.77 | 46.15 | 53.85 | 34.62 | 32.69 | 28.85 | 42.31 | 46.15 |
| Perceptron | Dataset-2 | 90.38 | 76.92 | 90.38 | 67.31 | 65.38 | 80.77 | 63.46 | 73.08 | 76.92 |
| Gaussian Naive Bayes | Dataset-1 | 53.85 | 55.77 | 55.77 | 57.69 | 55.77 | 59.62 | 57.69 | 57.69 | 55.77 |
| Gaussian Naive Bayes | Dataset-2 | 78.85 | 76.92 | 73.08 | 76.92 | 73.08 | 75.00 | 75.00 | 86.54 | 73.08 |
| Multinomial Naive Bayes | Dataset-1 | 63.46 | 59.62 | 65.38 | 59.62 | 61.54 | 61.54 | 55.77 | 65.38 | 61.54 |
| Multinomial Naive Bayes | Dataset-2 | 88.46 | 84.62 | 84.62 | 86.54 | 86.54 | 90.38 | 86.54 | 86.54 | 86.54 |
| ZeroR | Dataset-1 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 |
| ZeroR | Dataset-2 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 | 26.92 |

The results of the feature selection techniques for dataset 1 and dataset 2 are summarized in Table 8.

Local Interpretable Model-Agnostic Explanations (LIME) is an efficient technique for comprehending black box machine learning models [32]. This technique constructs a simpler surrogate model by locally approximating the complex ML model. The surrogate model analyzes the confined area of the individual prediction and formulates a logical explanation in that local region. Figs. 13 and 14 demonstrate the depression prediction interpretation of an extreme case and a normal instance, respectively, provided by the LIME explainable AI framework. The Random Forest model with the Pearson correlation feature selection technique predicted extreme depression with a 0.88 confidence score for this specific sample, as illustrated in Fig. 13, because of its irregular sleep pattern, suicidal and somatic symptoms, disappointment with the future and self-blaming attitude.

Table 9 compares the proposed depression detection system with similar works.

4.3. Cross validation on an independently collected dataset

One of the challenges with this model is that it is difficult to apply the model to other existing datasets. This is because the model is trained on a dataset containing a rich feature set, while other existing datasets do not possess such an extensive feature set. Therefore,


Fig. 12. Accuracies before and after feature selection (a) dataset 1 (b) dataset 2.

Table 8
Summary of the feature selection techniques for dataset 1 and dataset 2.

| Machine learning algorithm | Dataset | Highest accuracy obtained | Highest accuracy (%) |
| --- | --- | --- | --- |
| ZeroR | 1, 2 | Same | 26.92 |
| Naive Bayes (Multinomial) | 1 | Considering top 20 features | 67.31 |
| Naive Bayes (Multinomial) | 2 | Considering top 86 features | 90.38 |
| Naive Bayes (Gaussian) | 1 | Considering top 20 features | 61.54 |
| Naive Bayes (Gaussian) | 2 | Considering top 86 features | 88.46 |
| Perceptron | 1 | Considering top 50 features | 55.77 |
| Perceptron | 2 | Considering top 86 features | 90.38 |
| Support Vector Machine | 1 | Same performance if top 20/top 50 features are considered | 63.46 |
| Support Vector Machine | 2 | Same performance if top 20/top 86 features are considered | 90.38 |
| Decision Tree | 1 | Considering top 50 features | 67.31 |
| Decision Tree | 2 | Considering top 20 features | 90.08 |
| Random Forest | 1 | Same performance if top 20/top 50 features are considered | 65.38 |
| Random Forest | 2 | Considering top 86 features | 98.08 |
| K-Nearest Neighbor | 1 | Considering top 20 features | 63.46 |
| K-Nearest Neighbor | 2 | Considering top 20 features | 88.46 |
| Gradient Boosting | 1 | Same performance if top 20/top 50 features are considered | 67.31 |
| Gradient Boosting | 2 | Considering top 20 features | 90.38 |
| Logistic Regression | 1 | Considering top 50 features | 65.38 |
| Logistic Regression | 2 | Same performance if top 20/top 86 features are considered | 92.31 |


Fig. 13. Depression prediction interpretation of an extreme case of the LIME explainable AI.

Fig. 14. Depression prediction interpretation of a normal instance of the LIME explainable AI.

in order to validate, we collected a new, independent dataset ($n = 74$) using the same features but from a different time period. The initial dataset was collected from 23 October 2021 to 28 December 2021, while this dataset was collected from 8 April 2023 to 18 April 2023. Four classifiers (the best performing ones) were applied to this dataset for cross validation, and the results display similar performance on the cross-validation dataset as on the test dataset. Thus, the models have been found to be reliable on a dataset collected from a different time period. Please see Table 10 for detailed results.

5. Discussion

This study aims to identify the major reasons for depression among Bangladeshi university students. Our research found several factors exhibited by depressed university students. Most students are unaware of this mental disorder, and an alarming 42.4% of students consider committing suicide due to anxiety and stress. Experiencing emotional or sexual violence and financial hardship were found to be important factors that contribute to depression. Besides, regular smoking, physical inactivity


Table 9
Comparison of the proposed depression prediction system with similar works.

| Reference | Dataset | Number of participants | Used scale | Number of questions | Best model | Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| [5] | Custom dataset of Bangladeshi students | 210 | Modified DASS-21 | 21 | Binary logistic model | N/A |
| [11] | Undergraduate students from the University of Nice Sophia-Antipolis | 4184 | Electronic health records | 59 | XGBoost | AUC = 0.79 |
| [12] | United States National Health and Nutrition Examination Survey (NHANES) | 8628 | PHQ-9 | 9 | SVM | 77.1% |
| [13] | Chinese Longitudinal Healthy Longevity Study (CLHLS) | 1538 | Information on health status and quality of life | 8 | Gradient Boosting | 75.9% |
| [14] | Korean adults | 623 | National Institutes of Health Stroke Scale (NIHSS) | 13 | SVM | 77.1% |
| [15] | Australian children | 6310 | Australian Child and Adolescent Survey | 667 | XGBoost | 95% |
| [16] | Bangladeshi participants | 604 | Burns Depression Checklist (BDC) | 55 | AdaBoost | 92.56% |
| [18] | Bangladeshi women | 623 | Lifestyle related questions | 30 | CNN | 96.8% |
| This work | Custom dataset of Bangladeshi students | 684 | Combined scale | 106 | Random Forest | 98.08% |

Table 10
Model performance on test set and cross-validation dataset.

| Classifier | Details on features/hyperparameters used | Test accuracy (%) | Accuracy on independent dataset (cross-val) (%) | Training time (ms) | Model size (KB) |
| --- | --- | --- | --- | --- | --- |
| Random Forest | Pearson Correlation (drop 20 features) | 98.08 | 95.95 | 103.77 | 88.6 |
| Gradient Boosting | RandomizedSearchCV (Hyperparameter Optimization) | 94.23 | 93.24 | 210 755.27 | 88.6 |
| Logistic Regression | GridSearchCV (Hyperparameter Optimization) | 92.31 | 87.84 | 700.71 | 88.6 |
| Logistic Regression | Recursive Feature Elimination (drop 20 features) | 92.31 | 87.84 | 54.85 | 2.79 |

and frequent social media usage have been found to be common traits among depressed students.

Another objective of this research is to test whether depression can be accurately predicted with the newly created depression assessment scale and various machine learning and deep learning models. The new depression assessment questionnaire has been created by employing the voting technique on eight well-known depression measuring scales. Various feature selection approaches and hyperparameter optimization techniques have been applied to enhance the performance of the prediction models. The accuracy is further increased by keeping the dominant features, i.e., removing irrelevant ones.

The model proposed here has been highly accurate and reliable compared to the notable works found in the literature, as demonstrated in Table 9. Empirical results reveal that machine learning and deep learning algorithms have been able to predict depression with high performance. The training duration and model size of the classifiers listed in Table 10 have been determined for the PC configuration shown below:

Processor: Intel(R) Core(TM) i7-6500U CPU @ 2.50 GHz 2.59 GHz, RAM: 8 GB, System Type: 64-bit operating system, x64-based processor.

Random Forest has been found to produce the best result on the test set. Random Forest uses an ensemble approach that combines the strengths of various decision trees, and hence it is likely to have provided a better result.

6. Limitation of the work

This article presents a novel approach to measuring and predicting depression using machine learning and deep learning models, applied to Bangladeshi university students. However, the work has experienced a number of challenges:

• The novel approach involved the use of a voting algorithm that was applied on eight well-known depression measuring scales. Consequently, this needed a much more extensive feature set compared to the previous work that exists in the literature.
• Due to the extensive feature set, the questionnaire had a total of 106 questions, which were time consuming for participants to fill in.
• Other datasets available in the literature had a much smaller set of features. As a consequence, it is not possible to apply our models to other existing datasets.
• This model was applied to university students. Hence, it may not be reliable in inferring predictions on individuals with different profiles (such as children, the elderly, etc.).

7. Conclusion and future scope

This research paper presents various machine learning and deep learning methods for assessing the level of depression among Bangladeshi university students. In addition, the study explores various personal and social factors that negatively impact young people's mental health. Moreover, this work has significant importance in studying


various factors of suicidal activities among young people that have been increasing recently, mostly due to depression. Our research created a new depression measuring scale with three distinct levels, i.e., normal, moderate and extreme, using the voting technique on the results of the eight recognized scales. A private dataset containing 684 participants has been collected following the consent of the participants. Next, nine feature selection methods were used to find relevant and most important features contributing to depression. In addition, 12 machine learning, ensemble and deep learning algorithms were applied to predict depression automatically. Random Forest, Gradient Boosting Algorithm and CNN were found to be the better models for evaluating depression. Finally, hyperparameter optimization and feature selection approaches were applied to enhance the efficiency of the prediction results. In the future, a comprehensive sample with more diverse cohorts can be combined with the existing dataset. Meta-heuristic optimization techniques can be applied to find the best features from patients' extensive biometric markers and characteristics. Possible extensions of this work are to utilize the prowess of more advanced artificial intelligence frameworks, e.g., adversarial and sequential learning, domain adaptation, etc.

CRediT authorship contribution statement

Rokeya Siddiqua: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization. Nusrat Islam: Conceptualization, Methodology, Formal analysis, Investigation, Writing – original draft. Jarba Farnaz Bolaka: Conceptualization, Methodology, Investigation. Riasat Khan: Conceptualization, Methodology, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Supervision. Sifat Momen: Conceptualization, Methodology, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] Dunn G, Sham P, Hand D. Statistics and the nature of depression. Psychol Med 1993;23:871–89.
[2] Organization WH. The World Health Report: Mental disorders affect one in four people. 2001.
[3] Organization WH. Depression. 2023, https://www.who.int/news-room/fact-sheets/detail/depression, [Online, accessed: 22 April 2023].
[8] Hoque R. Major mental health problems of undergraduate students in a private university of Dhaka, Bangladesh. Eur Psychiatry 2015;30:1880.
[9] Sultana J, Elhum Uddin Quadery S, Amik FR, Basak T, Momen S. A data-driven approach to understanding the impact of Covid-19 on dietary habits amongst Bangladeshi students. J Posit Sch Psychol 2022;6:11691–7.
[10] Conceição V, Rothes I, Gusmão R. The association between stigmatizing attitudes towards depression and help seeking attitudes in college students. PLOS ONE 2022;17:1–14.
[11] Nemesure M, Heinz M, Huang R, Jacobson N. Predictive modeling of psychiatric illness using electronic health records and a novel machine learning approach with artificial intelligence. Sci Rep 2020;11:1–9.
[12] Lee C, Kim H. Machine learning-based predictive modeling of depression in hypertensive populations. PLOS ONE 2022;17:1–17.
[13] Su D, Zhang X, He K, Chen Y. Use of machine learning approach to predict depression in the elderly in China: A longitudinal study. J Affect Disord 2021;282:289–98.
[14] Ryu YH, et al. Prediction of poststroke depression based on the outcomes of machine learning algorithms. J Clin Med 2022;11.
[15] Haque UM, Kabir E, Khanam R. Detection of child depression using machine learning methods. PLOS ONE 2021;16:1–13.
[16] Zulfiker MS, Kabir N, Biswas AA, Nazneen T, Uddin MS. An in-depth analysis of machine learning approaches to predict depression. Curr Res Behav Sci 2021;2:1–12.
[17] Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artificial Intelligence Res 2002;16:321–57.
[18] Ahmed A, Sultana R, Ullas MTR, Begom M, Rahi MMI, Alam MA. A machine learning approach to detect depression and anxiety using supervised learning. In: Asia-Pacific conference on computer science and data engineering. 2020, p. 1–6.
[19] Moon NN, Mariam A, Sharmin S, Islam MM, Nur FN, Debnath N. Machine learning approach to predict the depression in job sectors in Bangladesh. Curr Res Behav Sci 2021;2:1–10.
[20] Cassidy S, Bradley L, Bowen E, Wigham S, Rodgers J. Measurement properties of tools used to assess depression in adults with and without autism spectrum conditions: A systematic review. Autism Res 2018;11:738–54.
[21] Ustun G. Determining depression and related factors in a society affected by COVID-19 pandemic. Int J Soc Psychiatry 2021;67:54–63.
[22] Saha A, Dutta A, Sifat RI. The mental impact of digital divide due to COVID-19 pandemic induced emergency online learning at undergraduate level: Evidence from undergraduate students from Dhaka City. J Affect Disord 2021;294:170–9.
[23] Bathla M, Singh M, Kulhara P, Chandna S, Aneja J. Evaluation of anxiety, depression and suicidal intent in undergraduate dental students: A cross-sectional study. Contemp Clin Dent 2015;6:215–22.
[24] Ibrahim AK, Kelly SJ, Glazebrook C. Reliability of a shortened version of the Zagazig Depression Scale and prevalence of depression in an Egyptian university student sample. Compr Psychiatry 2012;53:638–47.
[25] Guo T, Guo Z, Zhang W, Ma W, Yang X, Yang X, Hwang J, He X, Chen X, Ya T. Electroacupuncture and cognitive behavioural therapy for sub-syndromal depression among undergraduates: A controlled clinical trial. Acupunct Med 2016;34:356–63.
[26] Ozawa C, et al. Resilience and spirituality in patients with depression and their family members: A cross-sectional study. Compr Psychiatry 2017;77:53–9.
[27] McIntyre RS, et al. The prevalence and illness characteristics of DSM-5-defined "mixed feature specifier" in adults with major depressive disorder and bipolar disorder: results from the International Mood Disorders Collaborative Project. J Affect Disord 2015;172:259–64.
[28] Burchert S, Kerber A, Zimmermann J, Knaevelsrud C. Screening accuracy of a 14-day smartphone ambulatory assessment of depression symptoms and mood dynamics in a general population sample: Comparison with the PHQ-9 depression screening. PLoS One 2021;16:1–25.
[29] Wang K, et al. Mapping of the acromegaly quality of life questionnaire to ED-5D-5L index score among patients with acromegaly. Eur J Health Econ
[4] Hossain MD, Ahmed HU, Chowdhury WA, Niessen LW, Alam DS. Mental 2021;1–11.
disorders in Bangladesh: A systematic review. BMC Psychiatry 2014;14:1–8. [30] Mergen H, et al. Comparative validity and reliability study of the QIDS-SR16 in
[5] Arusha A, Biswas R. Prevalence of stress, anxiety and depression due to Turkish and American college student samples. Klin Psikofarmakol Bülteni-Bull
examination in Bangladeshi youths: A pilot study. Child Youth Serv Rev Clin Psychopharmacol 2011;21:289–301.
2020;116:1–6. [31] Lu S, Hu S, Guan Y, Xiao J, Cai D, Gao Z, Sang Z, Wei J, Zhang X, Margraf J.
[6] Chang J, Yuan Y, Wang D. Mental health status and its influencing factors Measurement invariance of the Depression Anxiety Stress Scales-21 across gender
among college students during the epidemic of COVID-19. J South Med Univ in a sample of Chinese university students. Front Psychol 2018;9:2064.
2020;40:171–6. [32] Ribeiro MT, Singh S, Guestrin C. ‘‘Why should I trust you?’’: Explaining the
[7] Choudhury AA, Khan MRH, Nahim NZ, Tulon SR, Islam S, Chakrabarty A. predictions of any classifier. In: International conference on knowledge discovery
Predicting depression in Bangladeshi undergraduates using machine learning. In: and data mining. 2016.
IEEE region 10 symposium. 2019, p. 789–94.

