Rank biserial Stochastic Feature Embed Bivariate Kernelized Regressive Bootstrap
Aggregative Classifier for School Student Dropout Prediction
C. Rajagopal¹, Dr. T. Balasubramanian²
¹Assistant Professor (Computer Science), Sri Vidya Mandir Arts and Science College, Katteri, Uthangarai - 636 902.
²Principal, Sri Vidya Mandir Arts and Science College, Katteri, Uthangarai - 636 902.
¹[email protected], ²[email protected]

Abstract
Early and accurate prediction of student dropout enables schools to identify at-risk students from available educational data. The dropout rate is a major concern of education administrators, who must estimate each student's dropout potential. A novel technique called Rank biserial Otsuka-Ochiai Stochastic Embedded Feature Selection based Bivariate Kernelized Regressive Bootstrap Aggregative Classifier (RBOOSEFS-BKBAC) is designed to improve student dropout prediction with higher accuracy and minimal time consumption. First, the proposed RBOOSEFS-BKBAC technique performs preprocessing, comprising data normalization, data cleaning, and duplicate data removal. After that, Rank biserial correlation is applied for feature extraction from the raw dataset. Then, Otsuka-Ochiai stochastic neighbor embedded feature selection is carried out to select the features significant for classification. Lastly, classification is achieved using the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique. The Bucklin voting scheme is applied to the Bootstrap Aggregative technique to increase prediction accuracy and minimize error. Simulation is performed with the Student-Drop-India2016 dataset using metrics such as prediction accuracy, precision, recall, F-measure, and prediction time. RBOOSEFS-BKBAC provides improved performance, achieving higher prediction accuracy in less time than conventional methods.
Keywords: School Student Dropout Prediction, preprocessing, Rank biserial correlation-based feature extraction, Otsuka-Ochiai stochastic neighbor embedded feature selection, Bivariate Kernelized Regressive Bootstrap Aggregative classification
1. Introduction
Predicting students at risk of dropout offers various benefits to educators, students, and educational institutions. It also alerts teachers to intervene when students are at risk and to help in time to improve students' performance. Recently, predictive models have been developed that use educational data as input features to predict students dropping out. A CLSA scheme integrating convolutional neural networks, LSTM, and attention was introduced in [1] based on limited data features. The designed model was not efficient enough to further enhance accuracy through deep feature analysis. A Feature-Weighted and Time-Series CNN (FWTS-CNN) was developed in [2] to predict student dropout. The designed method increases accuracy, but it failed to collect student learning behavior features to minimize time complexity.
Self-Regulated Learning (SRL) strategies were developed in [3] to measure their effect on student dropout prediction, but higher prediction accuracy was not achieved. CONV-LSTM was developed in [4] for predicting student dropout; however, significant feature selection was not carried out to reduce complexity.
A new hybrid algorithm called Decision Tree and Extreme Learning Machine (DT-ELM) was introduced in [5] for selecting features with better classification capability, but it consumes more time for data classification. Different machine learning techniques were developed in [6] for predicting dropouts based on socioeconomic data; however, the designed techniques were not applied to larger databases.
Logistic Regression and the Input-Output Hidden Markov Model (IOHMM) were developed in [7] for early dropout prediction; however, they failed to construct more interpretable models of students' behavior prediction. A collective attention network was developed in [8] for learning student behaviors to predict dropout; however, accurate prediction was not achieved.
Dropout risk validation was performed in [9] for the prediction of school students' behavior, but the designed validation was not applied to larger samples. A simple and interpretable procedure was introduced in [10] to predict dropout-prone and fail-prone students; however, it did not include an analysis of the efficiency of academic results and motivation.
An innovative temporal optimization approach was designed in [11] for reliable dropout prediction, but a time complexity analysis was not performed. Several machine learning classifiers were developed in [12] to predict possible dropout cases; however, they failed to predict a student's risk of dropping out based on similar students' previous achievements. In [13], a machine learning-based system was developed to enhance dropout prediction, but it failed to improve the prediction model by using significant features.
An artificial neural network (ANN) algorithm was developed in [14] for predicting school students' dropout, but the designed algorithm did not use a more extensive and varied sample of student data to achieve higher prediction accuracy. An uplift modeling framework was designed in [15] to enhance the effectiveness of dropout prediction for higher education institutions, but no machine learning model was applied to improve dropout prediction performance. Bayesian neural networks were introduced in [16] for dropout prediction, but the network is not efficient in minimizing prediction complexity.
An ensemble stacking classification method was developed in [17] to improve the accuracy of dropout predictions, but it failed to perform accurate prediction when the number of correlated feature variations and the dataset size increased. An end-to-end method using a CNN was introduced in [18] to predict student dropout based on feature extraction and classification, but it failed to enhance dropout prediction with minimum time. A supervised machine learning algorithm with hyperparameter optimization was introduced in [19] for student dropout prediction, but its accuracy was not enhanced. A deep learning algorithm was developed in [20] to build a dropout prediction model, but it failed to produce more accurate dropout prediction models.
1.1 Contribution
To enhance student dropout prediction accuracy, the RBOOSEFS-BKBAC technique is introduced based on four different processes, namely preprocessing, feature extraction, feature selection, and classification.
The RBOOSEFS-BKBAC technique performs preprocessing, Rank biserial correlation-based feature extraction, and Otsuka-Ochiai stochastic neighbor embedded feature selection. With the selected features, classification is performed to reduce student dropout prediction time.
Bivariate Kernelized Regressive Bootstrap Aggregative classification is applied in the RBOOSEFS-BKBAC technique to analyze the selected features and classify the data. The Bucklin voting scheme is applied to the Bootstrap Aggregative classification to minimize incorrect classification and improve accuracy.
Finally, extensive experimentation is performed to evaluate RBOOSEFS-BKBAC against existing methods. The experimental results show that RBOOSEFS-BKBAC is more effective than the existing methods.
The article is organized as follows: Section 2 presents the proposed RBOOSEFS-BKBAC design and all its steps, including the algorithms. Section 3 presents the experimental setup, and the analysis is presented in Section 4. The last section concludes the paper.
2. Methodology
Student dropout prediction helps education sectors identify at-risk students in a timely way. Accurate dropout prediction is still a serious issue due to its impact on decreasing higher education values and on the social environment. Therefore, a novel technique called RBOOSEFS-BKBAC is introduced to improve student dropout prediction through different processes. The RBOOSEFS-BKBAC technique not only improves prediction performance but also reduces time consumption.
Fig 1 shows RBOOSEFS-BKBAC performing accurate dropout prediction through four different processes, namely preprocessing, feature extraction, feature selection, and classification. Initially, a dataset comprising a feature set and a number of input data records is collected. The RBOOSEFS-BKBAC technique first performs data preprocessing. After that, the feature extraction process is carried out using a correlation function. Thirdly, feature selection is performed to improve classification performance and minimize time consumption. Finally, classification is performed to predict dropout. RBOOSEFS-BKBAC is explained in the subsequent subsections.
2.1 Preprocessing
Data preprocessing in the proposed RBOOSEFS-BKBAC technique refers to the initial steps applied to make the data more suitable for classification. The preprocessing comprises different processes, namely data cleaning, normalization, and transformation. Raw data collected from a real-world environment is highly vulnerable to missing values and contains many errors, which reduce accuracy. Therefore, preprocessing is an essential step for minimizing error and increasing accuracy.
Consider the input data and its set of features. First, missing values are handled by substituting the mean value of the respective feature in the database. Then data transformation is performed based on normalization, which is done through feature scaling. The data normalization is performed as given below,

$d_n = \frac{d - d_{\min}}{d_{\max} - d_{\min}}$ (1)

where $d_n$ indicates the normalized value, $d$ indicates the original feature value, $d_{\max}$ denotes the maximum feature value, and $d_{\min}$ denotes the minimum feature value. Finally, duplicate data are removed from the dataset.
Algorithm 1 illustrates data preprocessing. First, the number of features and the raw data are collected from the dataset. Then missing values are imputed and the feature values are normalized. Finally, duplicate data are removed from the dataset to obtain the preprocessed data.
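The preprocessing steps described above (mean imputation, min-max normalization per Eq. (1), and duplicate removal) can be sketched in Python, the language used for the simulations. This is a minimal illustrative sketch, not the authors' implementation; the function name and pandas-based design are assumptions:

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Impute missing values, min-max normalize, and drop duplicates."""
    df = df.copy()
    numeric = df.select_dtypes("number").columns
    # Replace each missing value with the mean of its feature
    df[numeric] = df[numeric].fillna(df[numeric].mean())
    # Min-max feature scaling: (d - d_min) / (d_max - d_min), Eq. (1)
    spans = (df[numeric].max() - df[numeric].min()).replace(0, 1)
    df[numeric] = (df[numeric] - df[numeric].min()) / spans
    # Remove duplicate records
    return df.drop_duplicates().reset_index(drop=True)
```

Constant columns are guarded with a span of 1 to avoid division by zero; any categorical cleaning the paper performs would happen before this step.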
2.2 Rank biserial correlative feature extraction
After data preprocessing, the proposed RBOOSEFS-BKBAC technique performs significant feature extraction to enhance prediction accuracy and minimize prediction time. Relevant and irrelevant features are identified for the next stage through a correlation measure. Feature extraction is the procedure of identifying features to reduce the dimensionality of the dataset. The proposed RBOOSEFS-BKBAC employs the Rank biserial correlation function to find correlated features.
The rank-biserial correlation coefficient is a statistical measure employed to identify correlated features from the dataset. In order to detect important features, the correlation between features is calculated.
Consider 'm' features in the big dataset,

$F = \{f_1, f_2, \ldots, f_m\}, \quad F \subseteq DS$ (2)

where $f_i$ indicates the feature at position $i$ and $DS$ symbolizes the student dataset. By applying the rank-biserial correlation coefficient, the relationship between the features is estimated as given below,

$R_{bs} = \frac{2\,(\bar{R}_{f_1} - \bar{R}_{f_2})}{n}$ (3)

where $R_{bs}$ indicates the rank-biserial coefficient, $\bar{R}_{f_1}$ and $\bar{R}_{f_2}$ represent the mean ranks of the two features in the dataset, and $n$ indicates the number of ranked observations. The correlation result is thresholded to the binary values '1' and '0'. The binary value '1' represents features that are correlated and kept for further processing; the binary value '0' represents features that are not correlated. The correlated features are given as input to the next process in order to decrease complexity. The algorithmic process of Rank biserial correlative feature extraction is described as follows,
Algorithm 2 explains Rank biserial correlative feature extraction. The rank-biserial correlation coefficient is applied to find the correlated features. The highly correlated features are extracted and the other features are removed. As a result, the complexity of prediction is minimized.
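A minimal sketch of rank-biserial scoring follows, computed here in the Glass form between each numeric feature and the binary dropout label and then thresholded to a keep/drop decision, as the extraction step above describes. The threshold value and function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.stats import rankdata

def rank_biserial(feature: np.ndarray, label: np.ndarray) -> float:
    """Glass rank-biserial correlation between a numeric feature
    and a binary (0/1) label: 2 * (mean rank diff) / n, Eq. (3)."""
    ranks = rankdata(feature)          # average ranks, ties handled
    n = len(feature)
    m1 = ranks[label == 1].mean()
    m0 = ranks[label == 0].mean()
    return 2.0 * (m1 - m0) / n

def extract_features(X: np.ndarray, y: np.ndarray, threshold: float = 0.1):
    """Keep column indices whose |rank-biserial| exceeds the threshold."""
    return [j for j in range(X.shape[1])
            if abs(rank_biserial(X[:, j], y)) > threshold]
```

A feature that perfectly separates the two label groups scores 1.0; a constant feature scores 0 and is dropped.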
2.3 Otsuka-Ochiai stochastic neighbor embedded feature selection
After feature extraction, the feature selection procedure is performed in the proposed RBOOSEFS-BKBAC technique to choose the most significant features and reduce dropout prediction time. The Otsuka-Ochiai stochastic neighbor embedding technique is used for feature selection. For the given set of extracted features, the Otsuka-Ochiai similarity is used to select significant features and remove the other features. The similarity between the features is estimated as given below,
(4)
Where ‘ ’ indicates an Otsuka-Ochiai similarity coefficient, indicates
a mutual dependence between the features , represents a squared score of ,
indicates a squared score of . Otsuka-Ochiai coefficient offers similarity value from 0 to 1. If
similarity is higher, then the feature is said to be a significant feature for prediction. Otherwise,
the features are not significant features from the dataset to minimize the complexity. The
algorithm process of Otsuka-Ochiai stochastic neighbor embedded feature selection is described
as given below,
Algorithm 3 illustrates the feature selection carried out through different processes. At first, the number of features is collected. After that, the similarity between each pair of features is measured. If the similarity is higher than the threshold, the feature is important; otherwise, the feature is unimportant. The important features are selected for classification. This helps to minimize time consumption.
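The pairwise similarity test above can be sketched as follows. The Ochiai coefficient here is the cosine-style form of Eq. (4); the selection rule (keep a feature whose similarity to at least one other feature exceeds the threshold) follows the text, while the threshold value and names are assumptions:

```python
import numpy as np

def ochiai(a: np.ndarray, b: np.ndarray) -> float:
    """Otsuka-Ochiai coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2))."""
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def select_features(X: np.ndarray, threshold: float = 0.5):
    """Keep a feature when its Ochiai similarity to at least one
    other feature exceeds the threshold."""
    m = X.shape[1]
    keep = []
    for i in range(m):
        sims = [ochiai(X[:, i], X[:, j]) for j in range(m) if j != i]
        if max(sims, default=0.0) > threshold:
            keep.append(i)
    return keep
```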
2.4 Bivariate Kernelized Regressive Bootstrap Aggregative classification
Lastly, classification is achieved in the proposed RBOOSEFS-BKBAC technique using Bivariate Kernelized Regressive Bootstrap Aggregative classification. Bootstrap aggregating is designed to enhance the stability of statistical classification and regression. The ensemble algorithm performs accurate classification by constructing weak learners; a single weak learner rarely offers perfect results, but the ensemble technique minimizes the error.
Figure 2 illustrates the overall structural design of the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique for obtaining an accurate prediction. The ensemble technique takes the training data as input and uses a number of weak learners to classify the given input data. The proposed technique uses a bivariate regression tree to analyze the data and classify it into different classes. The bivariate regression tree consists of various nodes: the root node performs the feature analysis, every branch node indicates an outcome of the analysis, and every leaf node denotes a class label. Here, 'bivariate' indicates that the regression function provides a two-class label.
Figure 3: Bivariate kernelized regression tree

Figure 3 illustrates the bivariate regression tree, which consists of one root node linked to the child nodes through branch nodes. In the root node, the analysis is performed between the training data and the mean of each class. The radial basis function kernel is applied to measure the similarity between the data and the mean, as given below,

$K(d_i, \mu_c) = \exp\!\left(-\frac{\|d_i - \mu_c\|^2}{2\sigma^2}\right)$ (5)

where $K$ denotes the kernel function, $d_i$ indicates the training data, and $\mu_c$ denotes the mean of the selected class. The kernel function offers a similarity value in [0, 1]. Depending on the similarity value, the data is assigned to the selected class and the result is obtained at a leaf node. The weak classification results still contain some training errors; in order to increase the classification accuracy, the weak learner results are combined.
$Y = \sum_{i=1}^{k} R_i$ (6)

From (6), $Y$ indicates the combined classification result and $R_i$ denotes the result of the $i$-th weak classifier. After combining the weak learners, the error rate is estimated between the actual and predicted classification. The error rate is formulated as given below,

$E = (C_a - C_o)^2$ (7)

where $E$ denotes the error rate, $C_a$ indicates the actual classification result, and $C_o$ denotes the observed classification result. Using Bucklin selection, the weak learners are ranked based on their error values; the results with minimum error are ranked before the others. The results with higher error are removed from further processing, which minimizes complexity and time consumption. Bucklin voting is then applied to find the majority vote of the classified results.
$F = \arg\max_{c} \, MV(c)$ (8)

where $F$ indicates the final ensemble classification outcome, $\arg\max$ denotes the argument of maximum utility, and $MV(c)$ denotes the majority vote for class $c$ among the weak classifiers. In other words, the result receiving the majority of votes is taken as the final classification result. Thus, every data record is correctly assigned to its class with high accuracy and lower error. The algorithmic process of the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique is described as given below,
Algorithm 4 illustrates the Bivariate Kernelized Regressive Bootstrap Aggregative classifier used to obtain higher classification accuracy. The bagging ensemble classification technique uses a number of weak learners to classify the data with the bivariate kernelized regression tree. The root node of the regression tree compares the data with the class mean values, and data with higher similarity is categorized into the corresponding class. The weak learners' results are combined and the voting scheme is applied; for each weak learner, the error rate is measured and learners with lesser error are preferred. Finally, the class with the maximum votes among the weak learners is returned as the final classification result.
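As a runnable illustration of the bagging-plus-majority-vote procedure described above, the sketch below substitutes scikit-learn's decision tree for the paper's bivariate kernelized regression tree; the class name, depth, and number of learners are assumptions, not the authors' configuration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class BaggedTreeClassifier:
    """Bootstrap-aggregated shallow trees with majority voting."""

    def __init__(self, n_learners: int = 10, seed: int = 0):
        self.n_learners = n_learners
        self.rng = np.random.default_rng(seed)
        self.learners = []

    def fit(self, X, y):
        n = len(X)
        for _ in range(self.n_learners):
            idx = self.rng.integers(0, n, n)          # bootstrap sample
            tree = DecisionTreeClassifier(max_depth=3)
            self.learners.append(tree.fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Each weak learner votes; the majority class wins, as in Eq. (8)
        votes = np.stack([t.predict(X) for t in self.learners])
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)
```

The paper additionally rank-filters the weak learners by error before voting (Bucklin selection); that step is omitted here for brevity.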
3. Experimental Scenario
In this section, the simulation of RBOOSEFS-BKBAC as well as CLSA [1] and FWTS-CNN [2] is implemented using Python. The Student-Drop-India2016 dataset (https://www.kaggle.com/imrandude/studentdropindia2016) is used for conducting the experiment. This dataset is used for student dropout prediction across the processes of preprocessing, feature extraction, feature selection, and classification. The dataset includes 15 attributes, such as continue_drop, student_id, and so on, with 19,100 instances. Among the 15 attributes, the significant features and data are selected for prediction.
4. Performance Results Analysis
In this section, RBOOSEFS-BKBAC, CLSA [1], and FWTS-CNN [2] are evaluated in terms of prediction accuracy, precision, recall, F-measure, and prediction time with respect to various numbers of data records.
4.1 Comparative analysis of prediction accuracy
Prediction accuracy is estimated as the proportion of student data records that are accurately classified out of the total number of student data records.

$PA = \frac{TP + TN}{TP + TN + FP + FN} \times 100$ (9)

where $TP$ denotes true positives, $TN$ indicates true negatives, $FP$ denotes false positives, and $FN$ denotes false negatives. Prediction accuracy is calculated as a percentage (%).
The experimental results of prediction accuracy using the RBOOSEFS-BKBAC technique and the existing methods, namely CLSA [1] and FWTS-CNN [2], are reported in Table I. The observed performance results of the proposed RBOOSEFS-BKBAC technique are compared with the two conventional methods for different numbers of student data records. In the first iteration, 1000 student data records are considered for conducting the experiment. By applying the RBOOSEFS-BKBAC technique, the prediction accuracy is 94%, whereas the prediction accuracy of the existing CLSA [1] and FWTS-CNN [2] is 90% and 89% respectively. The ten comparative outcomes show that student dropout prediction accuracy is increased by 4% using the RBOOSEFS-BKBAC technique compared to CLSA [1] and by 6% compared to FWTS-CNN [2].
Figure 4 demonstrates the prediction accuracy of the proposed and existing methods. As shown in Figure 4, the prediction accuracy, i.e. the proportion of accurately classified instances, is plotted on the y-axis, and the number of student data records given as input is plotted on the x-axis. The graphical plot illustrates that the RBOOSEFS-BKBAC technique provides superior performance over the conventional methods. This is due to the application of the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique, with the Bucklin voting scheme applied to improve classification accuracy. This shows that the RBOOSEFS-BKBAC technique effectively and accurately detects the dropout instances in the data.
4.2 Comparative analysis of precision
Precision is measured as the proportion of student data records correctly classified as positive out of all records classified as positive.

$P = \frac{TP}{TP + FP} \times 100$ (10)

where $TP$ denotes true positives and $FP$ denotes false positives. Precision is calculated as a percentage (%).
Table II reports the performance analysis of precision with respect to the number of data records, which ranges from 1000 to 10000. As revealed in Table II, different precision results are obtained for each method. RBOOSEFS-BKBAC achieves better precision than the other conventional classification techniques, as confirmed through statistical examination. In the first run, the precision of the RBOOSEFS-BKBAC technique is 95.65%, while the precision of CLSA [1] and FWTS-CNN [2] is 93.71% and 93.02% respectively. Similarly, the remaining runs are performed with different numbers of student data records. The precision results exhibit that the RBOOSEFS-BKBAC technique achieves a higher precision by 2% compared to CLSA [1] and 3% compared to FWTS-CNN [2].
Figure 5 illustrates the precision of student dropout prediction for various amounts of data. From the graph, RBOOSEFS-BKBAC provides higher precision and a lesser error rate in verifying student dropout data compared with CLSA [1] and FWTS-CNN [2]. The RBOOSEFS-BKBAC technique analyzes and classifies the student data using Bivariate Kernelized Regressive Bootstrap Aggregative classification to improve accuracy.
4.3 Comparative analysis of recall
Recall is calculated as the proportion of true positives out of the sum of true positives and false negatives. The recall is estimated as given below,

$R = \frac{TP}{TP + FN} \times 100$ (11)

where $TP$ denotes true positives and $FN$ denotes false negatives. Recall is calculated as a percentage (%).
Table III and Figure 6 reveal the recall results for different numbers of data records varying from 1000 to 10000. The number of data records is plotted on the horizontal axis and the recall performance is observed on the vertical axis. As revealed in Figure 6, recall is improved using the RBOOSEFS-BKBAC technique, while the existing classification algorithms provide significantly lower performance. The experiments are conducted with 1000 student data records for the recall measurement. By applying RBOOSEFS-BKBAC, the recall rate is 97.77%, whereas the recall observed using CLSA [1] and FWTS-CNN [2] is 94.25% and 94.11% respectively. Similarly, nine further performance results are observed. The overall comparison results infer that the recall of data classification is considerably increased using the RBOOSEFS-BKBAC technique, by 2% and 3% compared with CLSA [1] and FWTS-CNN [2]. The significant reason is the increase in true positives and the reduction in false negatives.
4.4 Comparative analysis of F-measure
The F-measure is computed from precision and recall. It is formulated by,

$F\text{-}measure = 2 \times \frac{Precision \times Recall}{Precision + Recall}$ (12)

The F-measure is calculated as a percentage (%).
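The four evaluation metrics of Eqs. (9) through (12) can be computed directly from the confusion-matrix counts. The helper below is a sketch (the function name is illustrative); it returns all four values as percentages:

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int):
    """Accuracy, precision, recall, and F-measure, Eqs. (9)-(12), in percent."""
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100   # Eq. (9)
    precision = tp / (tp + fp) * 100                   # Eq. (10)
    recall = tp / (tp + fn) * 100                      # Eq. (11)
    f_measure = 2 * precision * recall / (precision + recall)  # Eq. (12)
    return accuracy, precision, recall, f_measure
```

For example, with tp=90, tn=85, fp=10, fn=15 this yields an accuracy of 87.5% and a precision of 90%.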
Table IV and Figure 7 present the F-measure results of the three methods: the RBOOSEFS-BKBAC technique, CLSA [1], and FWTS-CNN [2]. Among the three techniques, the performance of the RBOOSEFS-BKBAC technique is considerably higher than that of the existing methods. Considering 1000 student data records, the F-measure is calculated from precision and recall. By applying the RBOOSEFS-BKBAC technique, the obtained F-measure is 96.09%, whereas the F-measure of CLSA [1] and FWTS-CNN [2] is observed as 93.97% and 93.56% respectively. Ten different performance results are observed for the different methods with various amounts of input. The average of the ten comparisons indicates that the RBOOSEFS-BKBAC technique increases the F-measure by 2% and 3% compared to the existing methods CLSA [1] and FWTS-CNN [2].
4.5 Comparative analysis of prediction time
Prediction time refers to the amount of time consumed for predicting student dropout in the classification process. It is expressed as given below,

$PT = n \times t(\text{predicting one data record})$ (13)

where $PT$ denotes the prediction time and $n$ denotes the number of data records. Prediction time is calculated in milliseconds (ms).
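In practice, the prediction time of Eq. (13) can be measured by timing the classifier over the test records; a minimal wall-clock sketch, where the `predict` interface of the model is an assumption:

```python
import time

def prediction_time_ms(model, records) -> float:
    """Wall-clock time in milliseconds to predict labels for all records."""
    start = time.perf_counter()
    model.predict(records)
    return (time.perf_counter() - start) * 1000.0
```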
Table V presents the prediction time results of the three techniques, namely the RBOOSEFS-BKBAC technique, CLSA [1], and FWTS-CNN [2]. Among the three techniques, the prediction time of RBOOSEFS-BKBAC is lower than that of CLSA [1] and FWTS-CNN [2] for every number of data records. The observed statistical results confirm that the proposed RBOOSEFS-BKBAC technique outperforms the existing methods. As a result, the overall prediction time is reduced using the RBOOSEFS-BKBAC technique by 11% compared to [1] and 18% compared to [2] respectively.
Figure 8 demonstrates the prediction time of the three different algorithms, namely the RBOOSEFS-BKBAC technique, CLSA [1], and FWTS-CNN [2]. As shown in the graphical representation, the proposed RBOOSEFS-BKBAC outperforms the conventional classification approaches. The significant reason behind the improvement of the RBOOSEFS-BKBAC technique is the preprocessing, feature extraction, and feature selection. Rank biserial correlation is applied for feature extraction from the raw dataset. Then, Otsuka-Ochiai stochastic neighbor embedded feature selection is performed to select significant features for classification. With the selected features, classification is carried out by Bivariate Kernelized Regressive Bootstrap Aggregative classification to reduce time. As a result, the prediction time of the RBOOSEFS-BKBAC technique is comparatively minimized.
5. Conclusion
Student dropout prediction is one of the most important performance indicators of the educational environment. Reliable prediction is crucial because it promotes students' timely involvement in interactive activities. Therefore, a novel technique called RBOOSEFS-BKBAC is developed to perform reliable student dropout prediction with higher accuracy. At first, preprocessing is performed to transform the input dataset into a readable form. Next, the feature extraction process is performed by applying Rank biserial correlation. Then, the significant feature selection process is carried out by applying the Otsuka-Ochiai stochastic neighbor embedding technique. At last, student data classification is performed using Bivariate Kernelized Regressive Bootstrap Aggregative classification to increase classification accuracy with minimum error. The RBOOSEFS-BKBAC technique was evaluated through in-depth experimentation, and its results were compared with two conventional classification algorithms. The results prove that the proposed RBOOSEFS-BKBAC technique improves prediction accuracy, precision, and the other metrics over the conventional classification techniques.
References
[1] Yafeng Zheng, Zhanghao Gao, Yihang Wang, Qian Fu, “MOOC Dropout Prediction Using
FWTS-CNN Model Based on Fused Feature Weighting and Time Series”, IEEE Access,
Volume 8, 2020, Pages 225324 – 225335
[2] Yafeng Zheng, Zhanghao Gao, Yihang Wang, Qian Fu, “MOOC Dropout Prediction Using
FWTS-CNN Model Based on Fused Feature Weighting and Time Series”, IEEE Access,
Volume 8, 2020, Pages 225324 – 225335
[3] Pedro Manuel Moreno-Marcos, Pedro J. Muñoz-Merino, Jorge Maldonado-Mahauad, Mar
Pérez-Sanagustín, Carlos Alario-Hoyos, Carlos Delgado Kloos, “Temporal analysis for dropout
prediction using self-regulated learning strategies in self-paced MOOCs”, Computers &
Education, Elsevier, Volume 145, 2020, Pages 1-41
[4] Ahmed A. Mubarak, Han Cao, Ibrahim M. Hezam, “Deep analytic model for student dropout
prediction in massive open online courses”, Computers and Electrical Engineering, Volume 93,
2021, Pages 1-14
[5] Jing Chen, Jun Feng, Xia Sun, Nannan Wu, Zhengzheng Yang and Sushing Chen, “MOOC
Dropout Prediction Using a Hybrid Algorithm Based on Decision Tree and Extreme Learning
Machine”, Mathematical Problems in Engineering, Hindawi, Volume 2019, March 2019, Pages
1-11
[6] Francisco A. da S. Freitas, Francisco F. X. Vasconcelos, Solon A. Peixoto, Mohammad
Mehedi Hassan, M. Ali Akber Dewan and Victor Hugo C. de Albuquerque and Pedro P.
Rebouças Filho, “IoT System for School Dropout Prediction Using Machine Learning
Techniques Based on Socioeconomic Data”, Electronics, Volume 9, 2020, Pages 1-14
[7] Ahmed A. Mubarak, Han Cao & Weizhen Zhang, “Prediction of students’ early dropout
based on their interaction logs in online learning environment”, Interactive Learning
Environments, 2020, Pages 1-21
[8] Jingjing Zhang, Ming Gao, Jiang Zhang, “The learning behaviours of dropouts in MOOCs: A
collective attention network perspective”, Computers & Education, Elsevier, Volume 167, 2021,
Pages 1-15
[9] Michael G. Vaughn, Greg Roberts, Anna-Maria Fall, Kristen Kremer, Leticia Martinez,
“Preliminary validation of the dropout risk inventory for middle and high school students”,
Children and Youth Services Review, Volume 111, 2020, Pages 1-8
[10] Josep Figueroa-Cañas and Teresa Sancho-Vinuesa, “Early prediction of dropout and final
exam performance in an online statistics course”, IEEE Revista Iberoamericana de Tecnologias
del Aprendizaje , Volume 15, Issue 2, 2020, Pages 86 - 94
[11] Fernando Jiménez, Alessia Paoletti, Gracia Sánchez, Guido Sciavicco, “Predicting the Risk
of Academic Dropout with Temporal Multi-Objective Optimization”, IEEE Transactions on
Learning Technologies, Volume 12, Issue 2, 2019, Pages 225 – 236
[12] Janka Kabathova and Martin Drlik, “Towards Predicting Student’s Dropout in University
Courses Using Different Machine Learning Techniques”, Applied Science, Volume 11, 2021,
Pages 1-19
[13] Sunbok Lee and Jae Young Chung, “The Machine Learning-Based Dropout Early Warning
System for Improving the Performance of Dropout Prediction”, Applied Science, Volume 9,
2019, Pages 1-14
[14] Ali Çetinkaya and Ömer Kaan Baykan, “Prediction of middle school students’ programming
talent using artificial neural networks”, Engineering Science and Technology, an International
Journal, Elsevier, Volume 23, Issue 6, 2020, Pages 1301-1307
[15] Diego Olaya, Jonathan Vásquez, Sebastián Maldonado, Jaime Miranda, Wouter Verbeke,
“Uplift Modeling for preventing student dropout in higher education”, Decision Support
Systems, Elsevier, Volume 134, 2020, Pages 1-38
[16] Yuki Mae, Wataru Kumagai, Takafumi Kanamori, “Uncertainty propagation for dropout-
based Bayesian neural networks”, Neural Networks, Elsevier, Volume 144, 2021, Pages 394-406
[17] Nindhia Hutagaol and Suharjito, “Predictive Modelling of Student Dropout Using Ensemble
Classifier Method in Higher Education”, Volume 4, Issue 4, 2019, Pages 206-211
[18] Lin Qiu, Yanshen Liu, Quan Hu & Yi Liu, “Student dropout prediction in massive open
online courses by convolutional neural networks”, Soft Computing, Springer, Volume 23, 2019,
Pages 10287–10301
[19] Theodor Panagiotakopoulos, Sotiris Kotsiantis, Georgios Kostopoulos, Omiros Iatrellis, and
Achilles Kameas, “Early Dropout Prediction in MOOCs through Supervised Learning and
Hyperparameter Optimization”, Electronics, Volume 10, 2021, Pages 1-11
[20] Wanli Xing and Dongping Du, “Dropout Prediction in MOOCs: Using Deep Learning for
Personalized Intervention”, Journal of Educational Computing Research, Volume 57, Issue 3,
2019, Pages 1-24