Rank biserial Stochastic Feature Embed Bivariate Kernelized Regressive Bootstrap
Aggregative Classifier for School Student Dropout Prediction
C. Rajagopal¹, Dr. T. Balasubramanian²
¹Assistant Professor (Computer Science), Sri Vidya Mandir Arts and Science College, Katteri, Uthangarai - 636 902.
²Principal, Sri Vidya Mandir Arts and Science College, Katteri, Uthangarai - 636 902.
¹[email protected], ²[email protected]

Abstract
Early and accurate prediction of student dropout enables schools to identify at-risk students from available educational data. The dropout rate is a major concern of education administrators, who must estimate each student's dropout potential. A novel technique called Rank biserial Otsuka-Ochiai Stochastic Embedded Feature Selection based Bivariate Kernelized Regressive Bootstrap Aggregative Classifier (RBOOSEFS-BKBAC) is designed to improve student dropout prediction with higher accuracy and minimal time consumption. First, the proposed RBOOSEFS-BKBAC technique performs preprocessing, comprising data normalization, data cleaning, and duplicate data removal. After that, Rank biserial correlation is applied for feature extraction from the raw dataset. Then, Otsuka-Ochiai stochastic neighbor embedded feature selection is carried out to select the features significant for classification. Lastly, classification is achieved using the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique. The Bucklin voting scheme is applied to the Bootstrap Aggregative technique to increase prediction accuracy and minimize error. Simulation is performed with the Student-Drop-India2016 dataset using metrics such as prediction accuracy, precision, recall, F-measure, and prediction time. RBOOSEFS-BKBAC provides improved performance, achieving higher prediction accuracy in less time than conventional methods.
Keywords: School Student Dropout Prediction, preprocessing, Rank biserial correlation-based feature extraction, Otsuka-Ochiai stochastic neighbor embedded feature selection, Bivariate Kernelized Regressive Bootstrap Aggregative classification
1. Introduction
Predicting students at risk of dropout offers various benefits to educators, students, and educational institutions. It also alerts teachers to intervene when students are at risk and to help in time to improve students' performance. Recently, predictive models have been developed that use educational data as input features to predict students dropping out. A CLSA scheme integrating convolutional neural networks, LSTM, and attention was introduced in [1] based on limited data features. The designed model was not efficient enough to further enhance accuracy through deep feature analysis. A Feature-Weighted and Time-Series CNN (FWTS-CNN) was developed in [2] to predict student dropout. The designed method increases accuracy, but it failed to collect student learning behavior features to minimize time complexity.
Self-Regulated Learning (SRL) strategies were developed in [3] to measure their effect on student dropout prediction, but higher prediction accuracy was not achieved. CONV-LSTM was developed in [4] for predicting student dropout; however, significant feature selection was not carried out to reduce complexity.
A new hybrid algorithm called Decision Tree and Extreme Learning Machine (DT-ELM) was introduced in [5] for selecting features with better classification capability, but it consumes more time for data classification. Different machine learning techniques were developed in [6] for predicting dropouts based on socioeconomic data; however, the designed techniques were not applied to larger databases.
Logistic Regression and the Input-Output Hidden Markov Model (IOHMM) were developed in [7] for early dropout prediction; however, they failed to construct more interpretable models of students' behavior prediction. A collective attention network was developed in [8] for learning student behaviors to predict dropout; however, accurate prediction was not achieved.
Dropout risk validation was performed in [9] for the prediction of school students' behavior, but the designed validation was not applied to larger samples. A simple and interpretable procedure was introduced in [10] to predict dropout-prone and fail-prone students; however, it did not include an analysis of the efficiency of academic results and motivation.
An innovative temporal optimization approach was designed in [11] for reliable dropout prediction, but a time complexity analysis was not performed. Several machine learning classifiers were developed in [12] to predict possible dropout cases; however, they failed to predict a student's risk of dropping out based on similar students' previous achievements. In [13], a machine learning-based system was developed to enhance dropout prediction, but it failed to improve the prediction model by using significant features.
An artificial neural network (ANN) algorithm was developed in [14] for predicting school students' dropout, but the designed algorithm did not use a more extensive and varied sample of student data to achieve higher prediction accuracy. An uplift modeling framework was designed in [15] to enhance the effectiveness of dropout prediction for higher education institutions, but no machine learning model was applied to improve dropout prediction performance. Bayesian neural networks were introduced in [16] for dropout prediction, but the network is not efficient in minimizing prediction complexity.
An ensemble stacking classification method was developed in [17] to improve the accuracy of dropout predictions, but it failed to perform accurate prediction when the number of correlated feature variations and the dataset size increased. An end-to-end method using a CNN was introduced in [18] to predict student dropout based on feature extraction and classification, but it failed to enhance dropout prediction with minimum time. A supervised machine learning algorithm with hyperparameter optimization was introduced in [19] for student dropout prediction, but its accuracy was not enhanced. A deep learning algorithm was developed in [20] to build a dropout prediction model, but it failed to produce more accurate dropout prediction models.
1.1 Contribution
To enhance student dropout prediction accuracy, the RBOOSEFS-BKBAC technique is introduced based on four different processes, namely preprocessing, feature extraction, feature selection, and classification.
The RBOOSEFS-BKBAC technique performs preprocessing, Rank biserial correlation-based feature extraction, and Otsuka-Ochiai stochastic neighbor embedded feature selection. With the selected features, classification is performed to reduce student dropout prediction time.
Bivariate Kernelized Regressive Bootstrap Aggregative classification is applied in the RBOOSEFS-BKBAC technique to analyze the selected features and classify the data. The Bucklin voting scheme is applied to the Bootstrap Aggregative classification to minimize incorrect classification and improve accuracy.
Finally, extensive experimentation is performed to evaluate RBOOSEFS-BKBAC against existing methods. The experimental results show that RBOOSEFS-BKBAC is more effective than the existing methods.
The article is organized as follows: Section 2 presents the proposed RBOOSEFS-BKBAC design and all its steps, including the algorithms. Section 3 presents the experimental setup, and the analysis is presented in Section 4. The last section concludes the paper.
2. Methodology
Student dropout prediction helps education sectors identify at-risk students in a timely way. Accurate dropout prediction is still a serious issue due to its impact on decreasing higher education values and on the social environment. Therefore, a novel technique called RBOOSEFS-BKBAC is introduced to improve student dropout prediction through different processes. The RBOOSEFS-BKBAC technique not only improves prediction performance but also reduces time consumption.
Fig 1 shows RBOOSEFS-BKBAC performing accurate dropout prediction through four different processes, namely preprocessing, feature extraction, feature selection, and classification. Initially, a dataset comprising a feature set and a number of input data records is collected. The RBOOSEFS-BKBAC technique first performs data preprocessing. After that, the feature extraction process is carried out using a correlation function. Thirdly, feature selection is performed to improve classification performance and minimize time consumption. Finally, classification is performed to predict dropout. RBOOSEFS-BKBAC is explained in the subsequent subsections.
2.1 Preprocessing
Data preprocessing in the proposed RBOOSEFS-BKBAC technique refers to the initial steps applied to make the data more suitable for classification. The preprocessing comprises different processes, namely data cleaning, normalization, and transformation. Raw data collected from a real-world environment is highly vulnerable to missing values and contains many errors, which reduce accuracy. Therefore, preprocessing is an essential step for minimizing error and increasing accuracy.
Consider the input data and its set of features. First, missing values are handled by substituting the mean value of the respective feature in the database. Then data transformation is performed based on normalization, which is done through feature scaling. The data normalization is performed as given below,

$d_n = \frac{d - d_{\min}}{d_{\max} - d_{\min}}$ (1)

where $d_n$ indicates the normalized value, $d$ indicates the original feature value, $d_{\max}$ denotes the maximum feature value, and $d_{\min}$ denotes the minimum feature value. Finally, duplicate data are removed from the dataset.
Algorithm 1 illustrates data preprocessing. First, the number of features and the raw data are collected from the dataset. Then missing values are imputed and the feature values are normalized. Finally, duplicate data are removed from the dataset to obtain the preprocessed data.
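The preprocessing steps described above (mean imputation, min-max normalization per Eq. (1), and duplicate removal) can be sketched in Python, the language used for the simulations. This is a minimal illustrative sketch, not the authors' implementation; the function name and pandas-based design are assumptions:

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Impute missing values, min-max normalize, and drop duplicates."""
    df = df.copy()
    numeric = df.select_dtypes("number").columns
    # Replace each missing value with the mean of its feature
    df[numeric] = df[numeric].fillna(df[numeric].mean())
    # Min-max feature scaling: (d - d_min) / (d_max - d_min), Eq. (1)
    spans = (df[numeric].max() - df[numeric].min()).replace(0, 1)
    df[numeric] = (df[numeric] - df[numeric].min()) / spans
    # Remove duplicate records
    return df.drop_duplicates().reset_index(drop=True)
```

Constant columns are guarded with a span of 1 to avoid division by zero; any categorical cleaning the paper performs would happen before this step.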
2.2 Rank biserial correlative feature extraction
After data preprocessing, the proposed RBOOSEFS-BKBAC technique performs significant feature extraction to enhance prediction accuracy and minimize prediction time. Relevant and irrelevant features are identified for the next stage through a correlation measure. Feature extraction is the procedure of identifying features to reduce the dimensionality of the dataset. The proposed RBOOSEFS-BKBAC employs the Rank biserial correlation function to find correlated features.
The rank-biserial correlation coefficient is a statistical measure employed to identify correlated features from the dataset. In order to detect important features, the correlation between features is calculated.
Consider 'm' features in the big dataset,

$F = \{f_1, f_2, \ldots, f_m\}, \quad F \subseteq DS$ (2)

where $f_i$ indicates the feature at position $i$ and $DS$ symbolizes the student dataset. By applying the rank-biserial correlation coefficient, the relationship between the features is estimated as given below,

$R_{bs} = \frac{2\,(\bar{R}_{f_1} - \bar{R}_{f_2})}{n}$ (3)

where $R_{bs}$ indicates the rank-biserial coefficient, $\bar{R}_{f_1}$ and $\bar{R}_{f_2}$ represent the mean ranks of the two features in the dataset, and $n$ indicates the number of ranked observations. The correlation result is thresholded to the binary values '1' and '0'. The binary value '1' represents features that are correlated and kept for further processing; the binary value '0' represents features that are not correlated. The correlated features are given as input to the next process in order to decrease complexity. The algorithmic process of Rank biserial correlative feature extraction is described as follows,
Algorithm 2 explains Rank biserial correlative feature extraction. The rank-biserial correlation coefficient is applied to find the correlated features. The highly correlated features are extracted and the other features are removed. As a result, the complexity of prediction is minimized.
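A minimal sketch of rank-biserial scoring follows, computed here in the Glass form between each numeric feature and the binary dropout label and then thresholded to a keep/drop decision, as the extraction step above describes. The threshold value and function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.stats import rankdata

def rank_biserial(feature: np.ndarray, label: np.ndarray) -> float:
    """Glass rank-biserial correlation between a numeric feature
    and a binary (0/1) label: 2 * (mean rank diff) / n, Eq. (3)."""
    ranks = rankdata(feature)          # average ranks, ties handled
    n = len(feature)
    m1 = ranks[label == 1].mean()
    m0 = ranks[label == 0].mean()
    return 2.0 * (m1 - m0) / n

def extract_features(X: np.ndarray, y: np.ndarray, threshold: float = 0.1):
    """Keep column indices whose |rank-biserial| exceeds the threshold."""
    return [j for j in range(X.shape[1])
            if abs(rank_biserial(X[:, j], y)) > threshold]
```

A feature that perfectly separates the two label groups scores 1.0; a constant feature scores 0 and is dropped.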
2.3 Otsuka-Ochiai stochastic neighbor embedded feature selection
After feature extraction, the feature selection procedure is performed in the proposed RBOOSEFS-BKBAC technique to choose the most significant features and reduce dropout prediction time. The Otsuka-Ochiai stochastic neighbor embedding technique is used for feature selection. For the given set of extracted features, the Otsuka-Ochiai similarity is used to select significant features and remove the other features. The similarity between the features is estimated as given below,
(4)
Where ‘ ’ indicates an Otsuka-Ochiai similarity coefficient, indicates
a mutual dependence between the features , represents a squared score of ,
indicates a squared score of . Otsuka-Ochiai coefficient offers similarity value from 0 to 1. If
similarity is higher, then the feature is said to be a significant feature for prediction. Otherwise,
the features are not significant features from the dataset to minimize the complexity. The
algorithm process of Otsuka-Ochiai stochastic neighbor embedded feature selection is described
as given below,
Algorithm 3 illustrates the feature selection carried out through different processes. At first, the number of features is collected. After that, the similarity between each pair of features is measured. If the similarity is higher than the threshold, the feature is important; otherwise, the feature is unimportant. The important features are selected for classification. This helps to minimize time consumption.
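The pairwise similarity test above can be sketched as follows. The Ochiai coefficient here is the cosine-style form of Eq. (4); the selection rule (keep a feature whose similarity to at least one other feature exceeds the threshold) follows the text, while the threshold value and names are assumptions:

```python
import numpy as np

def ochiai(a: np.ndarray, b: np.ndarray) -> float:
    """Otsuka-Ochiai coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2))."""
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def select_features(X: np.ndarray, threshold: float = 0.5):
    """Keep a feature when its Ochiai similarity to at least one
    other feature exceeds the threshold."""
    m = X.shape[1]
    keep = []
    for i in range(m):
        sims = [ochiai(X[:, i], X[:, j]) for j in range(m) if j != i]
        if max(sims, default=0.0) > threshold:
            keep.append(i)
    return keep
```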
2.4 Bivariate Kernelized Regressive Bootstrap Aggregative classification
Lastly, classification is achieved in the proposed RBOOSEFS-BKBAC technique using Bivariate Kernelized Regressive Bootstrap Aggregative classification. Bootstrap aggregating is designed to enhance the stability of statistical classification and regression. The ensemble algorithm performs accurate classification by constructing weak learners; a single weak learner rarely offers perfect results, but the ensemble technique minimizes the error.
Figure 2 illustrates the overall structural design of the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique for obtaining an accurate prediction. The ensemble technique takes the training data as input and uses a number of weak learners to classify the given input data. The proposed technique uses a bivariate regression tree to analyze the data and classify it into different classes. The bivariate regression tree consists of various nodes: the root node performs the feature analysis, every branch node indicates an outcome of the analysis, and every leaf node denotes a class label. Here, 'bivariate' indicates that the regression function provides a two-class label.
Figure 3: Bivariate kernelized regression tree

Figure 3 illustrates the bivariate regression tree, which consists of one root node linked to the child nodes through branch nodes. In the root node, the analysis is performed between the training data and the mean of each class. The radial basis function kernel is applied to measure the similarity between the data and the mean, as given below,

$K(d_i, \mu_c) = \exp\!\left(-\frac{\|d_i - \mu_c\|^2}{2\sigma^2}\right)$ (5)

where $K$ denotes the kernel function, $d_i$ indicates the training data, and $\mu_c$ denotes the mean of the selected class. The kernel function offers a similarity value in [0, 1]. Depending on the similarity value, the data is assigned to the selected class and the result is obtained at a leaf node. The weak classification results still contain some training errors; in order to increase the classification accuracy, the weak learner results are combined.
$Y = \sum_{i=1}^{k} R_i$ (6)

From (6), $Y$ indicates the combined classification result and $R_i$ denotes the result of the $i$-th weak classifier. After combining the weak learners, the error rate is estimated between the actual and predicted classification. The error rate is formulated as given below,

$E = (C_a - C_o)^2$ (7)

where $E$ denotes the error rate, $C_a$ indicates the actual classification result, and $C_o$ denotes the observed classification result. Using Bucklin selection, the weak learners are ranked based on their error values; the results with minimum error are ranked before the others. The results with higher error are removed from further processing, which minimizes complexity and time consumption. Bucklin voting is then applied to find the majority vote of the classified results.
$F = \arg\max_{c} \, MV(c)$ (8)

where $F$ indicates the final ensemble classification outcome, $\arg\max$ denotes the argument of maximum utility, and $MV(c)$ denotes the majority vote for class $c$ among the weak classifiers. In other words, the result receiving the majority of votes is taken as the final classification result. Thus, every data record is correctly assigned to its class with high accuracy and lower error. The algorithmic process of the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique is described as given below,
Algorithm 4 illustrates the Bivariate Kernelized Regressive Bootstrap Aggregative classifier used to obtain higher classification accuracy. The bagging ensemble classification technique uses a number of weak learners to classify the data with the bivariate kernelized regression tree. The root node of the regression tree compares the data with the class mean values, and data with higher similarity is categorized into the corresponding class. The weak learners' results are combined and the voting scheme is applied; for each weak learner, the error rate is measured and learners with lesser error are preferred. Finally, the class with the maximum votes among the weak learners is returned as the final classification result.
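As a runnable illustration of the bagging-plus-majority-vote procedure described above, the sketch below substitutes scikit-learn's decision tree for the paper's bivariate kernelized regression tree; the class name, depth, and number of learners are assumptions, not the authors' configuration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class BaggedTreeClassifier:
    """Bootstrap-aggregated shallow trees with majority voting."""

    def __init__(self, n_learners: int = 10, seed: int = 0):
        self.n_learners = n_learners
        self.rng = np.random.default_rng(seed)
        self.learners = []

    def fit(self, X, y):
        n = len(X)
        for _ in range(self.n_learners):
            idx = self.rng.integers(0, n, n)          # bootstrap sample
            tree = DecisionTreeClassifier(max_depth=3)
            self.learners.append(tree.fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Each weak learner votes; the majority class wins, as in Eq. (8)
        votes = np.stack([t.predict(X) for t in self.learners])
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)
```

The paper additionally rank-filters the weak learners by error before voting (Bucklin selection); that step is omitted here for brevity.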
3. Experimental Scenario
In this section, the simulation of RBOOSEFS-BKBAC as well as CLSA [1] and FWTS-CNN [2] is implemented using Python. The Student-Drop-India2016 dataset (https://www.kaggle.com/imrandude/studentdropindia2016) is used for conducting the experiment. This dataset is used for student dropout prediction across the processes of preprocessing, feature extraction, feature selection, and classification. The dataset includes 15 attributes, such as continue_drop, student_id, and so on, with 19,100 instances. Among the 15 attributes, the significant features and data are selected for prediction.
4. Performance Results Analysis
In this section, RBOOSEFS-BKBAC, CLSA [1], and FWTS-CNN [2] are evaluated in terms of prediction accuracy, precision, recall, F-measure, and prediction time with respect to various numbers of data records.
4.1 Comparative analysis of prediction accuracy
Prediction accuracy is estimated as the proportion of student data records that are accurately classified out of the total number of student data records.

$PA = \frac{TP + TN}{TP + TN + FP + FN} \times 100$ (9)

where $TP$ denotes true positives, $TN$ indicates true negatives, $FP$ denotes false positives, and $FN$ denotes false negatives. Prediction accuracy is calculated as a percentage (%).
The experimental results of prediction accuracy using the RBOOSEFS-BKBAC technique and the existing methods, namely CLSA [1] and FWTS-CNN [2], are reported in Table I. The observed performance results of the proposed RBOOSEFS-BKBAC technique are compared with the two conventional methods for different numbers of student data records. In the first iteration, 1000 student data records are considered for conducting the experiment. By applying the RBOOSEFS-BKBAC technique, the prediction accuracy is 94%, whereas the prediction accuracy of the existing CLSA [1] and FWTS-CNN [2] is 90% and 89% respectively. The ten comparative outcomes show that student dropout prediction accuracy is increased by 4% using the RBOOSEFS-BKBAC technique compared to CLSA [1] and by 6% compared to FWTS-CNN [2].
Figure 4 demonstrates the prediction accuracy of the proposed and existing methods. As shown in Figure 4, the prediction accuracy, i.e. the proportion of accurately classified instances, is plotted on the y-axis, and the number of student data records given as input is plotted on the x-axis. The graphical plot illustrates that the RBOOSEFS-BKBAC technique provides superior performance over the conventional methods. This is due to the application of the Bivariate Kernelized Regressive Bootstrap Aggregative classification technique, with the Bucklin voting scheme applied to improve classification accuracy. This shows that the RBOOSEFS-BKBAC technique effectively and accurately detects the dropout instances in the data.
4.2 Comparative analysis of precision
Precision is measured as the proportion of student data records correctly classified as positive out of all records classified as positive.

$P = \frac{TP}{TP + FP} \times 100$ (10)

where $TP$ denotes true positives and $FP$ denotes false positives. Precision is calculated as a percentage (%).
Table II reports the performance analysis of precision with respect to the number of data records, which ranges from 1000 to 10000. As revealed in Table II, different precision results are obtained for each method. RBOOSEFS-BKBAC achieves better precision than the other conventional classification techniques, as confirmed through statistical examination. In the first run, the precision of the RBOOSEFS-BKBAC technique is 95.65%, while the precision of CLSA [1] and FWTS-CNN [2] is 93.71% and 93.02% respectively. Similarly, the remaining runs are performed with different numbers of student data records. The precision results exhibit that the RBOOSEFS-BKBAC technique achieves a higher precision by 2% compared to CLSA [1] and 3% compared to FWTS-CNN [2].
Figure 5 illustrates the precision of student dropout prediction for various amounts of data. From the graph, RBOOSEFS-BKBAC provides higher precision and a lesser error rate in verifying student dropout data compared with CLSA [1] and FWTS-CNN [2]. The RBOOSEFS-BKBAC technique analyzes and classifies the student data using Bivariate Kernelized Regressive Bootstrap Aggregative classification to improve accuracy.
4.3 Comparative analysis of recall
Recall is calculated as the proportion of true positives out of the sum of true positives and false negatives. The recall is estimated as given below,

$R = \frac{TP}{TP + FN} \times 100$ (11)

where $TP$ denotes true positives and $FN$ denotes false negatives. Recall is calculated as a percentage (%).
Table III and Figure 6 reveal the recall results for different numbers of data records varying from 1000 to 10000. The number of data records is plotted on the horizontal axis and the recall performance is observed on the vertical axis. As revealed in Figure 6, recall is improved using the RBOOSEFS-BKBAC technique, while the existing classification algorithms provide significantly lower performance. The experiments are conducted with 1000 student data records for the recall measurement. By applying RBOOSEFS-BKBAC, the recall rate is 97.77%, whereas the recall observed using CLSA [1] and FWTS-CNN [2] is 94.25% and 94.11% respectively. Similarly, nine further performance results are observed. The overall comparison results infer that the recall of data classification is considerably increased using the RBOOSEFS-BKBAC technique, by 2% and 3% compared with CLSA [1] and FWTS-CNN [2]. The significant reason is the increase in true positives and the reduction in false negatives.
4.4 Comparative analysis of F-measure
The F-measure is computed from precision and recall. It is formulated by,

$F\text{-}measure = 2 \times \frac{Precision \times Recall}{Precision + Recall}$ (12)

The F-measure is calculated as a percentage (%).
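The four evaluation metrics of Eqs. (9) through (12) can be computed directly from the confusion-matrix counts. The helper below is a sketch (the function name is illustrative); it returns all four values as percentages:

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int):
    """Accuracy, precision, recall, and F-measure, Eqs. (9)-(12), in percent."""
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100   # Eq. (9)
    precision = tp / (tp + fp) * 100                   # Eq. (10)
    recall = tp / (tp + fn) * 100                      # Eq. (11)
    f_measure = 2 * precision * recall / (precision + recall)  # Eq. (12)
    return accuracy, precision, recall, f_measure
```

For example, with tp=90, tn=85, fp=10, fn=15 this yields an accuracy of 87.5% and a precision of 90%.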
Table IV and Figure 7 present the F-measure results of the three methods: the RBOOSEFS-BKBAC technique, CLSA [1], and FWTS-CNN [2]. Among the three techniques, the performance of the RBOOSEFS-BKBAC technique is considerably higher than that of the existing methods. Considering 1000 student data records, the F-measure is calculated from precision and recall. By applying the RBOOSEFS-BKBAC technique, the obtained F-measure is 96.09%, whereas the F-measure of CLSA [1] and FWTS-CNN [2] is observed as 93.97% and 93.56% respectively. Ten different performance results are observed for the different methods with various amounts of input. The average of the ten comparisons indicates that the RBOOSEFS-BKBAC technique increases the F-measure by 2% and 3% compared to the existing methods CLSA [1] and FWTS-CNN [2].
4.5 Comparative analysis of prediction time
Prediction time refers to the amount of time consumed for predicting student dropout in the classification process. It is expressed as given below,

$PT = n \times t(\text{predicting one data record})$ (13)

where $PT$ denotes the prediction time and $n$ denotes the number of data records. Prediction time is calculated in milliseconds (ms).
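In practice, the prediction time of Eq. (13) can be measured by timing the classifier over the test records; a minimal wall-clock sketch, where the `predict` interface of the model is an assumption:

```python
import time

def prediction_time_ms(model, records) -> float:
    """Wall-clock time in milliseconds to predict labels for all records."""
    start = time.perf_counter()
    model.predict(records)
    return (time.perf_counter() - start) * 1000.0
```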
Table V presents the prediction time results of the three techniques, namely the RBOOSEFS-BKBAC technique, CLSA [1], and FWTS-CNN [2]. Among the three techniques, the prediction time of RBOOSEFS-BKBAC is lower than that of CLSA [1] and FWTS-CNN [2] for every number of data records. The observed statistical results confirm that the proposed RBOOSEFS-BKBAC technique outperforms the existing methods. As a result, the overall prediction time is reduced using the RBOOSEFS-BKBAC technique by 11% compared to [1] and 18% compared to [2] respectively.
Figure 8 demonstrates the prediction time of the three different algorithms, namely the RBOOSEFS-BKBAC technique, CLSA [1], and FWTS-CNN [2]. As shown in the graphical representation, the proposed RBOOSEFS-BKBAC outperforms the conventional classification approaches. The significant reason behind the improvement of the RBOOSEFS-BKBAC technique is the preprocessing, feature extraction, and feature selection. Rank biserial correlation is applied for feature extraction from the raw dataset. Then, Otsuka-Ochiai stochastic neighbor embedded feature selection is performed to select significant features for classification. With the selected features, classification is carried out by Bivariate Kernelized Regressive Bootstrap Aggregative classification to reduce time. As a result, the prediction time of the RBOOSEFS-BKBAC technique is comparatively minimized.
5. Conclusion
Student dropout prediction is one of the most important performance indicators of the educational environment. Reliable prediction is crucial because it promotes students' timely involvement in interactive activities. Therefore, a novel technique called RBOOSEFS-BKBAC is developed to perform reliable student dropout prediction with higher accuracy. At first, preprocessing is performed to transform the input dataset into a readable form. Next, the feature extraction process is performed by applying Rank biserial correlation. Then, the significant feature selection process is carried out by applying the Otsuka-Ochiai stochastic neighbor embedding technique. At last, student data classification is performed using Bivariate Kernelized Regressive Bootstrap Aggregative classification to increase classification accuracy with minimum error. The RBOOSEFS-BKBAC technique was evaluated through in-depth experimentation, and its results were compared with two conventional classification algorithms. The results prove that the proposed RBOOSEFS-BKBAC technique improves prediction accuracy, precision, and the other metrics over the conventional classification techniques.
References
[1] Yafeng Zheng, Zhanghao Gao, Yihang Wang, Qian Fu, “MOOC Dropout Prediction Using
FWTS-CNN Model Based on Fused Feature Weighting and Time Series”, IEEE Access,
Volume 8, 2020, Pages 225324 – 225335
[2] Yafeng Zheng, Zhanghao Gao, Yihang Wang, Qian Fu, “MOOC Dropout Prediction Using
FWTS-CNN Model Based on Fused Feature Weighting and Time Series”, IEEE Access,
Volume 8, 2020, Pages 225324 – 225335
[3] Pedro Manuel Moreno-Marcos, Pedro J. Muñoz-Merino, Jorge Maldonado-Mahauad, Mar
Pérez-Sanagustín, Carlos Alario-Hoyos, Carlos Delgado Kloos, “Temporal analysis for dropout
prediction using self-regulated learning strategies in self-paced MOOCs”, Computers &
Education, Elsevier, Volume 145, 2020, Pages 1-41
[4] Ahmed A. Mubarak, Han Cao, Ibrahim M. Hezam, “Deep analytic model for student dropout
prediction in massive open online courses”, Computers and Electrical Engineering, Volume 93,
2021, Pages 1-14
[5] Jing Chen, Jun Feng, Xia Sun, Nannan Wu, Zhengzheng Yang and Sushing Chen, “MOOC
Dropout Prediction Using a Hybrid Algorithm Based on Decision Tree and Extreme Learning
Machine”, Mathematical Problems in Engineering, Hindawi, Volume 2019, March 2019, Pages
1-11
[6] Francisco A. da S. Freitas, Francisco F. X. Vasconcelos, Solon A. Peixoto, Mohammad
Mehedi Hassan, M. Ali Akber Dewan and Victor Hugo C. de Albuquerque and Pedro P.
Rebouças Filho, “IoT System for School Dropout Prediction Using Machine Learning
Techniques Based on Socioeconomic Data”, Electronics, Volume 9, 2020, Pages 1-14
[7] Ahmed A. Mubarak, Han Cao & Weizhen Zhang, “Prediction of students’ early dropout
based on their interaction logs in online learning environment”, Interactive Learning
Environments, 2020, Pages 1-21
[8] Jingjing Zhang, Ming Gao, Jiang Zhang, “The learning behaviours of dropouts in MOOCs: A
collective attention network perspective”, Computers & Education, Elsevier, Volume 167, 2021,
Pages 1-15
[9] Michael G. Vaughn, Greg Roberts, Anna-Maria Fall, Kristen Kremer, Leticia Martinez,
“Preliminary validation of the dropout risk inventory for middle and high school students”,
Children and Youth Services Review, Volume 111, 2020, Pages 1-8
[10] Josep Figueroa-Cañas and Teresa Sancho-Vinuesa, “Early prediction of dropout and final
exam performance in an online statistics course”, IEEE Revista Iberoamericana de Tecnologias
del Aprendizaje , Volume 15, Issue 2, 2020, Pages 86 - 94
[11] Fernando Jiménez, Alessia Paoletti, Gracia Sánchez, Guido Sciavicco, “Predicting the Risk
of Academic Dropout with Temporal Multi-Objective Optimization”, IEEE Transactions on
Learning Technologies, Volume 12, Issue 2, 2019, Pages 225 – 236
[12] Janka Kabathova and Martin Drlik, “Towards Predicting Student’s Dropout in University
Courses Using Different Machine Learning Techniques”, Applied Science, Volume 11, 2021,
Pages 1-19
[13] Sunbok Lee and Jae Young Chung, “The Machine Learning-Based Dropout Early Warning
System for Improving the Performance of Dropout Prediction”, Applied Science, Volume 9,
2019, Pages 1-14
[14] Ali Çetinkaya and Ömer Kaan Baykan, “Prediction of middle school students’ programming
talent using artificial neural networks”, Engineering Science and Technology, an International
Journal, Elsevier, Volume 23, Issue 6, 2020, Pages 1301-1307
[15] Diego Olaya, Jonathan Vásquez, Sebastián Maldonado, Jaime Miranda, Wouter Verbeke,
“Uplift Modeling for preventing student dropout in higher education”, Decision Support
Systems, Elsevier, Volume 134, 2020, Pages 1-38
[16] Yuki Mae, Wataru Kumagai, Takafumi Kanamori, “Uncertainty propagation for dropout-
based Bayesian neural networks”, Neural Networks, Elsevier, Volume 144, 2021, Pages 394-406
[17] Nindhia Hutagaol and Suharjito, “Predictive Modelling of Student Dropout Using Ensemble
Classifier Method in Higher Education”, Volume 4, Issue 4, 2019, Pages 206-211
[18] Lin Qiu, Yanshen Liu, Quan Hu & Yi Liu, “Student dropout prediction in massive open
online courses by convolutional neural networks”, Soft Computing, Springer, Volume 23, 2019,
Pages 10287–10301
[19] Theodor Panagiotakopoulos, Sotiris Kotsiantis, Georgios Kostopoulos, Omiros Iatrellis, and
Achilles Kameas, “Early Dropout Prediction in MOOCs through Supervised Learning and
Hyperparameter Optimization”, Electronics, Volume 10, 2021, Pages 1-11
[20] Wanli Xing and Dongping Du, “Dropout Prediction in MOOCs: Using Deep Learning for
Personalized Intervention”, Journal of Educational Computing Research, Volume 57, Issue 3,
2019, Pages 1-24