Best Practices for Predicting Student Success

The document reviews the use of Educational Data Mining (EDM) and machine learning tools for predicting academic success in higher education, highlighting the importance of data preparation and modeling techniques. It discusses various factors influencing student success, including demographics and prior academic achievement, and emphasizes the need for effective data handling strategies to improve prediction accuracy. The study provides guidelines for implementing EDM techniques, which can be adapted for both undergraduate and graduate levels.

Uploaded by

shamitnibras9


Predicting Academic Success in Higher Education

Literature Review and Best Practices

• Data mining has a stack of open-source tools, such as machine learning tools, that support
the researcher in analyzing a dataset with several algorithms.
• Such tools are widely used for predictive analysis, visualization, and statistical modeling.
• WEKA is the most used tool for predictive modeling (Jayaprakash, 2018).
• This can be explained by its many pre-built tools for data pre-processing, classification,
association rules, regression, and visualization, as well as its user-friendliness and
accessibility even to a novice in programming or data mining.
Educational Data Mining (EDM) plays a significant role in discovering patterns of knowledge
about educational phenomena and the learning process, including understanding performance.
EDM has been used to predict a variety of crucial educational outcomes, such as performance,
retention, success, satisfaction, achievement, and dropout rate. Student success is a crucial
concern for higher education institutions because it is considered an essential criterion for
assessing their quality. There are several definitions of student success in the literature.
Despite reports calling for more detailed views of the term, the bulk of published research
measures academic success narrowly as academic achievement.
Academic achievement itself is mainly based on grades and Grade Point Average (GPA) or
Cumulative Grade Point Average (CGPA). Academic success has also been defined in relation to
students' persistence, also called academic resilience. Several studies have been published on
using data mining methods to predict students' academic success.
Prior academic achievement, student demographics, e-learning activity, psychological attributes,
and environment are the most commonly reported factors influencing student success. Among
demographics, gender, age, race/ethnicity, socioeconomic status, and the father's and mother's
background have been shown to be important. Psychological attributes are defined as the
interests and personal behavior of the student, and several studies have indicated their impact
on students' success. The design of a prediction model using data mining techniques requires
specifying many characteristics, such as the type of model to build and the methods and
techniques to apply.
This section defines these characteristics, provides examples of each, and reports how often
they occur among the reviewed papers, grouped by the target variable of the student success
prediction. Predicting success at the course level can be more accurate than at the degree or
year level: the best reported accuracy, 93%, was obtained at the course level. In that case the
target course was an advanced programming course and the influential factor was a previous
programming course that was also a prerequisite. All decisions that need to be taken at the
various stages are explained, along with a shortlist of best practices collected from the literature.
Data sources tend to be inconsistent, contain noise, and usually suffer from missing values. This
is why the raw data needs to go through an initial preparation consisting of 1) selection, 2)
cleaning, and 3) derivation of new variables. Data selection, also called "Dimensionality
Reduction", consists of vertical (attribute) selection and horizontal (instance) selection.
There are two strategies for dealing with missing data: listwise deletion and imputation.
Outliers, also known as anomalies, can often be identified by visual means. Once identified,
they can be removed from the modeling data. New variables can be derived from existing
variables by combining them.
For example, GPA is a common variable that can be obtained from the student information
system (SIS). Preliminary statistical analysis, especially through visualization, helps to better
understand the data before moving on to more sophisticated data mining tasks and algorithms.
Dedicated tools like STATISTICA and SPSS can also provide considerable insight.
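The cleaning and derivation steps described above can be illustrated with a minimal Python sketch. It is not taken from the reviewed study: the records, the field names (hs_score, credits, grade_points), the mean-based imputation, and the 1.5 x IQR outlier rule are all assumptions chosen for illustration, and the rule stands in for the visual inspection the text mentions.

```python
from statistics import mean, quantiles

# Hypothetical records: (student id, high-school score, credits, grade points).
# None marks a missing value; the score of 5.0 is a deliberate data-entry error.
RAW = [
    (1, 82.0, 15, 48.0), (2, None, 12, 42.0), (3, 75.0, 15, 45.0),
    (4, 91.0, 15, 57.0), (5, 78.0, 12, 36.0), (6, 5.0, 15, 51.0),
    (7, 72.0, 12, 30.0), (8, 80.0, 15, 54.0), (9, 85.0, 12, 45.0),
]
FIELDS = ("id", "hs_score", "credits", "grade_points")
records = [dict(zip(FIELDS, row)) for row in RAW]


def listwise_delete(rows, field):
    """Strategy 1: drop every record whose value for `field` is missing."""
    return [r for r in rows if r[field] is not None]


def impute_mean(rows, field):
    """Strategy 2: replace missing values with the mean of the observed ones."""
    fill = mean(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def remove_outliers_iqr(rows, field, k=1.5):
    """Keep records whose `field` lies within k * IQR of the quartiles."""
    q1, _, q3 = quantiles([r[field] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[field] <= high]


def add_gpa(rows):
    """Derive a new variable by combining two existing ones."""
    return [{**r, "gpa": r["grade_points"] / r["credits"]} for r in rows]


deleted = listwise_delete(records, "hs_score")   # student 2 is dropped
imputed = impute_mean(records, "hs_score")       # student 2 gets the mean score
cleaned = add_gpa(remove_outliers_iqr(deleted, "hs_score"))
print([r["id"] for r in cleaned])
```

Note that the imputed mean here is itself pulled down by the outlier, which is one reason the order of cleaning steps is a design decision rather than a fixed recipe.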
Data transformation is a necessary step to eliminate dissimilarities in the dataset. Normalizing
the data may improve the accuracy and efficiency of the mining algorithms. Discretization can
also increase the accuracy of the models by overcoming noisy data and by identifying outlier
values. Finally, discrete features are easier to understand, handle, and explain. It is common in
EDM applications that the dataset is imbalanced, meaning that the number of samples from one
class is significantly smaller than the number of samples from the other classes.
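As a rough illustration of the two transformations named above, the following Python sketch applies min-max normalization and a simple cut-point discretization. The GPA values, cut points, and labels are hypothetical, and min-max scaling is only one of several possible normalization schemes.

```python
def min_max_normalize(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(value, cut_points, labels):
    """Return the label of the first interval whose upper cut point exceeds value."""
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]  # value is at or above the last cut point


gpas = [2.0, 2.5, 3.0, 3.5, 4.0]  # hypothetical values
normalized = min_max_normalize(gpas)

# Hypothetical bins: below 2.5 is "low", below 3.5 is "medium", otherwise "high".
levels = [discretize(g, [2.5, 3.5], ["low", "medium", "high"]) for g in gpas]
print(normalized)
print(levels)
```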
Re-sampling is the solution of choice for imbalanced datasets. Feature selection aims to choose
a subset of attributes from the input data, reducing the effect of irrelevant variables while
preserving sufficient prediction quality. Two types of data mining models are commonly used in
EDM applications for success prediction. Descriptive models are used to produce patterns that
describe the fundamental structure, relations, and interconnectedness of the mined data.
Predictive models apply supervised learning functions to estimate the expected values of
dependent variables. Table 13 shows the recurrence of specific algorithms based on the
literature review that we performed.
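The re-sampling idea mentioned above can be sketched with random oversampling, which is only one of several re-sampling techniques and is not necessarily the one used in the reviewed papers. The function name, the seed, and the pass/fail data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


# Hypothetical imbalanced data: one GPA feature, 8 "pass" versus 2 "fail".
samples = [[3.6], [3.2], [3.9], [3.0], [3.4], [3.7], [2.9], [3.1], [1.8], [2.1]]
labels = ["pass"] * 8 + ["fail"] * 2

balanced_x, balanced_y = random_oversample(samples, labels)
print(Counter(balanced_y))
```

In practice, re-sampling is usually applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.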
There are various strategies for tuning the parameters of EDM algorithms in order to find the
best-performing values. Different performance measures are used to evaluate the model of each
classifier; almost all of them are based on the confusion matrix and the numbers in it. By
applying EDM techniques, it is possible to develop prediction models to improve student
success. Using data mining techniques can, however, be daunting and challenging for
nontechnical persons.
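To make the confusion-matrix-based measures mentioned above concrete, here is a minimal Python sketch that counts the four cells for a binary pass/fail prediction and derives accuracy, precision, recall, and F1 from them. The labels and predictions are hypothetical, and "fail" is treated as the positive class by assumption.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(tp, fp, fn, tn):
    """Derive common performance measures from the confusion-matrix cells."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical outcomes for ten students.
actual = ["fail"] * 3 + ["pass"] * 7
predicted = ["fail", "fail", "pass", "fail"] + ["pass"] * 6

counts = confusion_counts(actual, predicted, positive="fail")
result = scores(*counts)
print(counts, result)
```

With imbalanced classes, accuracy alone can look favorable while recall on the minority class stays low, which is why several measures are usually reported together.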
This study presents a clear set of guidelines to follow for using EDM for success prediction. The
study was limited to the undergraduate level; however, the same principles can be easily adapted
to the graduate level.
