Role Of Data Mining in Education for Improving Students Performance for
Social Change
This paper attempts to cover every area of educational data mining (EDM). A model for predicting
student success is also provided and assessed.
The purpose of this study is to identify the factors that influence students' choice of an area of
study in higher education. Predictive tools and methods will be created to foresee students'
behavior, attitudes, and performance in order to guide students' higher education choices. Early
prediction of student performance helps improve student achievement.
Students and alumni in higher education encounter significant hurdles. One efficient way to address
these hurdles is data analysis and presentation. Data mining can provide an institution with
the knowledge it needs to take action before a student drops out. It can also help an institution
allocate resources more efficiently by accurately predicting how many students will attend a
given course. EDM is concerned with the many obstacles and problems associated with distinct parts
of the learning phenomenon.
EDM tasks are formulated using educational data from student profiles and knowledge modeling.
Applying different practices to different educational datasets brings several problems to light.
It is essential to select the problem formulation technique that matches the
desired research objectives. EDM attempts to uncover previously unknown patterns by
examining curriculum, learning behavior, and student data. Machine learning (ML) is a new area of data
mining that allows a computer program to grow increasingly accurate in predicting outcomes. ML
techniques are often divided into two types: supervised and unsupervised. Supervised learning
techniques employ labeled training data for inference, whereas unsupervised techniques look for
structure in unlabeled data.
The Nearest Neighbor (NN) classifier treats each sample as a data point in a d-dimensional space,
where d is the number of characteristics (features). The distance between the given test
example and every data point in the training set is computed. The test example is then classified
based on the class labels of its nearest neighbors. Random forest classification is also popular; one
advantage of this method is that it can be used for both classification and regression.
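The nearest-neighbor procedure described in this paragraph can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper: the function name knn_predict, the choice of Euclidean distance, the value k = 3, and the toy student data (attendance rate and average quiz score) are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training point (Euclidean, d dimensions).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' class labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (attendance rate, average quiz score) -> outcome.
train_points = [
    (0.95, 82.0), (0.90, 75.0), (0.88, 91.0),
    (0.40, 35.0), (0.55, 48.0), (0.30, 52.0),
]
train_labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

prediction = knn_predict(train_points, train_labels, (0.85, 78.0), k=3)
```

In this toy data the quiz score dominates the distance because the two features are on very different scales; in practice the features would normally be rescaled before distances are computed.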
A random forest classifier can also handle missing data and can model
categorical data. Random forest classifiers are used in medicine, finance, e-commerce, and stock
market analysis. The author of one cited study provides a method for classifying students to estimate
their final grade using attributes extracted from data logged by a web-based educational system. The
study creates, develops, and tests a collection of pattern classifiers, comparing their performance on
an online course dataset. With a genetic algorithm (GA), the accuracy of the combined classifier is
around 10 to 12 percent higher than without GA.
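The text above does not say how the genetic algorithm is applied to the combined classifier. Purely as an illustration, the sketch below assumes that the GA tunes the voting weights of several binary classifiers; the function names, the vote data, and the GA settings (population size, one-point crossover, Gaussian mutation) are all hypothetical and are not taken from the cited study.

```python
import random

def weighted_vote(weights, votes):
    """Combine binary votes (0/1) from several classifiers using weights."""
    score = sum(w for w, v in zip(weights, votes) if v == 1)
    return 1 if score > sum(weights) / 2 else 0

def accuracy(weights, all_votes, labels):
    """Fraction of samples where the weighted vote matches the true label."""
    hits = sum(weighted_vote(weights, v) == y for v, y in zip(all_votes, labels))
    return hits / len(labels)

def evolve_weights(all_votes, labels, pop_size=20, generations=30, seed=0):
    """Tiny genetic algorithm over voting weights (needs >= 2 classifiers)."""
    rng = random.Random(seed)
    n = len(all_votes[0])

    def fitness(w):
        return accuracy(w, all_votes, labels)

    # Start from uniform weights (plain majority vote) plus random candidates.
    population = [[1.0] * n]
    population += [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # selection: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                # mutate a single weight
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical votes of three classifiers (columns) for eight students (rows).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
all_votes = [
    (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1),
    (0, 1, 1), (0, 0, 1), (0, 0, 0), (0, 1, 0),
]

uniform_accuracy = accuracy([1.0, 1.0, 1.0], all_votes, labels)
best_weights = evolve_weights(all_votes, labels)
ga_accuracy = accuracy(best_weights, all_votes, labels)
```

Because the best half of each generation is carried over unchanged and the uniform weights are part of the initial population, ga_accuracy can never fall below uniform_accuracy on this data. The example scores the weights on the same samples it tunes them on, so it says nothing about the 10 to 12 percent figure reported above; a real comparison would use held-out data.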
The authors of another article used cluster analysis with the K-means algorithm to investigate the
association between university entrance examination results and student success. They applied
association rules and decision-tree classification rules, grouped the students using EM clustering,
and detected outliers in the data using outlier analysis. The knowledge gained was then used to
improve student performance.
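As a rough illustration of the K-means step mentioned above, the following sketch groups students by entrance examination score and a measure of later success. It is a minimal version of Lloyd's algorithm under stated assumptions: the function name kmeans, the seeding with the first k points, and the toy data (entrance score, first-year GPA) are hypothetical and do not come from the cited article.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    # Assumption: the first k points seed the centroids; real code would
    # usually pick random or k-means++ seeds instead.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, assignments

# Hypothetical data: (entrance exam score, first-year GPA) per student.
students = [
    (55.0, 2.1), (88.0, 3.7), (60.0, 2.4),
    (92.0, 3.9), (58.0, 2.0), (85.0, 3.5),
]
centroids, assignments = kmeans(students, k=2)
```

On this toy data the students with low entrance scores and low GPA end up in one cluster and the high scorers in the other, which is the kind of association between entrance results and success that the clustering is meant to expose.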
Higher education institutions are increasingly aware that they are in the service industry, with students
serving as the key customers. Early prediction of student performance assists in the implementation of
measures to increase student achievement. This paper tries to cover every aspect of educational data
mining. A model for predicting student achievement is also presented and evaluated.
Summary:
EDM intends to investigate previously unknown patterns after studying curriculum, learning behavior,
and student data. Data mining can provide an institution with the necessary information to respond
before a student drops out. It may help an institution allocate resources more efficiently by forecasting
how many students will enroll in a specific course. Random forest classification is often used because
it handles both classification and regression. The author proposes a method for classifying students in
order to estimate their final grade. Using a genetic algorithm (GA) makes the combined classifier
around 10 to 12 percent more accurate than without GA. Early prediction of student performance
facilitates the implementation of interventions that enhance student achievement.