
Volume 2, Issue 7, July 2012

ISSN: 2277 128X

International Journal of Advanced Research in Computer Science and Software Engineering
Research Paper
Available online at: [Link]

Prediction of Student Academic Performance by an Application of K-Means Clustering Algorithm
Md. Hedayetul Islam Shovon*, Mahfuza Haque
Computer Science and Engineering
Rajshahi University of Engineering and Technology
Bangladesh
Abstract: Data clustering is used to extract meaningful information and to uncover significant relationships among variables stored in large data sets and data warehouses. In this paper, the k-means clustering technique is applied to analyze students' learning behavior. Students' evaluation factors, such as class quizzes, mid-term and final exams, and assignments, are studied. It is recommended that this correlated information be conveyed to the class advisor before the final exam is conducted. This study will help teachers to reduce the drop-out ratio to a significant level and improve the performance of students.

Keywords: k-means, database, academic performance

I. INTRODUCTION
Data clustering is a process of extracting previously unknown, valid, potentially useful and hidden patterns from large data sets (Connolly, 1999). The amount of data stored in educational databases is increasing rapidly. Clustering is one of the most widely used techniques for future prediction. The main goal of clustering is to partition students into homogeneous groups according to their characteristics and abilities (Kifaya, 2009). Such applications can help both instructors and students to enhance the quality of education. This study uses cluster analysis to segment students into groups according to their characteristics.
II. DATA CLUSTERING
Data clustering is an unsupervised, statistical data analysis technique. It is used to classify similar data into homogeneous groups. It operates on a large data set to discover hidden patterns and relationships, which helps in making decisions quickly and efficiently. In short, cluster analysis segments a large set of data into subsets called clusters. Each cluster is a collection of data objects that are similar to one another but dissimilar to objects in other clusters.
III. CLUSTERING IN HIGHER EDUCATION
Education is an essential element for the progress and betterment of a country. Education develops a person so that he or she can participate in any progressive work for the country. Education makes a country civilized and well-mannered. Clustering in higher education means classifying students by their academic performance. A lack of deep and sufficient knowledge of the higher education system may prevent system management from achieving quality objectives; data clustering methodology can help bridge this knowledge gap in the higher education system.

2012, IJARCSSE All Rights Reserved

IV. PROPOSED MODEL


In a university, academic performance is measured by internal and external assessment. Internal assessments are class test marks, lab performance, assignments, quizzes and attendance. External assessments are the previous semester grade and the final semester grade. So, by taking the internal assessments and the previous exam grade and applying a data clustering technique, we can predict a student's final grade.
1. If prev-grade=high, quiz=good, assignment=complete, lab-performance=good, class-test=good and attendance=regular, then final-grade=good.

2. If prev-grade=average, quiz=good, assignment=incomplete, lab-performance=good, class-test=average and attendance=regular, then final-grade=average.

3. If prev-grade=low, quiz=average, assignment=incomplete, lab-performance=poor, mid-term=low and attendance=irregular, then final-grade=low.
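
The three rules above can be written as a small lookup, shown here as a minimal sketch and not as the authors' implementation. The names `StudentRecord`, `RULES` and `predict_final_grade` are hypothetical, and the sketch assumes that the "mid-term" attribute in rule 3 plays the same role as "class-test" in rules 1 and 2. A record that matches none of the three rules gets no prediction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StudentRecord:
    """Categorical attributes used by the three example rules (hypothetical type)."""
    prev_grade: str       # "high", "average" or "low"
    quiz: str             # "good" or "average"
    assignment: str       # "complete" or "incomplete"
    lab_performance: str  # "good" or "poor"
    class_test: str       # "good", "average" or "low" (assumed equal to "mid-term" in rule 3)
    attendance: str       # "regular" or "irregular"


# Each rule pairs the required attribute values with the predicted final grade.
RULES = [
    (StudentRecord("high", "good", "complete", "good", "good", "regular"), "good"),
    (StudentRecord("average", "good", "incomplete", "good", "average", "regular"), "average"),
    (StudentRecord("low", "average", "incomplete", "poor", "low", "irregular"), "low"),
]


def predict_final_grade(student: StudentRecord) -> Optional[str]:
    """Return the final grade of the first rule that matches exactly, else None."""
    for pattern, final_grade in RULES:
        if student == pattern:
            return final_grade
    return None


example = StudentRecord("average", "good", "incomplete", "good", "average", "regular")
print(predict_final_grade(example))
```

Exact matching keeps the sketch faithful to the rules as stated; a clustering step is what would generalise beyond these three attribute combinations.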

The proposed model tries to identify weak students before the final exam in order to save them from serious harm. Teachers can take appropriate steps at the right time to improve students' performance in the final exam.
V. K-MEANS CLUSTERING ALGORITHM


K-means is an old and widely used clustering technique. Here, k-means is applied to the processed data to obtain valuable information. The pseudocode of k-means clustering is given below.

Step 1: Accept the number of clusters to group the data into and the dataset to cluster as input values.
Step 2: Initialize the first k clusters:
- take the first k instances, or
- take a random sample of k elements.
Step 3: Calculate the arithmetic mean of each cluster formed in the dataset.
Step 4: K-means assigns each record in the dataset to only one of the initial clusters.
- Each record is assigned to the nearest cluster using a measure of distance (e.g. Euclidean distance).
Step 5: K-means re-assigns each record in the dataset to the most similar cluster and re-calculates the arithmetic mean of all the clusters in the dataset.

Fig. 1 Generalised pseudocode of traditional k-means.

Here, we cluster the students by GPA: 8.33% of the students have a GPA from 2.00 to 2.20, 16.67% from 2.20 to 3.00, 28.33% from 3.00 to 3.32, 25% from 3.32 to 3.56, and 21.67% from 3.56 to 4.00. The graphical representation of GPA and the percentage of students is given below.

Graph 2: Number and percentage of students by GPA
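
The steps in Fig. 1 can be sketched in a few lines of Python for one-dimensional data such as GPA. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the function name `kmeans_1d`, the `max_iter` limit, the rule of stopping when assignments no longer change, and the sample GPA values are all assumptions introduced here. Initialization uses the "first k instances" option from Step 2.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int, max_iter: int = 100) -> Tuple[List[float], List[int]]:
    """Cluster one-dimensional values into k groups with plain k-means."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    # Step 2: take the first k instances as the initial cluster centres.
    centres = list(values[:k])
    labels: List[int] = []
    for _ in range(max_iter):
        # Steps 4-5: assign each record to the nearest centre. In one
        # dimension the Euclidean distance is the absolute difference.
        new_labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the means would not change
        labels = new_labels
        # Step 3: recompute the arithmetic mean of each cluster.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = sum(members) / len(members)
    return centres, labels


# Hypothetical GPA values, not the data set used in the paper.
gpas = [2.1, 3.9, 3.0, 2.0, 3.8, 3.1, 2.2, 4.0, 2.9]
centres, labels = kmeans_1d(gpas, k=3)
print(centres, labels)
```

Taking the first k instances makes the result depend on the order of the records; the random-sampling option in Step 2 trades that dependence for run-to-run variation.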

VI. RESULTS AND DISCUSSION

The model produced the following results:
Graph 1: Relationship between GPA and attendance ratio.

A. Data Arrangement in tables


We grouped the students according to their final grades in several ways, three of which are:
- Assign as many labels as there are possible grades.
- Group the students into three classes: High, Medium and Low.
- Categorize the students with one of two class labels: Passed for a grade above 2.20 and Failed for a grade less than or equal to 2.20.
Table 1
Class   GPA         No. of students   Percentage
1       2.00-2.20   5                 8.33
2       2.20-3.00   10                16.67
3       3.00-3.32   17                28.33
4       3.32-3.56   15                25
5       3.56-4.00   13                21.67

Table 2
Class    GPA               No. of students   Percentage
High     >=3.50            28                46.67
Medium   2.20<=GPA<3.50    27                45
Low      <=2.20            5                 8.33

After clustering, we group the students into three categories: High, Medium and Low. A graphical representation of these three categories is given below:

Graph 3: Percentage of students with high, medium and low GPA
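
The percentage columns of Table 1 and Table 2 follow directly from the student counts, which total 60 in both tables. The short check below is a sketch added for illustration; the dictionary names and the `percentages` helper are hypothetical, and only the counts come from the tables.

```python
from typing import Dict

# Student counts copied from Table 1 and Table 2.
table1_counts = {"2.00-2.20": 5, "2.20-3.00": 10, "3.00-3.32": 17,
                 "3.32-3.56": 15, "3.56-4.00": 13}
table2_counts = {"High": 28, "Medium": 27, "Low": 5}


def percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert class counts to percentages of the total, rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


for name, counts in (("Table 1", table1_counts), ("Table 2", table2_counts)):
    print(name, "total students:", sum(counts.values()))
    for label, pct in percentages(counts).items():
        print(f"  {label}: {pct}%")
```

For example, 5 of 60 students is 8.33% and 28 of 60 is 46.67%, matching the first row of Table 1 and the High row of Table 2.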


VII. CONCLUSION AND FUTURE WORK

In this study, we apply a data mining process to a student database, using the k-means clustering algorithm to predict students' learning activities. We hope that the information generated by the data mining technique will be helpful for instructors as well as for students. This work may improve students' performance and reduce the failure ratio by enabling appropriate steps to be taken at the right time to improve the quality of education. For future work, we hope to refine our technique in order to obtain more valuable and accurate outputs that instructors can use to improve students' learning outcomes.
REFERENCES

[1] Alaa el-Halees (2009) Mining Students Data to Analyze e-Learning Behavior: A Case Study.
[2] [Link]., (2003) Predicting Student Performance: An Application of Data Mining Methods with the Educational Web-Based System LON-CAPA. 2003 IEEE, Boulder, CO.
[3] Connolly T., C. Begg and A. Strachan (1999) Database Systems: A Practical Approach to Design, Implementation, and Management (3rd Ed.). Harlow: Addison-Wesley. 687
[4] Erdogan and Timor (2005) A data mining application in a student database. Journal of Aeronautic and Space Technologies, July 2005, Volume 2, Number 2 (53-57).
[5] [Link] (2007) Examining online learning processes based on log files analysis: a case study. Research, Reflection and Innovations in Integrating ICT in Education.
[6] Henrik (2001) Clustering as a Data Mining Method in a Web-based System for Thoracic Surgery: 2001
[7] Han, J. and Kamber, M. (2006) "Data Mining: Concepts and Techniques", 2nd edition. The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor.
[8] Kifaya (2009) Mining student evaluation using associative classification and clustering. Communications of the IBIMA, vol. 11, ISSN 1943-7765.
[9] ZhaoHui. Maclennan.J, (2005). Data Mining with SQL Server 2005. Wiley Publishing, Inc.
