Machine Learning Based Campus Placement
Predictor
A PROJECT REPORT
Submitted by
Srishti (22BCS11544)
in partial fulfillment for the award of the degree of
BACHELOR OF
ENGINEERING IN
Computer Science
Chandigarh University
October 2023
BONAFIDE CERTIFICATE
Certified that this project report “Campus placement prediction” is the
bonafide work of “Srishti” who carried out the project work under my
supervision.
SIGNATURE SIGNATURE
Dr. Er. Amit Kumar
Head Of the Department Supervisor
Submitted for the project viva voce examination held on
INTERNAL EXAMINER EXTERNAL EXAMINER
ACKNOWLEDGEMENT
It has been a privilege to complete this mini project. It is a great pleasure to express my sincere and
deep gratitude to my supervisor and guide, Er. Amit Kumar, for his valuable suggestions,
guidance, and constant support throughout the completion of this project, “Machine
learning based campus placement prediction”. This project, though done by me, would not have been
possible without the support of many people, whose cooperation helped me
complete it successfully. I am very thankful to Chandigarh University for giving
me the opportunity to build a project that can solve real-life problems
and to gain valuable hands-on experience along with crucial soft skills. I thank everyone involved in
this project for its successful completion.
Srishti (22BCS11544)
(Student B.E. CSE, 3rd semester)
TABLE OF CONTENTS
List of Figures
Abstract
CHAPTER 1. INTRODUCTION
1.1. Identification of Need of this Project
1.2. Identification of Problem
1.3. Timeline
1.4. Organization of the Report
CHAPTER 2. LITERATURE REVIEW/BACKGROUND STUDY
2.1. Review of the Literature
2.2. Existing Approaches
2.3. Critical Appraisal of Existing Approaches
CHAPTER 3. DESIGN FLOW/PROCESS
3.1. Concept Generation
3.2. Evaluation & Selection of Specifications/Features
3.3. Design Flow & Implementation Plan
3.4. Implementation Plan/Methodology
CHAPTER 4. RESULT AND VALIDATION
4.1. Objective Definition
4.2. Analyzing the Data
4.3. Future Scope
Abstract
Every educational institution relies on campus placement to help students reach
their goals. Machine learning classification can be used to retrieve relevant information from large student
datasets. In this study, a predictive model is developed that can predict the positions for which
students are eligible based on their past academic and extracurricular
achievements. The model will also recommend additional skills that will be needed for future
recruitment, which will aid students in their preparation for placement. It also presents
experimental results and findings, along with the performance metrics required for model
validation, supporting the achievement of outcome-based education at
educational institutions, which is a primary concern in the current context. Placement of
students is one of the most important objectives of an educational institution. The reputation and
yearly admissions of an institution invariably depend on the placements it provides its students.
That is why institutions strive to strengthen their placement departments so as to
improve the institution as a whole. Any assistance in this particular area will have a positive
impact on an institution's ability to place its students, which benefits both the
students and the institution. In this study, the objective is to analyze student data from previous
years and use it to predict the placement chances of current students. A model is proposed with
an algorithm to make this prediction. Data pertaining to the study were collected from the same
institution for which the placement prediction is done, and suitable data pre-processing
methods were applied. The proposed model is also compared with other traditional classification
algorithms such as Decision Tree and Random Forest with respect to accuracy, precision, and recall.
From the results obtained, it is found that the proposed algorithm performs significantly better in
comparison with the other algorithms mentioned.
CHAPTER 1
INTRODUCTION
1.1. Identification of Need of this Project
A campus placement prediction model using machine learning can be beneficial for several reasons:
1. Efficient hiring process: Such a model can streamline the campus recruitment
process by accurately predicting the likelihood of a student getting placed based on
their academic performance, skills, and other relevant factors. It helps recruiters
identify top candidates quickly and efficiently.
2. Resource optimization: By leveraging machine learning algorithms, the model can
help optimize the allocation of resources such as time, effort, and budget during
campus placements. Recruiters can focus on candidates with higher chances of
placement, increasing the overall efficiency of the hiring process.
3. Data-driven decision-making: A machine learning model can analyze historical
placement data, identify patterns, and generate insights that facilitate data-driven
decision-making. Recruiters can make informed choices based on the predictions
and improve the success rate of placement.
4. Improved student preparation: Students can benefit from the prediction model by
gaining insights into their chances of getting placed and understanding the areas they
need to improve. They can focus on enhancing their skills or addressing any
weaknesses to increase their employability.
5. Fair and unbiased selection: Using a machine learning model can help ensure a fair
and unbiased selection process. By considering various objective factors and
removing human biases, the model provides equal opportunities to all candidates,
promoting inclusivity and diversity in campus placements.
6. Early identification of talent: The model can identify talented individuals early in
their academic journey by analyzing their performance and potential. This allows
recruiters to establish connections and nurture relationships with promising
candidates before the actual placement season.
7. Customized career guidance: Machine learning models can analyze individual
student profiles and provide personalized career guidance based on their strengths,
weaknesses, and market trends. This tailored approach helps students make informed
decisions about their career paths and align their skills with industry requirements.
8. Enhanced employer branding: By leveraging advanced technology like machine
learning, organizations can showcase their commitment to innovation and data-
driven decision-making. This can enhance their employer branding and attract top
talent, as students perceive them as forward-thinking and technologically advanced
companies.
9. Efficient resource allocation for colleges: Colleges and educational institutions can
utilize the prediction model to allocate resources effectively. They can identify areas
where students need additional training or support, ensuring the curriculum aligns
with industry demands and maximizing the success rate of placements.
10. Continuous improvement and feedback loop: The model can contribute to
continuous improvement in the recruitment process by analyzing the outcomes and
providing feedback. Recruiters and institutions can use the insights to refine their
strategies, modify selection criteria, and enhance the overall effectiveness of campus
placements.
11. Time and cost savings: Predictive models reduce the time and effort spent on manual
screening and evaluation of a large number of candidates. This saves costs
associated with the recruitment process and enables recruiters to focus on other
value-added activities like building relationships and assessing soft skills during the
placement process.
12. Real-time updates and adaptability: Machine learning models can adapt and update
themselves based on real-time data and changing trends in the job market. This
ensures that predictions remain accurate and relevant, taking into account the
dynamic nature of the industry and evolving recruitment requirements.
By incorporating machine learning into the campus placement process, recruiters,
educational institutions, and students can benefit from improved efficiency, better
decision-making, and increased opportunities for successful placements.
1.2. Identification of Problem
The problem addressed by a campus placement prediction model using machine
learning is the uncertainty and inefficiency in the traditional campus recruitment
process:
1. Lack of accurate prediction: Traditional methods of evaluating candidates
for campus placements may not provide accurate predictions of their likelihood of
getting placed. This leads to uncertainty for both recruiters and students.
2. Time-consuming and resource-intensive process: Conducting interviews, assessments, and
evaluations for many candidates can be time-consuming and resource-intensive for
recruiters and educational institutions. It may result in inefficiencies and delays in
the placement process.
3. Subjectivity and bias: Human biases can influence the
selection process, leading to unfair advantages or disadvantages for certain
candidates. Objective evaluation criteria are needed to ensure fairness and equal
opportunities for all students.
4. Limited visibility of candidate potential: Recruiters may
not have sufficient visibility into the potential and capabilities of candidates beyond
their academic performance. They may miss out on talented individuals who possess
relevant skills but do not have outstanding academic records.
By addressing these problems, a campus placement prediction model using machine
learning aims to provide a more efficient, data-driven, and fair process for both
recruiters and students. It helps in accurate prediction of placements, resource
optimization, reduction of biases, personalized guidance, and overall improvement of
the campus recruitment experience.
1.3 TIMELINE
1.4. Organization of the Report
Chapter 1 Problem Identification: This chapter introduces the project and describes
the problem statement.
Chapter 2 Literature Review: In this chapter, we review a number of research studies
that improve our understanding of the issue. It also outlines what has already been
done to address the issue and what can be done moving forward.
Chapter 3 Design Flow/Process: Based on a review of the literature, this chapter
discusses the necessity and importance of the proposed task. The suggested goals and
possible approaches are described, which illustrates the problem's relevance. The chapter
also presents a logical and schematic method for resolving the research issue.
Chapter 4 Outcome Analysis and Validation: This chapter describes the various
implementation performance parameters. This chapter presents the experimental
findings. It provides an explanation of the significance of the findings.
Chapter 5 Conclusion and Future: In this chapter, the findings are summarized, the
best approach for carrying out the research is discussed in order to obtain the best
results, and the future study's objectives are stated, indicating how deeply the research
topic will be examined.
CHAPTER 2
LITERATURE REVIEW/BACKGROUND STUDY
2.1 Review of the literature:
[1] Liu, Yang, et al. "The Application of Machine Learning
Techniques in College Students Information System." 2018 International Conference on
Computer Science, Electronics and Communication Engineering (CSECE 2018).
Atlantis Press, 2018.
According to this work, the best methods for the ML model would be the Random Forest classifier and the
CatBoost algorithm, used to validate the approaches. The algorithms are applied to the
dataset and the attributes used to build the model.
[2] Ishizue, Ryosuke, et al. "Student placement and skill ranking predictors for
programming classes using class attitude, psychological scales, and code metrics."
Research and Practice in Technology Enhanced Learning 13.1 (2018): 1-20.
This paper shows that, without examinations, machine learning algorithms can also
be applied to analyze student placement based on factors such as psychological
scales, programming tasks, and student-answered questionnaires. A decision tree classification
model gave an F-measure of 0.912 when 9 explanatory variables were
used, whereas the best ranking model, SVM-rank, achieved a normalized discounted
cumulative gain (NDCG) of 0.962 when 20 explanatory variables were used.
[3] Ahmed, S., Zade, A., Gore, S., Gaikwad, P., Kolhal, M. “Performance Based
Placement Prediction System.” IJARIIE-ISSN (O) - 4(3) 2018: 2395-4396.
This paper focuses on the use of data mining (DM) techniques in the field of education. A
TPO management system was designed that could identify eligible students for a
campus drive. The C4.5 decision tree algorithm was applied to the company's previous-year
data and current requirements. This is intended to benefit students, since the
model would send notifications to eligible candidates, thereby helping them
know whether they qualify and prepare in
time for the campus drive. The attributes used for the study were academic
history such as percentage marks, number of backlogs, programming skills, communication
skills, logical aptitude, and interest.
[4] Raut, A. B., et al. "Students performance prediction using decision tree." Int. J. Comput.
Intell. Res. 13(7) 2017.
This paper highlights the use of data mining for predicting student performance in a
specific subject using the C4.5 decision tree algorithm. The need for
supportive guidance of students to help prevent
academic risks was also highlighted.
[5] Goyal, J., et al. "Placement Prediction Decision Support System using Data Mining."
International Journal of Engineering and Techniques, 4(2) 2017.
In this paper, the author designed a placement prediction decision support system
with the help of data mining algorithms. The model helped in
finding the placement probability and also supported predicting the number of
rounds the student could clear. Naïve Bayes and Improved Naïve Bayes
were considered for the study. The WEKA and NetBeans tools were used for
data analysis. Results showed that Improved Naïve Bayes gave an accuracy of
84.7%, compared with 80.96% for Naïve Bayes, on a dataset of 560
samples.
2.2 Possible approaches :
Logistic Regression: Logistic regression is a statistical method used to analyze a dataset
in which there are one or more independent variables that determine an outcome. In the
case of campus placement prediction, the independent variables could be the student's
academic performance, skills, and other relevant factors, while the outcome could be
whether or not the student gets placed.
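As an illustration of the idea, the following minimal sketch fits a logistic regression on a tiny hand-made dataset. The two features (B.Tech % and number of backlogs), the labels, and the variable names are assumptions chosen for illustration and are not the data used in this project. The model estimates the probability of placement as a sigmoid of a weighted sum of the inputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one row per student -> [B.Tech %, number of backlogs].
X = np.array([[55, 4], [60, 3], [62, 2], [68, 2],
              [72, 1], [75, 0], [81, 0], [88, 0]])
# 1 = placed, 0 = not placed (made-up labels, for illustration only).
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# predict_proba returns [P(not placed), P(placed)] for each input row.
new_student = np.array([[70, 1]])
prob_placed = model.predict_proba(new_student)[0, 1]
print(f"Estimated placement probability: {prob_placed:.2f}")
```

The output is a probability rather than a hard yes/no, which is convenient when students are to be ranked by their chance of placement.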
Decision Trees: Decision trees are a popular machine learning algorithm used for
classification and prediction. They work by recursively partitioning the data into subsets
based on the values of the independent variables until a decision is reached. In the case
of campus placement prediction, decision trees could be used to predict whether or not a
student will get placed based on their academic performance, skills, and other relevant
factors.
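The recursive partitioning described above can be made visible by printing the rules a fitted tree has learned. This is only a sketch on hypothetical toy data; the feature names btech_pct and backlogs are assumptions, not the project's actual column names.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [B.Tech %, number of backlogs] per student.
X = np.array([[55, 4], [60, 3], [62, 2], [68, 2],
              [72, 1], [75, 0], [81, 0], [88, 0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = placed, 0 = not placed

# Limit the depth so the learned rules stay small and readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# export_text renders the learned splits as nested if/else style rules.
rules = export_text(tree, feature_names=["btech_pct", "backlogs"])
print(rules)
```

Each printed line corresponds to one threshold test on a feature, and each leaf gives the predicted class for the students that reach it.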
Random Forest: Random forest is an ensemble learning method that combines multiple
decision trees to improve the accuracy and robustness of the predictions. In the case of
campus placement prediction, a random forest model could be trained using a dataset of
past placement records and relevant factors to predict the likelihood of placement for
new students.
Neural Networks: Neural networks are a class of machine learning algorithms inspired
by the structure and function of the human brain. They are particularly effective at
handling complex and non-linear relationships between variables. In the case of campus
placement prediction, a neural network model could be trained using a dataset of past
placement records and relevant factors to predict the likelihood of placement for new
students.
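To show how the four approaches above might be compared side by side, the sketch below trains each of them on the same synthetic dataset and reports accuracy on a held-out split. The data comes from scikit-learn's make_classification and is only a stand-in for past placement records; the hyperparameters are illustrative assumptions, not tuned values from this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for past placement records (not the report's dataset).
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Scale inputs for the models that are sensitive to feature scale.
candidates = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Neural Network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=42)),
}

test_accuracy = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)  # fraction correct
    print(f"{name:20s} accuracy = {test_accuracy[name]:.3f}")
```

The ranking obtained on synthetic data says nothing about the ranking on real placement records; the point is only that all four models share the same fit and score interface, so they can be swapped freely.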
CHAPTER 3
DESIGN FLOW/PROCESS
❖ DATA GATHERING
The data is gathered from the college database and the
placement database for different departments such as computer science
and engineering, electronics and communication, information science, civil
engineering, and mechanical engineering.
❖ Pre-processing
Data preprocessing is a technique that is used to convert raw data into a
clean dataset. The data gathered from different sources is in raw format
which is not feasible for the analysis.
Pre-processing for this approach takes 3 simple yet effective steps:
1. Separating the categorical and numerical columns
2. Removing unwanted columns
3. Handling null values
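The three steps above might look as follows with pandas. The records and column names here are hypothetical and only mirror the kinds of attributes mentioned in this report. In this sketch the third step only locates the null values; how the gaps are filled is covered under "Cleaning missing values".

```python
import numpy as np
import pandas as pd

# Hypothetical student records; values and column names are assumptions.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "roll_no": [101, 102, 103, 104],
    "gender": ["F", "M", "F", "M"],
    "credits": [160.0, 152.0, np.nan, 148.0],
    "backlogs": [0.0, 2.0, 1.0, np.nan],
    "btech_pct": [82.5, 64.0, 71.2, 58.9],
    "placed": ["Yes", "No", "Yes", "No"],
})

# Step 1: separate categorical and numerical columns by dtype.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Step 2: remove columns that carry no predictive information.
df = df.drop(columns=["name", "roll_no", "gender"])

# Step 3: locate the null values that still have to be handled.
null_counts = df.isna().sum()
print(numeric_cols, categorical_cols)
print(null_counts)
```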
❖ Attribute selection
Some of the attributes in the initial dataset that were not pertinent
(relevant) to the experiment goal were ignored. The initial dataset contains the
attributes name, roll no, credits, backlogs, whether placed or not, B.Tech %, and gender;
of these, name, roll no, and gender are not used. The main attributes used for this
study are credits, backlogs, whether placed or not, and B.Tech %.
❖ Cleaning missing values
In some cases the dataset contains missing values.
We need to be equipped to handle the problem when we come across
them. One option is to remove the entire row of data, but that risks
inadvertently discarding crucial information, so it is often better
avoided. One of the most common ways to handle
the problem is to take the mean of all the values in the same column and
use it to replace the missing data. The library used for the task is
scikit-learn. Its preprocessing module originally contained a class called Imputer for
this purpose; in current scikit-learn releases the equivalent class is
SimpleImputer in the sklearn.impute module.
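A minimal sketch of mean imputation is given below. It uses SimpleImputer from sklearn.impute, which replaced the older Imputer class in recent scikit-learn versions. The small array of values is made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical feature matrix with columns [credits, backlogs, B.Tech %];
# np.nan marks the missing entries.
X = np.array([[160.0, 0.0, 82.5],
              [152.0, 2.0, 64.0],
              [np.nan, 1.0, 71.2],
              [148.0, np.nan, 58.9]])

# Replace each missing value with the mean of the observed values
# in the same column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)
```

In a full workflow the imputer would normally be fitted on the training data only and then applied to the test data, so that no information leaks from the test set into the fill values.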
❖ Training and Test data
Splitting the Dataset into Training set and Test Set
The next step is to split our dataset into two parts: a training set and a test
set. We train our machine learning models on the training set, i.e.,
the models try to learn any correlations in
the training set, and then we test the models on the test set to
examine how accurately they predict. A general rule of thumb is to
assign 80% of the dataset to the training set and the remaining
20% to the test set.
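The 80/20 split can be sketched with scikit-learn's train_test_split. The data below is synthetic and serves only to show the call. The stratify argument, which keeps the class proportions similar in both parts, is an optional choice assumed here rather than something stated in this report.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in dataset: 200 students, 3 numeric features.
X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# test_size=0.20 keeps 80% of the rows for training and 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)

print(X_train.shape, X_test.shape)
```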
The algorithms chosen to classify the data were the Random Forest classifier
and the CatBoost algorithm. The data samples were used to predict
the placement class in which a student might be selected. The performance analysis
of the model was carried out with the help of metrics such as accuracy, recall, F1-score,
and precision. The performance was also examined using a plot of the ROC
(Receiver Operating Characteristic) curve and its AUC (Area under the Curve), which
shows the diagnostic ability of a binary classifier as its discrimination threshold is varied. The best
algorithm based on the performance metrics was chosen to predict the placement
class of students. Based on the details provided by the students, the
placement class can be predicted and the result displayed along with
suggestions for further improvement.
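The evaluation metrics named above can be computed as in the following sketch. It uses synthetic data and a Random Forest classifier only; CatBoost is left out because it is a separate third-party package, and nothing here reproduces the results of this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (placed / not placed stand-in).
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)

clf = RandomForestClassifier(n_estimators=100, random_state=1)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)               # hard class labels
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    # ROC AUC is computed from scores, not from hard labels.
    "roc_auc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name:10s} {value:.3f}")
```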
Implementation plan/methodology
CHAPTER 4
RESULT ANALYSIS AND VALIDATION
4.1 INTRODUCTION
This chapter reviews the results and analysis of the qualitative data, the compilation
of the questionnaire and the results and analysis of the quantitative findings of the
study. The findings are also discussed in the light of previous research findings and
available literature, where applicable, in order to identify similarities and differences
between this study and previous studies and literature. A comprehensive description
of the research methodology was given in Chapter 3.
4.2 RESULTS AND ANALYSIS OF THE QUALITATIVE DATA
4.2.1 Introduction
During the conceptual phase of this study, qualitative data was
collected. The first step involved personal and telephonic interviews in order to
investigate the development of a model using EDA (exploratory data analysis) for
effective manipulation of data.
Process outcome:
Checking for tightly correlated features
(mutual_info_classif, i.e. model dependence)
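A check of this kind could be sketched as follows, assuming that "Mutual classif" refers to scikit-learn's mutual_info_classif. The data is synthetic, with credits deliberately constructed to be almost collinear with btech_pct. The 0.9 correlation threshold is an arbitrary illustrative choice.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 300

# Synthetic features; "credits" is built to track "btech_pct" closely.
btech_pct = rng.uniform(50, 95, n)
credits = btech_pct * 1.5 + rng.normal(0, 1, n)
backlogs = rng.integers(0, 5, n).astype(float)
placed = (btech_pct > 70).astype(int)  # synthetic target

X = pd.DataFrame({"btech_pct": btech_pct, "credits": credits,
                  "backlogs": backlogs})

# Pairwise absolute correlation; report each pair above the threshold once.
corr = X.corr().abs()
pairs = [(a, b, corr.loc[a, b])
         for i, a in enumerate(corr.columns)
         for b in corr.columns[i + 1:]
         if corr.loc[a, b] > 0.9]
print("Tightly correlated pairs:", pairs)

# Mutual information between each feature and the target (higher = more
# dependence between the feature and the class label).
mi = pd.Series(mutual_info_classif(X, placed, random_state=0), index=X.columns)
print(mi.sort_values(ascending=False))
```

When two features are tightly correlated, one of them can usually be dropped with little loss of information; the mutual information scores help decide which features matter for the target.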
MODEL BUILDING:
DECISION TREE CLASSIFIER
4.2 RESULTS/CONCLUSIONS
Of the models tested, including Decision Tree, Random Forest, Logistic Regression,
and Neural Network, the Random Forest model gave the best results.
A campus placement prediction model using the Random Forest algorithm was developed and
evaluated. The Random Forest algorithm is a powerful machine learning technique that uses an
ensemble of decision trees to make predictions.
The dataset used for the model consisted of various features such as academic performance,
internships, communication skills, and personal attributes of students. The target variable was
whether a student was placed or not.
The Random Forest model was trained on a portion of the dataset and validated using cross-
validation techniques. The model achieved an accuracy of 93% on the validation set, indicating
its ability to accurately predict campus placements.
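For readers unfamiliar with cross-validation, the sketch below shows one common form, stratified 5-fold cross-validation of a Random Forest. The data is synthetic, and the number of folds and trees are assumptions, so the printed accuracy is unrelated to the 93% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the placement dataset.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           random_state=3)

# Each of the 5 folds serves once as the validation set while the
# remaining 4 folds are used for training.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=3)
forest = RandomForestClassifier(n_estimators=100, random_state=3)

fold_accuracy = cross_val_score(forest, X, y, cv=cv, scoring="accuracy")
print(f"Mean accuracy: {fold_accuracy.mean():.3f} "
      f"(std {fold_accuracy.std():.3f})")
```

Reporting the spread across folds alongside the mean gives a rough sense of how stable the accuracy estimate is.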
Through feature importance analysis provided by the Random Forest algorithm, it was identified
that academic performance, followed by internships and communication skills, were the most
influential factors in predicting campus placements. Other attributes such as personal skills and
extracurricular activities also contributed to the predictions but to a lesser extent.
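Feature importance analysis of this kind can be sketched with the feature_importances_ attribute of a fitted Random Forest. The feature names and the synthetic target below are assumptions constructed so that the academic score dominates; they do not reproduce the project's data or its ranking.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
n = 500

# Hypothetical student attributes (names are illustrative assumptions).
X = pd.DataFrame({
    "academic_score": rng.normal(70, 10, n),
    "internships": rng.integers(0, 4, n),
    "communication": rng.normal(6, 2, n),
    "extracurricular": rng.integers(0, 6, n),
})

# Synthetic target: driven mostly by academic_score, then internships and
# communication; extracurricular has no influence by construction.
signal = (0.2 * (X["academic_score"] - 70)
          + 0.5 * X["internships"]
          + 0.2 * (X["communication"] - 6)
          + rng.normal(0, 0.5, n))
y = (signal > signal.median()).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=7)
forest.fit(X, y)

# Impurity-based importances; they are normalised to sum to 1.
importances = pd.Series(forest.feature_importances_,
                        index=X.columns).sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favour features with many distinct values, so they are best read as a rough ranking rather than an exact measure of influence.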
The model's performance suggests that it can be a valuable tool for universities and students to
assess the likelihood of campus placements based on various factors. However, it is important to
note that the model's predictions should be used as a guide and not as a definitive guarantee of
placement outcomes.
In conclusion, the Random Forest model demonstrated strong predictive capabilities for campus
placement prediction. By considering important factors such as academic performance,
internships, and communication skills, the model can provide insights and help stakeholders
make informed decisions regarding campus placements. Further enhancements and validations
can be conducted to refine the model and improve its accuracy.
4.3 Future Scope:
Campus placement prediction is a field that has gained a lot of attention in recent years, thanks
to the increasing use of data analytics and machine learning in recruitment processes. With the
help of data analytics and machine learning algorithms, it is now possible to predict the
likelihood of a candidate getting placed in a particular company based on various factors such as
academic performance, skills, and personality traits. The future scope of campus placement
prediction is quite promising, as more and more companies are looking to streamline their
recruitment processes and make them more efficient. By using predictive analytics, companies
can save time and resources by focusing on the most promising candidates, thereby increasing
their chances of finding the right fit for their organization. In addition, campus placement
prediction can also help educational institutions in identifying areas where they need to focus on
improving their curriculum and training programs. By analyzing the performance of students
who are more likely to get placed, educators can identify the skills and knowledge that are most
in demand in the job market and tailor their programs accordingly. Overall, the future scope of
campus placement prediction is quite promising, and it is likely to become an increasingly
important tool in the recruitment and education industries.