Thesis
Bachelor of Technology
in
Information Technology
by
Institute Guide
Prof. Sonali Guhe
Ranked 2nd by ARIIA 2020, MHRD in Private or Self Finance Institutions, 5 Star Rating by
April 2023
Declaration
We hereby declare that the data preprocessing report titled “College Recommendation
System” submitted herein has been carried out by us towards partial fulfillment of the
requirements for the award of the Degree of Bachelor of Technology in Information Technology
Engineering. The work is original and has not been submitted earlier, in whole or in part, for
the award of any degree or diploma at this or any other Institution or University.
We also hereby assign to G H Raisoni College of Engineering, Nagpur all rights under
copyright that may exist in and to the above work and any revised or expanded derivative
works based on it. Other work copied from references, manuals, etc. is
disclaimed.
Certificate
Institute Guide
Professor Sonali Guhe
Department of Information Technology Engineering
G H R C E, Nagpur
Head
Department of Information Technology Engineering
G H R C E, Nagpur
Director
G H R C E, Nagpur
ACKNOWLEDGEMENT
Thank You.
LIST OF FIGURES
19 6.2 Training procedure of all the models
LIST OF TABLES
Sr. No. Table No. Name of Tables Page No.
1 Dataset of Training the Model
2 6.1 Splitting the dataset as per the 60-20-20 rule:
60% of the images are in the training set,
20% in the validation set, and 20% in the
test set.
INDEX
Sr. No. Content Page No.
1. Introduction
1.1 Preface
1.2 Brief Overview
2. Literature Review
3. Methodology
3.1 Comparative Study
3.1.1 VGG 19
3.1.2 ResNet 50v2
3.1.3 Inception v3
3.1.4 Densenet 121
3.2 Evaluation
3.3 Fine-Tuning The Best Performing Model
3.4 Deployment
4. Data Collection
4.1 NIH Chest X-ray Dataset
4.2 Overview of Dataset
4.3 NIH Chest X-ray Dataset Consists of 13 Common
Disease Categories
4.4 Dataset consists of the following data
5. Modelling and Implementation
5.1 Importing the required python libraries
5.2 Loading the dataset into the system memory for
processing
5.3 Processing the dataset
5.4 Analysing the data
5.5 Split the data into training and test sets.
5.7 Preparing the DenseNet Model, Loading the Weights
and Compiling It.
5.8 Implementing Checkpoint Settings and Starting the
Model Training.
8. Future Scope
9. References
10. Appendices
ABSTRACT
The process of choosing a college for higher education can be overwhelming and
challenging for students due to the plethora of options available. A college recommendation
system, powered by machine learning, can greatly assist students in making informed
decisions by providing personalized recommendations based on their preferences and
requirements.
This project aims to develop a college recommendation system that leverages machine
learning algorithms to analyze student data and recommend colleges that best match their
individual profiles. The system utilizes a variety of data, including academic performance,
extracurricular activities, interests, location preferences, and other relevant factors, to generate
accurate and customized recommendations.
To train and evaluate the recommendation system, a large dataset of student profiles
and college information is collected and preprocessed. Various machine learning algorithms,
including decision trees, support vector machines, and neural networks, are implemented and
compared to identify the most effective algorithm for the recommendation task. Feature
engineering techniques are also applied to enhance the system's performance and accuracy.
In summary, this project presents a college recommendation system that leverages
machine learning techniques to provide personalized and accurate recommendations to
students. The system utilizes a hybrid approach, combining collaborative filtering and content-
based filtering, and is trained and evaluated on a large dataset of student profiles and college
information. The system offers significant benefits to students, educational institutions, and
the broader educational ecosystem, and has the potential to positively impact the college
selection process for students worldwide. Further research and development can be
undertaken to continuously enhance the accuracy and effectiveness of the recommendation
system and extend its applicability to other domains in the educational landscape.
CHAPTER 1
INTRODUCTION
1.1 Preface:
In recent years, the landscape of higher education has undergone a significant
transformation, with a surge in demand for college and university admissions. Students today
are faced with an overwhelming number of choices when it comes to selecting a college or
university, making the decision-making process more complex than ever. In this era of
information overload, students need reliable and data-driven tools to assist them in making
informed decisions about their higher education options. This preface highlights the need for a
robust college recommendation system that leverages data preprocessing techniques to
provide accurate and personalized recommendations to students, ultimately aiding them in
making the right college choice.
The process of selecting a college or university has become increasingly complex and
challenging in recent years. With the expansion of higher education options and the
availability of vast amounts of information online, students are often overwhelmed by the
sheer volume of choices and the difficulty of evaluating various factors that influence their
decision. Factors such as location, campus facilities, rankings, courses offered, admission
requirements, and feedback from alumni are just a few of the many considerations that
students must weigh when making this important decision. However, the traditional methods
of gathering and analyzing this information, such as relying on rankings or recommendations
from peers, may not accurately reflect the individual needs and preferences of students,
leading to suboptimal choices and potential challenges in their academic journey.
In this context, the use of machine learning techniques and data preprocessing
approaches can provide a powerful solution to assist students in making informed decisions
about their college choices. By leveraging advanced data preprocessing techniques, such as
data cleansing, data transformation, feature engineering, and data integration, a college
recommendation system can effectively analyze and process vast amounts of data to generate
personalized recommendations tailored to the unique preferences and profiles of students.
Data preprocessing plays a crucial role in ensuring that the data used for generating
recommendations is of high quality, reliable, and relevant, and that it is processed in a manner
that reduces biases and promotes fairness.
This preface establishes why data preprocessing matters for a college recommendation
system. It outlines the challenges students face when selecting a college, the limitations of
traditional methods in providing accurate and personalized recommendations, and the
potential of machine learning and data preprocessing techniques to overcome those
limitations. The sections that follow describe the project's objectives, methodology, and
expected outcomes.
1.2 Brief Overview:
The data preprocessing phase of the project will involve several key techniques. Data
cleaning will be performed to identify and rectify inconsistencies, errors, and missing values
in the collected data. This may include handling outliers, dealing with duplicate or conflicting
data, and filling in missing values using appropriate imputation methods. Data transformation
will involve converting data into a consistent format, standardizing units, and normalizing
data for effective analysis. This may include converting categorical data into numerical
representations, scaling numerical data to a common range, and applying feature engineering
techniques to create new meaningful features from the raw data. Data integration will involve
combining data from multiple sources and creating a unified and comprehensive dataset for
analysis. This may include matching and merging data from different sources, resolving data
conflicts, and ensuring data consistency and integrity.
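The cleaning and transformation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the record fields (`name`, `score`, `city`), the sample values, and all function names are hypothetical, and only duplicate removal, mean imputation, min-max scaling, and one-hot encoding are shown (data integration is omitted).

```python
from statistics import mean

# Hypothetical raw records; field names and values are illustrative only.
raw_students = [
    {"name": "A", "score": 82.0, "city": "Nagpur"},
    {"name": "B", "score": None, "city": "Pune"},      # missing score
    {"name": "A", "score": 82.0, "city": "Nagpur"},    # exact duplicate
    {"name": "C", "score": 64.0, "city": "Nagpur"},
]


def drop_duplicates(rows):
    """Data cleaning: keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_mean(rows, field):
    """Data cleaning: replace missing values of `field` with the column mean."""
    fill = mean(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def min_max_scale(rows, field):
    """Data transformation: rescale `field` to the range [0, 1]."""
    values = [r[field] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    return [{**r, field: (r[field] - lo) / span} for r in rows]


def one_hot(rows, field):
    """Data transformation: encode a categorical field as 0/1 indicator columns."""
    categories = sorted({r[field] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != field}
        for c in categories:
            new_row[f"{field}_{c}"] = 1 if r[field] == c else 0
        encoded.append(new_row)
    return encoded


def preprocess(rows):
    """Apply the cleaning and transformation steps in sequence."""
    rows = drop_duplicates(rows)
    rows = impute_mean(rows, "score")
    rows = min_max_scale(rows, "score")
    return one_hot(rows, "city")


clean_students = preprocess(raw_students)
```

Mean imputation and min-max scaling are chosen here only because they are the simplest instances of the techniques named in the text; other imputation or scaling methods may suit real data better.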
Ethical considerations will also be a key focus of the project. Data privacy and security
will be carefully addressed, and appropriate measures will be implemented to protect sensitive
information and comply with relevant data protection regulations. Bias in data and
recommendations will be thoroughly addressed, and efforts will be made to ensure that the
system is fair, transparent, and unbiased. The project will strive to provide equal opportunities
and access to all students, regardless of their background or characteristics, and promote
diversity, equity, and inclusion in the college selection process.
The outcomes of this project are expected to have significant benefits for various
stakeholders. Students will have access to a reliable and data-driven tool that can provide them
with personalized recommendations, reducing the complexity and stress of the college
selection process. Educational institutions can benefit from increased visibility and improved
student engagement, as the system can provide insights into the preferences and requirements
of prospective students, helping them tailor their offerings to attract the right students.
Policymakers can also utilize the findings of this project to gain insights into the factors that
influence students' college choices and make informed decisions about education policies. The
project can also contribute to the field of data preprocessing by applying and advancing state-
of-the-art techniques in handling large and complex datasets, addressing data quality issues,
and integrating data from diverse sources.
CHAPTER 2
LITERATURE REVIEW
The college selection process has become increasingly complex and challenging for
students due to the expansion of higher education options and the availability of vast amounts
of information online. In recent years, researchers have been exploring the use of machine
learning techniques to develop college recommendation systems that can assist students in
making informed decisions. Several studies have emphasized the importance of data
preprocessing in college recommendation systems. This involves cleaning, transforming, and
integrating data from multiple sources to improve system performance. Various techniques,
such as data normalization, feature selection, and data imputation, are used for effective data
preprocessing.
The study by Sheetal Girase, Varsha Naik, and Debajyoti Mukhopadhyay proposes a
user-friendly college recommendation system that uses user profiling and matrix factorization.
User preferences are collected and combined with demographic information to create user
profiles. Matrix factorization, specifically Singular Value Decomposition (SVD), is applied to
uncover latent factors in the user-item preference matrix. The system generates
recommendations based on similarity between user profiles and college profiles. The study
finds that the proposed system is effective in providing personalized recommendations and
outperforms other methods in terms of accuracy and relevance. The contributions of the study
include a user-friendly system that addresses limitations of traditional methods and highlights
the importance of data preprocessing. The study is aligned with existing literature on college
recommendation systems that use similar approaches.
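To make the idea of latent factors concrete, the sketch below computes a rank-1 truncated SVD of a small preference matrix by power iteration. It is an illustration only and is not taken from the cited study: the matrix values and function names are hypothetical, a fully observed non-negative matrix is assumed, and a practical system would keep more than one factor and handle missing ratings.

```python
import math


def top_singular_triplet(matrix, iterations=100):
    """Approximate the largest singular value and its vectors by power iteration.

    Assumes a non-zero matrix given as a list of equal-length rows.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # initial guess for the item factor
    for _ in range(iterations):
        # u = A v, then w = A^T u; normalising w gives the next estimate of v.
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("matrix must be non-zero")
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v


def rank_one_scores(matrix):
    """Reconstruct the matrix from its dominant latent factor: sigma * u * v^T."""
    sigma, u, v = top_singular_triplet(matrix)
    return [[sigma * ui * vj for vj in v] for ui in u]


# Hypothetical user-by-college preference matrix (rows: users, columns: colleges).
preferences = [
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
]
smoothed = rank_one_scores(preferences)
```

Each entry of `smoothed` is the preference predicted by the single strongest latent factor; keeping further factors would bring the reconstruction closer to the original matrix.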
limitations in the availability and accuracy of data, which should be considered in
implementing such systems.
The paper titled "The Institute Recommendation System Using Machine Learning" by
Asmita Orse, Nikhil Suryawanshi, Harsh Shrivastav, Pratik Bajpai, and Prof. Megha Patil,
published in 2022 with ISSN 2321-9653, presents a recommendation system for institutes
using machine learning techniques. The paper discusses the motivation behind the study, the
challenges associated with manual institute selection processes, and the potential benefits of
automated recommendation systems. The authors review existing research on
recommendation systems and machine learning, discussing different approaches and their
strengths and limitations. They emphasize the importance of data preprocessing and describe
the methodology of their proposed system, including data collection, preprocessing, feature
extraction, and model training. The authors evaluate the performance of their system using
metrics such as accuracy, precision, recall, and F1 score, and present their findings,
showcasing the effectiveness of their recommendation system. They compare their system
with other approaches, discuss strengths and limitations, and suggest potential areas of
improvement and future research directions.
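The evaluation metrics named above can be computed from a list of recommended colleges and a list of colleges that are actually relevant to the student. The sketch below is a generic illustration and does not reproduce the cited authors' evaluation; the college labels and the `evaluate` function are hypothetical, and both lists are assumed to be subsets of the catalogue.

```python
def evaluate(recommended, relevant, catalogue):
    """Return accuracy, precision, recall, and F1 for one recommendation list."""
    recommended, relevant, catalogue = set(recommended), set(relevant), set(catalogue)
    tp = len(recommended & relevant)              # recommended and relevant
    fp = len(recommended - relevant)              # recommended but not relevant
    fn = len(relevant - recommended)              # relevant but missed
    tn = len(catalogue - recommended - relevant)  # correctly left out

    accuracy = (tp + tn) / len(catalogue) if catalogue else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical catalogue of eight colleges labelled A to H.
metrics = evaluate(
    recommended=["A", "B", "C", "D"],
    relevant=["B", "D", "E"],
    catalogue=["A", "B", "C", "D", "E", "F", "G", "H"],
)
```

In this example two of the four recommendations are relevant, so precision is 0.5, recall is 2/3, and F1 is 4/7.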
Udhayakumar and Harisai proposed a college recommendation system that uses data
mining techniques, specifically the collaborative filtering algorithm-2 (CF2), to generate
personalized recommendations for students. The study extensively analyzes CF2 and
compares it with other collaborative filtering algorithms. The findings suggest that CF2
outperforms other methods in terms of recommendation accuracy. The study also addresses
challenges such as data sparsity and scalability, proposing techniques like active learning and
cluster-based recommendation to overcome them.
Vinit Jain, Mohak Gupta, Jenish Kevadia, and Prof. Krishnanjalin Shinde (2017) propose a
college recommendation system that uses a content-based filtering approach, leveraging
features of colleges such as location, program offerings, accreditation, and faculty
expertise. The system includes a user interface
for students to input their preferences, and personalized recommendations are generated based
on calculated similarity scores. The study provides insights into the development of content-
based filtering for college recommendation, considering specific preferences and requirements
of students in the college selection process.
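Content-based filtering of this kind can be sketched by representing each college and the student's preferences as feature vectors and ranking colleges by a similarity score. The example below assumes cosine similarity and four hypothetical binary features; the cited study's actual features and similarity measure are not specified here, and all names and values are illustrative.

```python
import math

# Feature order (hypothetical): [in preferred city, offers IT,
#                                offers Mechanical, accredited]
colleges = {
    "College A": [1, 1, 0, 1],
    "College B": [0, 1, 1, 1],
    "College C": [0, 0, 1, 0],
}

# The student's stated preferences, encoded with the same feature order.
student = [1, 1, 0, 1]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(preferences, catalogue, top_n=2):
    """Rank colleges by similarity to the preference vector, best first."""
    scored = [(name, cosine(preferences, features))
              for name, features in catalogue.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


top_colleges = recommend(student, colleges)
```

Here College A matches the preference vector exactly and ranks first, followed by College B, which shares two of the three preferred features.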
improvement and further research.
The study by Yara Zayed, Yasmeen Salman, and Ahmad Hasasneh (2022) proposes a
recommendation system that utilizes graduate student data to generate recommendations for
undergraduate programs at higher education institutions. The system leverages machine
learning techniques, specifically collaborative filtering, to provide personalized
recommendations based on the similarity between users' preferences and graduate students'
data. The study is unique in its use of graduate student data and builds on prior research in the
field of recommendation systems and higher education. The authors acknowledge the
limitations of using graduate student data and highlight the potential bias in the data. They
also suggest exploring other techniques such as content-based filtering and hybrid approaches
to enhance recommendation accuracy and relevance. The study contributes to the field by
proposing an innovative approach that can assist prospective undergraduate students in
making informed decisions about their academic and career paths.
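A user-based collaborative filtering step of the kind described above can be sketched as follows. This is a generic illustration under stated assumptions, not the cited authors' method: the graduate ratings, the applicant's ratings, and the program names are hypothetical, and similarity is taken to be cosine similarity over the programs two people have both rated.

```python
import math

# Hypothetical ratings (1 to 5) that graduates gave to programs.
graduates = {
    "g1": {"CS": 5, "IT": 4, "Mech": 1},
    "g2": {"CS": 4, "IT": 5, "Civil": 2},
    "g3": {"CS": 1, "Mech": 5, "Civil": 4},
}

# Hypothetical applicant who has rated only two programs so far.
applicant = {"CS": 5, "Mech": 1}


def similarity(a, b):
    """Cosine similarity computed over the programs rated by both people."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(a[k] ** 2 for k in shared))
    norm_b = math.sqrt(sum(b[k] ** 2 for k in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(target, others, program):
    """Similarity-weighted average of the ratings others gave to `program`."""
    numerator = denominator = 0.0
    for ratings in others.values():
        if program in ratings:
            sim = similarity(target, ratings)
            numerator += sim * ratings[program]
            denominator += abs(sim)
    return numerator / denominator if denominator else None


def recommend(target, others, top_n=1):
    """Rank programs the target has not rated by predicted rating, best first."""
    unseen = {p for ratings in others.values() for p in ratings} - set(target)
    scored = [(p, predict(target, others, p)) for p in unseen]
    scored = [(p, s) for p, s in scored if s is not None]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


suggestions = recommend(applicant, graduates, top_n=2)
```

Graduates whose ratings resemble the applicant's contribute more to each prediction, so IT, which the two most similar graduates rated highly, ranks above Civil. Similarity over a single shared program is always 1 under this measure, which is one reason practical systems often require a minimum overlap.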
The College Recommendation System is designed to assist students, parents, and other
entities in searching for top engineering colleges. By leveraging recommender system
techniques, this system aims to reduce the time and effort required for students to search and
compare colleges. The system utilizes data mining and machine learning algorithms to provide
personalized recommendations based on the data provided by users.
The system takes into consideration the information provided by students, such as their
preferences, location, and academic performance, and filters through the data to generate a list
of recommended colleges. This helps students in shortlisting colleges that align with their
preferences and requirements, making the college search process more efficient and effective.
CHAPTER 7
CONCLUSION
The College Recommendation System is a user-friendly web program that has the
potential to significantly reduce the workload of students and simplify the process of selecting
the right college. Through extensive data preprocessing and the utilization of advanced
machine learning algorithms, the system offers personalized recommendations based on the
student's preferences, assisting them in making informed decisions and narrowing down their
choices of colleges.
The system serves as a valuable guide for learners, showcasing the potential of data
preprocessing and machine learning techniques in enhancing the decision-making process. It
aims to provide accurate and relevant recommendations, based on the student's academic,
career, and personal preferences, thus empowering students to make well-informed choices
about their higher education.
Furthermore, the College Recommendation System has a promising future scope for
improvement and expansion. By incorporating more advanced data preprocessing techniques,
exploring additional machine learning algorithms, and incorporating real-time data, the system
can further enhance the accuracy and relevance of the recommendations. Additionally, the
system can be expanded to include other streams and disciplines beyond engineering, making
it a comprehensive tool for a wider range of students.
The system can positively affect students' lives by guiding them towards colleges that
align with their academic, career, and personal aspirations. Its intuitive user interface,
interactive features, and comprehensive database of colleges make it a valuable tool in the
college search. By helping students select colleges that fit their interests, goals, and
preferences, it may also contribute to better college admission outcomes.
CHAPTER 8
FUTURE SCOPE
The future scope of the college recommendation system holds immense potential for
advancements in data preprocessing and machine learning techniques. By incorporating multi-
dimensional recommendations, adaptive and contextual recommendations, social and peer
recommendations, predictive analytics, enhanced personalization, integration with virtual and
augmented reality, continuous monitoring and feedback, and expansion to global
recommendations, the system can evolve into a powerful tool for assisting students in making
informed decisions about their college choices.
Adaptive and Contextual Recommendations: The system could further evolve to provide
adaptive and contextual recommendations based on the changing needs and preferences of
students. For example, it could adapt recommendations based on the latest trends in the job
market, emerging technologies, or global events, ensuring that students receive up-to-date and
relevant recommendations for their future career prospects.
Social and Peer Recommendations: The system could incorporate social and peer
recommendations, leveraging social networks or student communities to provide
recommendations based on the experiences and feedback of current students or alumni. This
could provide valuable insights and perspectives from individuals who have firsthand
knowledge of the colleges, helping students make more informed decisions.
Predictive Analytics: With advancements in machine learning and predictive analytics, the
system could analyze historical data on student admissions, academic performance, and career
outcomes to predict the likelihood of a student's success in a particular college or program.
This could assist students in making data-driven decisions and selecting colleges that align
with their long-term career goals.
Integration with Virtual and Augmented Reality: As virtual and augmented reality
technologies continue to advance, the system could potentially integrate with these
technologies to provide virtual tours, simulations, or immersive experiences of college
campuses. This could allow students to virtually explore colleges and get a better sense of the
campus environment, facilities, and culture, helping them make more informed decisions.
Continuous Monitoring and Feedback: The system could incorporate continuous
monitoring and feedback mechanisms to collect data on student outcomes, satisfaction, and
feedback on recommended colleges. This could help in evaluating the effectiveness of the
system and making regular updates and improvements to ensure its accuracy and relevance for
students.
Expansion to Global Recommendations: The system could expand its scope beyond local
colleges and include recommendations for international colleges and universities. This could
be particularly beneficial for students interested in pursuing higher education abroad,
providing them with relevant recommendations based on factors such as location, reputation,
accreditation, and cost.
These advancements have the potential to greatly enhance the accuracy, relevance, and
effectiveness of the system, ultimately benefiting students by providing them with tailored
recommendations that align with their academic, career, and personal aspirations. As
technology continues to evolve, the college recommendation system can play a pivotal role in
helping students navigate the complex landscape of college selection and maximize their
chances of finding the right college fit for their future success.
CHAPTER 9
REFERENCES
1] Sheetal Girase, Varsha Naik, Debajyoti Mukhopadhyay, “A User-friendly College
Recommending System using User-profiling and Matrix Factorization Technique”,
DOI: 10.1109/CCAA.2017.8229779, 2017.
2] Akshar Panchal, Kushal Gosar, Homil Parmar, Rohini Nair, “College
Recommendation System using Data Mining and Natural Language Processing”, 2018.
3] Asmita Orse, Nikhil Suryawanshi, Harsh Shrivastav, Pratik Bajpai, Prof. Megha Patil,
“Institute Recommendation System Using ML”, ISSN: 2321-9653, 2022.
4] Mr. Y. Subba Reddy, Prof. P. Govindarajulu, “College Recommender system using
student’ preferences/voting: A system development with empirical study”, 2018.
5] Udhayakumar S, Harisai V, “College Recommendation System For Students Using
Datamining With Collaborative Filtering Algorithm–2”, ISSN: 2582-5208, 2021.
6] Vinit Jain, Mohak Gupta, Jenish Kevadia, Prof. Krishnanjalin Shinde, “College
Recommendation System”, ISSN: 2278-0181, 2017.
7] Jadhav Sayali Pramod, Patil Durga Ananda, Kamble Sonali Arun, Shinde Sumit
Sahebrao, “College Recommendation System using Machine Learning”,
ISSN: 2395-0056, Mar 2020.
8] Mehdi Elahi, Alain Starke, Nabil El loini, Anna Alexander Lambrix, “Developing and
Evaluating a University Recommender System”, 2022.
9] Vidish Sharma, Tarun Trehan, Rahul Chanana, Suma Dawn, “StudieMe: College
Recommendation System”, DOI: 10.1109/RDCAPE47089.2019.8979030, 2019.
10] Yara Zayed, Yasmeen Salman, Ahmad Hasasneh, “A Recommendation System for
Selecting the Appropriate Undergraduate Program at Higher Education Institutions
Using Graduate Student Data”, https://doi.org/10.3390/app122412525, 2022.
CHAPTER 10
APPENDICES