ITM Web of Conferences 44, 02002 (2022) [Link]
1051/itmconf/20224402002
ICACC-2022
Job Recommendation System Using Hybrid Filtering
Aneesh Mulay¹, Shriyash Sutar¹, Jiten Patel¹, Aditi Chhabria¹, Snehal Mumbaikar¹
¹ Department of Computer Science and Engineering, Ramrao Adik Institute of Technology, DY Patil Deemed to be University, Nerul, Navi Mumbai, India
Abstract—In today's era, recruitment can be considered one of the most difficult processes for a job-seeking candidate to undergo. Many fresher candidates have difficulty during the recruitment process in deciding which field of interest to pursue. The proposed system will help the user overcome these difficulties by matching their work experience, skills and other details with appropriate companies suitable for the respective user. The system will also help experienced users get their intended job on the basis of their last job profile. The job recommendation algorithm developed is neither tedious nor complicated and uses a user-friendly approach to implement job [Link] proposed system consists of a user dataset with various attributes and a company dataset with company details. The profile matching of a user with the respective companies can be done using various recommendation algorithms such as content-based, collaborative and hybrid filtering. Since the content-based and collaborative approaches each have their own disadvantages, we implement hybrid filtering, which overcomes the disadvantages of both. The user can expect a reliable recommendation from our model. The project focuses on developing the job recommendation system using hybrid filtering. Because recruitment is one of the most difficult processes for a job-seeking candidate, our job recommendation system comes into the picture: it is neither tedious nor complicated, uses a user-friendly approach and helps the user to accomplish the task [Link] project will also focus on developing an Android application, which will add a better user interface. The Android application will be user friendly, and the user just has to fill in basic details such as past years of experience, projects, internships, etc. The rest of the work of recommending jobs to the user is done by the recommendation model of this project.

Index Terms—recommendations, content-based, similarity, jobs

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 ([Link]

I. INTRODUCTION

Nowadays, with the rapid growth of Internet technology, job seekers are releasing their own personal information while enterprises are continuously posting jobs on the Internet. Because of this, there is a dramatic increase in the availability of job seekers' personal information and the recruiting information of various enterprises. Thus, the amount of such data keeps increasing, but compared with the rate at which the data grows, there is not much increase in the utilization rate of this data or these resources.

Given access to such a huge amount of data with high veracity, an individual on their own may not be able to utilize this data efficiently. This paper introduces a Job Recommendation System that recommends jobs to users based on the vast amount of information provided by the users and the huge amount of data regarding jobs that is available via various Internet resources. The paper proposes using three types of filtering for providing recommendations: Content-Based Filtering, Collaborative Filtering and Hybrid Filtering. Since our job recommendation system uses multiple recommendation algorithms, the disadvantages or lack of efficiency of one algorithm are covered by another algorithm, resulting in highly efficient recommendations.

The major objective of the paper is to build a model to recommend a job using a hybrid recommendation system, which is the combination of content-based filtering and collaborative filtering [Link] main motto is to make job search easy for users.

This recommendation depends on the user's past experiences as well as data from users with a similar approach. The recommendation model makes it easy for users to get recommendations of various job profiles on the basis of their past experiences, projects, internships, skills, etc. The model will also help experienced employees by recommending various job profiles based on their experience and skill-based performance. The main motivation is job recommendation for freshers, as some students may get confused over various job profiles.

The system considers not only the experience of an individual but also the skills and projects developed, to make the job recommendation more assuring from the user's point of view. Hence, the user will not have any uncertainty regarding the job postings recommended by our model.

Nowadays an enormous amount of data is available on the internet and Internet users can receive a huge amount of information. If the volume or variety of data increases tremendously, the individual user faces the problem of excessive information, which can make it hard to make correct decisions.

People are often confused about what roles they fit or where they should start their job search, especially younger people or graduates who are searching for their first job. For example, when a company downsizes, it usually lets go of the people with less experience. Such people with less experience in a particular field can face the problem of where they should start again or what role they are supposed to fit in a particular company. This issue is quite common for people with non-technical backgrounds, e.g.
salespeople, marketing staff, etc.

To resolve such problems, the Job Recommendation System comes in the [Link] Job Recommendation system can solve various problems by effectively finding the user's probable requirements and selecting interesting items from a vast amount of applicant information.

II. LITERATURE REVIEW

Pradeep Kumar Singh, Pijush Kanti Dutta Pramanik, Avick Kumar Dey and Prasenjit Choudhury [1] provide a comprehensive study of recommender systems (RS), covering the different recommendation approaches, associated issues, and techniques used for information retrieval.

The paper in [2] gives an overview of increasing data volumes and explains the difficulties users face in accessing useful recommendation information. It depicts the use of the user's requirements and factors such as location, music and shopping in a recommendation system for giving possible recommendations.

Ravita Mishra and Sheetal Vikram Rathi [3] explain three types of recommendation algorithms, i.e. Collaborative Filtering, Content-Based Filtering and Hybrid Filtering, and discuss the advantages and disadvantages of these algorithms.

Tanya V. Yadalam, Vaishnavi M. Govda, Vandhita Shiva Kumar and Disha Girish [4] explain different methods such as Natural Language Processing, Cosine Similarity, Content-Based Filtering, etc. The paper depicts the use of Natural Language Processing for sentiment analysis on user feedback and also introduces an encryption technique to handle data in a more secure manner.

Marwa Hussien Mohamed, Mohamed Helmy Khafagy and Mohamed Hasan Ibrahim [5] explain Content-Based Filtering and Collaborative Filtering. The paper discusses various methods such as classification of data, cluster analysis, outlier detection, regression analysis and association analysis.

Greg Linden, Brent Smith and Jeremy York [6] give a brief summary of customer-to-customer and item-to-item recommendations for [Link]

Kunal Shah, Akshaykumar Salunke, Saurabh Dongare and Kisandas Antala [7] present an overview of the field of recommender systems and describe the present generation of recommendation methods.

Gomez-Uribe and Hunt [9] introduce a business model while also building a recommendation system. The paper also introduces the use of A/B tests in recommendation systems and mentions some issues in designing and interpreting A/B tests.

Thiengburanathum P, Cang S and Yu H [10] aim to build a destination recommendation system. The paper introduces the use of various algorithms to build the recommendation model, with emphasis on weighted hybrid and cascade hybrid methods.

Bell, R., Koren, Y. and Volinsky, C. [11] propose predicting item ratings using a neighbourhood technique. The method considers using SVD to predict various user-item ratings based on the neighbourhood user-item rating dataset. The paper also highlights the use of imputation to fill in matrix entries from other algorithms.

Karim, J. [12] developed a Hybrid Recommender System using Collaborative Filtering and Knowledge-Intensive Case-Based Reasoning.

E K Subramanian and Ramachandran [13] proposed a Career Recommender System for students based on their performance and the marks obtained by them in various subjects.

Ivens Portugal, Paulo Alencar and Donald Cowan [14] provide detailed information on different types of recommendation systems and the various machine learning algorithms associated with them. The paper also reports the results of these systems and compares them using performance measures such as precision, recall and F-measure.

III. TYPES OF RECOMMENDATION TECHNIQUES

A. Content-based Recommendation System

In a Content-based Recommendation System, the final recommendations are generated based on the user's profile data. The system provides suggestions based on the user's similarity with the items. The concept of Term Frequency-Inverse Document Frequency (TFIDF) is mainly used in information retrieval and content-based recommendation systems. TFIDF weights each word by its frequency in a document, discounted by how common the word is across all documents.

1) Limitation of Content-based Recommendation System:
• The sparsity problem is the situation in which insufficient data is present in the dataset.
• Generally, the content-based approach faces the sparsity problem, which means this method limits the recommendations to user specifics only.
• As the method only uses the user-related data, the dataset is insufficient because it does not include ratings given by other users.
• The recommendation engine developed will not recommend anything outside the user's interest.
• Hence this approach only helps to recommend results based on the user's own interest and not based on other users' preferences.

B. Collaborative Filtering

In Collaborative Filtering, historical data of users is used to make the recommendations. Based on the explicit ratings given by the users, the user-to-user similarity is calculated and then the corresponding items are recommended to the users.

1) Memory-based Collaborative Filtering:
• The idea behind memory-based collaborative filtering is to compute the similarity between different users based on their historical data.
• The approach works on ratings given by different users and then finally recommends similar jobs to the users.

2) Model-based Collaborative Filtering:
• In model-based CF, SVD (Singular Value Decomposition), a machine learning algorithm, is used to predict the user's ratings on unrated items.
• In this technique, various algorithms can be applied, but the most common and suitable approach is a matrix factorization model that applies SVD and reconstructs the rating matrix.
• Finally, top recommendations for particular users are produced based on their predicted ratings.

3) Limitation of Collaborative Filtering:
• The recommendation system experiences the cold start problem as it does not have any relevant past data of the [Link] cold start problem is experienced in the case of a new user. When a new user registers with the system, the proposed model does not know about their interests and the user has not yet rated any of the existing companies. In this scenario the system won't be able to recommend anything to the user.
• This approach uses a larger amount of data, consisting of the different perspectives of different users.
• There is also a sparsity problem, which arises from undefined similarity between different users.

Fig. 1. Content-Based Filtering and Collaborative Filtering Recommendations

C. Hybrid Filtering using Weighted Average Technique

As explained in the above two approaches, both collaborative and content-based filtering techniques have their limitations. To resolve this, hybrid filtering techniques are used, which combine the two approaches mentioned above. In hybrid filtering using the weighted average technique, a weighted score is calculated from the final recommendations of both collaborative and content-based recommendations. Hybrid filtering helps in analyzing the results of recommendation systems when combined and when each recommendation system works alone.

IV. SYSTEM ARCHITECTURE

Initially, to perform the content-based filtering approach, we need a dataset of companies. Hence we scraped the data from ambitionbox to get the companies dataset. After extraction, TFIDF vectorization is applied to the respective company dataset. Using the word frequencies, the cosine similarity between each pair of company attributes is calculated and the similarity matrix is computed. Based on the similarity matrix, Top-n content-based recommendations are calculated.

For Collaborative Filtering, the paper follows two approaches: a memory-based approach, in which user-user rating-based similarity is calculated, and a model-based approach, which consists of deep learning techniques and matrix factorization. This paper shows the implementation using Singular Value Decomposition (SVD) to predict user ratings on unrated jobs. Both of these approaches provide Top-n recommendations.

The results from the above approaches are combined using the weighted average technique to form a Hybrid Recommendation System, which in turn provides the Top-n recommendations.

Fig. 2. System Architecture

V. EQUATIONS

A. Cosine Similarity

Cosine similarity is used to calculate the degree of similarity between two vectors in n-dimensional space. It is widely used in information retrieval.

$$ sim(a, b) = \frac{a \cdot b}{|a| \, |b|} \tag{1} $$

B. Weighted Average

The weighted hybrid technique combines the results of both content-based and collaborative filtering techniques for comparison. This technique allows the results to be compared when both approaches are combined and when each approach works alone.

$$ W = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i} \tag{2} $$

C. Root Mean Square Error (RMSE)

Root Mean Square Error is the square root of the mean of the squared errors. The use of RMSE is very common, and it is considered an excellent general-purpose error metric for numerical predictions.

$$ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{d_i - f_i}{\sigma_i} \right)^2} \tag{3} $$

D. Mean Squared Error (MSE)

It measures the average of the squares of the errors, that is, the average squared difference between the estimated values and the actual values.

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - y'_i)^2 \tag{4} $$
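The formulas in Eqs. (1) to (4) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example vectors and the weights 0.4 and 0.6 are hypothetical, and RMSE is computed in its common unnormalized form (every σ_i taken as 1).

```python
import math


def cosine_similarity(a, b):
    """Eq. (1): dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: similarity with a zero vector is treated as 0
    return dot / (norm_a * norm_b)


def weighted_average(values, weights):
    """Eq. (2): sum of w_i * X_i divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def mse(actual, predicted):
    """Eq. (4): mean of the squared differences."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Eq. (3) with every sigma_i assumed to be 1: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical TFIDF-style vectors for two companies.
sim = cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0])

# Hypothetical hybrid score: collaborative score 0.8 and content score 0.2,
# combined with illustrative weights 0.4 and 0.6.
hybrid = weighted_average([0.8, 0.2], [0.4, 0.6])

# Hypothetical actual and predicted ratings.
err_mse = mse([4, 5, 3], [3, 5, 5])
err_rmse = rmse([4, 5, 3], [3, 5, 5])
```

Dividing by the sum of the weights in `weighted_average` means the weights do not have to add up to 1, which matches the normalisation in Eq. (2).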
E. Mean Absolute Error (MAE)

Mean absolute error (MAE) is a measure of errors between paired observations expressing the same phenomenon.

$$ MAE = \frac{\sum_{i=1}^{n} |y_i - x_i|}{n} \tag{5} $$

F. Precision

Precision is calculated as the number of true positives divided by the sum of true positives and false positives. Precision is also known as the positive predictive value.

$$ Precision = \frac{TP}{TP + FP} \tag{6} $$

G. Recall

Recall is calculated by dividing the number of true positives by the sum of true positives and false negatives. Recall is the fraction of relevant instances that were retrieved.

$$ Recall = \frac{TP}{TP + FN} \tag{7} $$

H. F1 Measure

F1-measure is a measure of a test's accuracy. It is calculated from the precision and recall of the test, where the precision is the number of true positive results divided by the number of all positive results, including those not identified correctly, and the recall is the number of true positive results divided by the number of all samples that should have been identified as positive.

$$ F_1 = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \tag{8} $$

VI. RESULTS

A. Rating Dataset:

ID    userId    companyId    rating
0     1         1            4
1     1         3            4
2     1         6            4
3     1         47           5
4     1         50           5
5     1         70           3
6     1         101          5
7     1         110          4
8     1         151          5
9     1         157          5
10    1         183          5

B. Jobs Dataset:

Fig. 3. Jobs Dataset 1.2

C. Final Results:

Company ID    CollabRating    ContentRating    WeightedAvg
2593          0.2796          0.000            0.111
4690          0.2261          0.004            0.092
1612          0.2230          0.000            0.089
645           0.2295          0.000            0.355
506           0.3884          0.333            0.088
2283          0.2286          0.00             0.097
3655          0.2224          0.014            0.083
4428          0.2099          0.000            0.364
908           0.3674          0.333            0.08
750           0.2003          0.004            0.20

D. Evaluation with threshold

1) Content-Based Evaluation: For the evaluation phase of this study, we used a threshold-based approach to evaluate the relevance of the recommendations. First we select multiple random users and iterate over them. In each iteration we calculate the user's precision, recall and F-measure for a range of thresholds. To do that, we first calculate the threshold score. Then, with the help of this threshold score, we calculate the number of true positives, false positives and false negatives. Recommendations whose similarity scores are greater than the threshold score are considered true positives and those below it are considered false positives. Recommendations whose similarity scores are equal to zero are considered false negatives. With the help of these values, we calculate the precision, recall and F-measure for each threshold score of each user.
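The threshold procedure described above can be sketched as follows. This is a minimal sketch rather than the authors' code: the function names and the example scores are hypothetical, and treating a non-zero score exactly equal to the threshold as a false positive is an assumption, since the text does not say how that case is handled.

```python
def evaluate_at_threshold(scores, threshold):
    """Precision, recall and F-measure (Eqs. 6 to 8) for one user at one threshold."""
    # Scores above the threshold count as true positives.
    tp = sum(1 for s in scores if s > threshold)
    # Scores equal to zero count as false negatives.
    fn = sum(1 for s in scores if s == 0)
    # Non-zero scores at or below the threshold count as false positives
    # (the "at" case is an assumption).
    fp = sum(1 for s in scores if 0 < s <= threshold)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f_measure = 2 * tp / denom if denom else 0.0
    return precision, recall, f_measure


def sweep_thresholds(scores, thresholds):
    """Evaluate one user's similarity scores over a range of thresholds."""
    return {t: evaluate_at_threshold(scores, t) for t in thresholds}


# Hypothetical similarity scores of one user's recommendations.
example_scores = [0.0, 0.0, 0.005, 0.02, 0.05, 0.3]
results = sweep_thresholds(example_scores, [0.01, 0.04])
```

Averaging the returned tuples over thresholds gives per-user figures, and averaging over users at a fixed threshold gives per-threshold figures, which are the two kinds of summary reported in the tables that follow.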
User 4590:

Threshold    Precision    Recall    Fmeasure
0.01         0.5654       0.5654    0.5654
0.02         0.553        0.5599    0.5564
0.03         0.5404       0.5542    0.5472
0.04         0.4734       0.5213    0.4969
0.05         0.4188       0.4907    0.4519
0.06         0.4046       0.4821    0.4399
0.07         0.3786       0.4655    0.4176
0.08         0.3708       0.4603    0.4107
...          ...          ...       ...

User 4812:

Threshold    Precision    Recall     Fmeasure
0.01         0.7312       0.7312     0.7312
0.02         0.6616       0.71109    0.6854
0.03         0.5686       0.6790     0.6189
0.04         0.5406       0.66790    0.5975
0.05         0.5144       0.65679    0.57694
0.06         0.4294       0.61501    0.505712
0.07         0.3956       0.5954     0.4753
0.08         0.374        0.5818     0.4553
...          ...          ...        ...

Then we calculate the average of these metrics over all thresholds for each user.

User       Precision    Recall     Fmeasure
1721       0.3221       0.5204     0.3876
4846       0.2947       0.9607     0.3918
4542       0.577819     0.7478     0.6487
200        0.4247       0.5594     0.4797
793        0.3878       0.5328     0.4455
4590       0.3854       0.4635     0.4194
4812       0.3971       0.5765     0.4641
4730       0.3852       0.53801    0.4434
Average    0.41112      0.6244     0.4768

Then we calculate the average of these metrics at each threshold over all users, as per the table below.

Threshold    Avg Precision    Avg Recall    Avg F-measure
0.01         0.7398           0.7479        0.7438
0.02         0.6555           0.7306        0.6894
0.03         0.6052           0.7143        0.6534
0.04         0.5441           0.6988        0.6059
0.05         0.5069           0.6853        0.5752
0.06         0.4522           0.6654        0.5276
0.07         0.4204           0.6496        0.4989
0.08         0.3918           0.6362        0.4730
...          ...              ...           ...

Ideal Threshold: 0.08

The recommendations need to be diverse in nature. Selecting a high threshold would reduce the diversity of the recommendations, whereas selecting a very low threshold would include almost everything, which is more diverse. So we calculate the ideal balanced threshold value with the help of the average of all metrics that we found for each user.

2) Memory Based Collaborative:

For User 5:

Threshold    Precision    Recall    Fmeasure
0.01         0.07061      0.1127    0.0868
0.09         0.0706       0.112     0.0868
0.11         0.0706       0.114     0.864
0.13         0.070        0.119     0.0861

For User 19:

Threshold    Precision    Recall    Fmeasure
0.01         0.0039       0.0067    0.0050
0.09         0.0039       0.0067    0.0050
0.11         0.0037       0.0067    0.0047
0.13         0.0037       0.0067    0.0047

Average of All Users:

Threshold    Precision    Recall    Fmeasure
0.01         0.115        0.161     0.134
0.09         0.0872       0.1266    0.1033
0.11         0.0784       0.115     0.093
0.13         0.07055      0.105     0.084
Avg          0.08         0.125     0.10

Average of Threshold:

Threshold    Precision    Recall    Fmeasure
0.01         0.109        0.151     0.126
0.09         0.0685       0.100     0.0816
0.11         0.0787       0.113     0.0930
0.13         0.0578       0.0863    0.0692
Avg          0.08         0.125     0.10

VII. CONCLUSION AND FUTURE WORK

The proposed Job Recommendation System using Hybrid Filtering aims to be a reliable medium for fresher candidates to get various job recommendations. The system will also be helpful for experienced users, who should have no confusion or uncertainty about the recommendation results. We will use NLP to collect user feedback and then assess the efficiency of our model.

For a better user experience, this model can be embedded inside an Android application, which will be built using the Flutter SDK for Android development. Further, the user experience can be improved by implementing this inside a deep learning model.

REFERENCES

[1] Pradeep Kumar Singh, Pijush Kanti Dutta Pramanik, Avick Kumar Dey, Prasenjit Choudhury. "Recommender Systems: An Overview, Research Trends, and Future Directions", 2021.
[2] Dr. Alka Singhal, Shivangi Rastogi, Nikhil Panchal, Shivani Chauhan, Shradha Varshney. "Research Paper On Recommendation System", Volume 9, Issue 8, August 2021.
[3] Ravita Mishra, Sheetal Vikram Rathi. "Efficient and Scalable Job Recommender System Using Collaborative Filtering". In: Kumar A., Paprzycki M., Gunjan V. (eds) ICDSMLA 2019. Lecture Notes in Electrical Engineering, vol 601. Springer, Singapore, 19 May 2020.
[4] Tanya V. Yadalam, Vaishnavi M. Govda, Vandhita Shiva Kumar, Disha Girish. "Career Recommendation Systems using Content based Filtering". In 5th International Conference on Communication and Electronics Systems (ICCES), 10 July 2020.
[5] Marwa Hussien Mohamed, Mohamed Helmy Khafagy, Mohamed Hasan Ibrahim. "Recommender Systems Challenges and Solutions Survey". In International Conference on Innovative Trends in Computer Engineering (ITCE), 2 February 2019.
[6] Greg Linden, Brent Smith, Jeremy York. "[Link] Recommendation", IEEE Computer Society, 2003.
[7] Kunal Shah, Akshaykumar Salunke, Saurabh Dongare, Kisandas Antala. "Recommender systems: An overview of different approaches to recommendations". In International Conference on Innovations in Information Embedded and Communication Systems (ICIIECS), 2015.
[8] Francesco Ricci, Lior Rokach, Bracha Shapira, Paul B. Kantor. Recommender Systems Handbook, Springer, 2010.
[9] Gomez-Uribe CA, Hunt N (2016). "The Netflix recommender system: algorithms, business value, and innovation". ACM Trans Manag Inf Syst (TMIS) 6(4).
[10] Thiengburanathum P, Cang S, Yu H (2016). "An overview of travel recommendation system". In: IEEE 22nd International Conference on Automation and Computing.
[11] Bell, R., Koren, Y., Volinsky, C. "Modeling relationships at multiple scales to improve the accuracy of large recommender systems". In: KDD '07: Proc. of the 13th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 95-104. ACM, New York, NY, USA (2007).
[12] Karim, J. "Hybrid Systems for Personalized Recommendations". In Research Challenges in Information Science (RCIS), 2014 IEEE Eighth International Conference, May 2014.
[13] E K Subramanian, Ramachandran. "Student Career Guidance System: Recommendation of a course". In International Journal of Recent Technology and Engineering, Volume 7, Issue 6S4, 2019.
[14] Ivens Portugal, Paulo Alencar, Donald Cowan. "The Use Of Machine Learning Algorithms In Recommender Systems: A Systematic Review".
[15] Manish Kumar Singh, Dr. Dinesh Prasad Sahu. "Research Aspects Of The System". In International Journal For Research In Applied Science Engineering Technology (IJRASET), Volume 5, Issue XI, November 2017.
[16] Bhumika Bhatt, Prof. Premal J Patel, Prof. Hetal Gaudani. "A Review Paper On Machine Learning Based Recommendation System". In International Journal Of Engineering Development And Research, Volume 2, Issue 4, 2014.
[17] Kaveri Roy, Aditi Choudhary, J. Jayapradha. "Product Recommendations Using Data Mining And Machine Learning Algorithms". In ARPN Journal Of Engineering And Applied Sciences, Vol. 12, No. 19, October 2017.
[18] Kaustubh Kulkarni, Keshav Wagh, Swapnil Badgujar, Jijnasa Patil. "A Study Of Recommender Systems With Hybrid Collaborative Filtering", Volume 03, Issue 04, Apr-2016.
[19] Priyanka. "A Survey Paper On Various Algorithm's Based Recommender System". In IOSR Journal Of Computer Engineering, Volume 19, Issue 3 (May - June 2017).
[20] Mukta Kohar, Chhavi Rana. "Survey Paper On Recommendation System". In (IJCSIT) International Journal Of Computer Science And Information Technologies, Vol. 3 (2), 2012.
[21] Joeran Beel, Stefan Langer, Marcel Genzmehr, Bela Gipp, Corinna Breitinger, Andreas Nürnberger. "Research Paper Recommendation System: A Quantitative Literature Survey", International Journal On Digital Libraries (2015).
[22] N. Divya, S. Sandhiya, D.R. Anita Sofia Liz, P. Gnanaoli. "A Collaborative Filtering Based Recommender System Using Rating Prediction", International Journal Of Pure And Applied Mathematics, Volume 119, No. 10, 2018.