Development of Faculty Qualification Analysis System Using Naive Bayes Algorithm
Celia L. Verano
Manila High School, Manila, Philippines
Abstract:- Qualification prediction is a crucial process in determining whether an applicant is qualified for a particular position. However, traditional methods of evaluation often rely on the experience and intuition of the evaluator, which may not always be accurate. This study proposed the use of a supervised machine learning approach, specifically the Naïve Bayes algorithm, to predict faculty qualification based on a labeled dataset. The developed Faculty Qualification Analysis System for Perpetual Help College of Manila would allow users to input appropriate test data and generate results of qualified or not qualified. The system's effectiveness and acceptance received ratings of 4.3 and 4.4, with verbal interpretations of very high and strongly acceptable, respectively. The results of this study demonstrated the potential of machine learning algorithms to improve the accuracy and efficiency of qualification prediction processes in educational institutions.

Keywords:- Qualification Prediction, Machine Learning, Naïve Bayes Algorithm.

I. INTRODUCTION

As educational institutions encounter heightened competition in attracting top-tier faculty members, Human Resources (HR) departments face the imperative of implementing efficient screening procedures to identify the most qualified candidates. Perpetual Help College of Manila is no exception. Its Human Resources department must scrutinize applicants' credentials to ascertain compliance with the college's requirements before proceeding to demo teaching sessions. Nevertheless, the prevailing method of manually reviewing printed curriculum vitae can prove time-consuming and could introduce conflicts or errors during the assessment process. To overcome this challenge, this study is anchored on the development of a Faculty Qualification Screening system that uses machine learning and predictive analytics. This innovation can free HR personnel to attend to other responsibilities and important tasks.

In the case of Perpetual Help College of Manila, before a faculty member conducts a demonstration of teaching skills, the Human Resources Department needs to assess whether the applicant's credentials align with the university's faculty requirements. Collaboration between the Human Resources Department and the Dean is integral in the evaluation process to determine the eligibility of candidates for demo teaching.

The current practice involves a manual review of printed curriculum vitae, which can be time-consuming as both parties need to thoroughly read through all the criteria. This manual approach occasionally leads to conflicts arising from differing interpretations of standards or variations in awareness of specific requirements. For some institutions, the search committee's faculty votes are more influential in tenure-track faculty selection decisions than the chair's or dean's votes, with academic accomplishments, interview performance, and presentation skills being key determinants [22]. Others opted for AI adoption in their recruitment process because it leads to efficiency and qualitative gains for both clients and candidates, offering strategic insights into automation and AI implementation [23]. These technologies have the potential to save time and resources while improving the accuracy and efficiency of recruitment. In this study, the Naïve Bayes algorithm improves the traditional method by selecting a subset of attributes, enhancing classification accuracy while reducing computational overhead [1].

Similar to this is the use of artificial intelligence in Indian software companies, which positively impacts the recruitment process, resulting in effective talent acquisition and sustainable development [24]. That study is expected to help organizations formulate recruitment strategies and policy interventions, develop an effective recruitment process that brings qualified talent into the team, meet business competition, and build a sustainable environment. Moreover, several educational institutions have also adopted predictive algorithm-based screening systems, such as a college faculty recruitment system [25], an Android application that helps in sorting out candidates and providing the best results in the recruitment cycle, and a comprehensive campus recruitment and placement system for optimizing the hiring process [26].

Given these successful implementations, a paradigm shift away from traditional faculty recruitment methods is necessary. Adopting a machine learning and predictive algorithm-based screening system can streamline processes, save time, and help ensure that HR hires qualified candidates. With the help of machine learning using the Naïve Bayes algorithm, automated faculty recruitment is possible. The developed system allows greater diversity, which creates a richer learning environment, and ensures continuity in recruiting top talent, which improves the overall quality of education provided at the institution.
II. STATEMENT OF THE PROBLEM

This study was pursued to answer the following problems:

What are the challenges encountered in the existing screening process for faculty applicants to be qualified for teaching demonstrations?
How is the Naïve Bayes algorithm applied to the data sets to predict qualified faculty applicants?
How does the developed prototype address the challenges found in the existing system?
How do the I.T. experts assess the level of effectiveness of the developed prototype in terms of ISO 25010 criteria:

Functional – Correctness;
Efficiency/Performance – Software Capacity;
Usability – user error protection;
Reliability – fault tolerance and recoverability;
Compatibility;
Security;
Maintainability; and
Portability?

How do the users assess the acceptance of the developed prototype in terms of ISO 25010 criteria:

Functional suitability – Completeness and Appropriateness;
Time-behavior and resource utilization;
Appropriateness, learnability, operability, and user interface aesthetics; and lastly
Maturity and availability?

III. THEORETICAL FRAMEWORK

This study employs the predictive modeling theory as illustrated in Figure 1, with a specific focus on supervised machine learning and the Naïve Bayes algorithm. The process begins with data collection to gather pertinent information for the subsequent modeling stages. Feature selection follows, involving the identification of crucial variables that impact the model's outcome. Subsequently, an appropriate model is selected to ensure accurate predictions based on the chosen features. The model is then trained using a designated training set, and its performance is evaluated on carefully selected test data using diverse metrics; parameters, feature selection, or preprocessing techniques are then adjusted to enhance its efficacy [27]. Once optimized, the model is ready for deployment. This workflow provides a systematic framework for constructing and implementing a supervised machine learning model for predictive purposes.

Assumptions of the Study
The current process for determining qualified faculty applicants for teaching demonstrations has deficiencies and inefficiencies that can be addressed through the developed system.

Scope and Delimitation
The study focuses on the development and evaluation of a faculty qualification analysis system utilizing the Naïve Bayes algorithm at the College of Computer Studies, Perpetual Help College of Manila. The key areas of emphasis include the screening process employed by the HR department for selecting faculty for teaching demonstrations, creating the training data set, designing the system's database and architecture, evaluating the system's effectiveness, and assessing user acceptance. Several limitations are inherent in this study. Firstly, the use of historical data from the Perpetual Help College of Manila and a Kaggle dataset may limit how well the results carry over to other institutions or disciplines. Further limitations are that the study does not evaluate hardware equipment for system implementation and does not explore the ethical implications associated with machine learning in faculty selection. Despite these constraints, the study contributes valuable insights into the development and implementation of a faculty qualification screening system using the Naïve Bayes algorithm at Perpetual Help College of Manila, with a need for cautious interpretation due to the identified limitations.
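The predictive modeling workflow described in the theoretical framework (feature selection, training on a designated set, evaluation on test data) can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions: the records, the attribute names `degree` and `license`, and the function names are hypothetical, and a majority-class placeholder stands in where the Naïve Bayes model would be trained.

```python
import random

# Hypothetical labeled records (features, label); not data from the study.
ROWS = [
    ({"degree": "masters", "license": "yes"}, "qualified"),
    ({"degree": "masters", "license": "no"}, "qualified"),
    ({"degree": "doctorate", "license": "yes"}, "qualified"),
    ({"degree": "bachelors", "license": "no"}, "not qualified"),
    ({"degree": "masters", "license": "yes"}, "qualified"),
    ({"degree": "bachelors", "license": "yes"}, "not qualified"),
    ({"degree": "doctorate", "license": "no"}, "qualified"),
    ({"degree": "masters", "license": "yes"}, "qualified"),
]

def select_features(rows, keep):
    """Feature selection: keep only the named attributes of every record."""
    return [({name: features[name] for name in keep}, label) for features, label in rows]

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the records and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train_majority_baseline(train_rows):
    """Placeholder model: always predicts the most frequent training label."""
    labels = [label for _, label in train_rows]
    majority = max(set(labels), key=labels.count)
    return lambda features: majority

def accuracy(predict, test_rows):
    """Evaluation: fraction of test records whose prediction matches the label."""
    correct = sum(1 for features, label in test_rows if predict(features) == label)
    return correct / len(test_rows)

selected = select_features(ROWS, keep=["degree"])
train_rows, test_rows = train_test_split(selected)
model = train_majority_baseline(train_rows)
score = accuracy(model, test_rows)
```

Swapping `train_majority_baseline` for a Naïve Bayes trainer leaves the rest of the workflow unchanged, which is the point of separating the stages.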
IV. RESEARCH METHODOLOGY

The research design and methodology employed in the study used the descriptive developmental method. The descriptive method's primary focus is to describe, compare, analyze, and interpret existing data, aligning with the study's objective of identifying criteria for predicting faculty qualification and evaluating the software results. The development of the Faculty Qualification Analysis System using the Naïve Bayes Algorithm aimed to implement an efficient system for analyzing the credentials of faculty applicants for demo teaching. The design and methodology involved several key phases: data collection and preparation, training data selection, algorithm development, system development tools, evaluation and testing, and project deliverables.

The project commenced with data collection and preparation. The researcher gathered both primary and secondary data, showcasing the current business process of faculty screening of Perpetual Help College of Manila. A proposed system was designed based on the representation of the data flow of the current system. This design aimed at creating a more efficient and faster system, objectively reducing processes and generating results. To enable the system to predict qualified faculty for teaching demonstration, the researcher identified a suitable training data set. The Kaggle data set, considered a foundational resource in data science, was chosen to provide historical data for the system's machine learning training phase. The study extensively explored and employed the Naïve Bayes algorithm for predicting the probability of faculty qualification. The algorithm development phase involved designing the system, including the database, user interface, and logic design. The features of the Faculty Qualification Analysis System were carefully deliberated based on the expected project output. The development process adhered to a structured and systematic approach based on the Waterfall Model, a well-established model in the System Development Life Cycle (SDLC). Figure 2 provides a visual representation of the various steps undertaken in the development process.

The project utilized specific software tools for its development, operating on the Windows 10 platform. The primary programming language employed was Python, known for its simplicity, readability, and support for modularity and code reuse. The Spyder IDE, included with Anaconda, served as the integrated development environment for editing, testing, and debugging the Python code. Additionally, CSV files were utilized for data storage.

The user acceptance and IT expert evaluation were integral components of the project design, following the ISO/IEC 25010 standard. The evaluation considered factors such as the refined Naïve Bayes Algorithm, the system's functionality, and its efficiency in predicting faculty qualification. The project's deliverables encompassed various elements, including data collected from printed curriculum vitae and historical Kaggle datasets, a refined Naïve Bayes Algorithm, a fully functional Faculty Qualification Analysis System, and comprehensive reports on user acceptance and system effectiveness as evaluated by IT experts. These deliverables collectively reflected the successful implementation of the designed project.

Sources of Data
The data for this study were sourced from multiple channels, employing a mix of primary and secondary data. The primary sources included interviews and focus group discussions, providing firsthand insights and opinions from relevant stakeholders: four (4) IT experts, four (4) College Deans, and two (2) HRD staff members. This strategic sampling approach allowed for targeted insights from individuals directly involved in the faculty qualification screening process, ensuring a focused and relevant data collection process. The gathered data from these interactions underwent thorough analysis, leading to the formulation of the study's objectives. The historical data were extracted from the Kaggle dataset, a repository of information for machine learning training, while the data from the survey conducted were used to gauge the system's effectiveness and acceptance. Collectively, these diverse sources of data contributed to a holistic and informed approach to conducting the study.
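Since the project stores its data in CSV files and was written in Python, loading a training set might look like the sketch below. It uses only the standard `csv` module. The column names, the sample rows, and the cleaning rules (lower-casing, dropping incomplete and duplicate records) are assumptions made for illustration; the study's actual file layout is not reproduced in this section.

```python
import csv
import io

# Hypothetical CSV content; the column names are assumptions for illustration only.
RAW_CSV = """degree,experience,license,label
Masters,5+,Yes,Qualified
masters,5+,yes,qualified
Bachelors,,No,Not Qualified
Doctorate,0-2,Yes,Qualified
"""

def load_and_clean(csv_text):
    """Read rows, normalise the text, and drop incomplete rows and duplicates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    cleaned = []
    for row in reader:
        normalised = {key: value.strip().lower() for key, value in row.items()}
        if any(value == "" for value in normalised.values()):
            continue  # incomplete record
        fingerprint = tuple(normalised.items())
        if fingerprint in seen:
            continue  # duplicate once case and whitespace are normalised
        seen.add(fingerprint)
        cleaned.append(normalised)
    return cleaned

records = load_and_clean(RAW_CSV)
```

With a real file, `io.StringIO(csv_text)` would be replaced by a file object opened with `newline=""`, as the `csv` module documentation recommends.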
The acceptance level of the proposed system was evaluated using ISO/IEC 25010:2011, with a five-point scale used to measure its level of acceptance. This evaluation helps determine how well the system meets the desired criteria, as presented in Table 2.

Table 2 Measurement Ratings - Level of Acceptance
Assigned Point   Numerical Range   Categorical Response   Verbal Interpretation
5                4.51 – 5.00       Strongly Agree         Strongly Acceptable
4                3.51 – 4.50       Agree                  Acceptable
3                2.51 – 3.50       Neutral                Neutral
2                1.51 – 2.50       Disagree               Unacceptable
1                1.00 – 1.50       Strongly Disagree      Strongly Unacceptable

Instrumentation and Validation
The researcher utilized the international standard ISO/IEC 25010:2011 to assess the acceptability of the software. This standard outlines eight key software characteristics, namely functional suitability, performance efficiency, compatibility, usability, reliability, security, maintainability, and portability. These characteristics served as the basis for evaluating the software's quality, ensuring that it met recognized industry standards.

System Architecture

Fig 3 System Architecture

The system architecture employed in this study adheres to a systematic process as seen in Figure 3. The initial step involves acquiring historical data, which forms the foundational basis for the analysis. Following this, the data undergoes a meticulous preprocessing phase that includes both extraction and cleaning processes. Extraction involves the identification and selection of pertinent data for analysis, while cleaning entails eliminating redundant or irrelevant information, rectifying errors, and converting the data into a required format. Subsequently, a comprehensive dataset is created, incorporating both training and test data, which serves as the basis for accurate prediction and analysis. The training dataset plays a crucial role in instructing the machine learning model because the model relies on patterns and relationships identified within the data to recognize trends. After model training, an evaluation phase follows, where the accuracy and effectiveness of the model are assessed by testing it on a separate portion of the dataset, specifically the curriculum vitae dataset. This validation step ensures that the system's predictions are aligned with the actual values stored in the database. Finally, the Naïve Bayes algorithm is applied to the prepared and validated dataset, enabling computation and analysis to determine the qualification status of faculty members. This system architecture provides a methodical process, encompassing data acquisition, preprocessing, dataset creation, and algorithmic utilization for accurate faculty qualification prediction.

Data Gathering Procedures
For this study, a comprehensive data-gathering procedure was employed to ensure the acquisition of relevant and valuable information. The process included the collection of data from diverse online sources, such as studies, research papers, documents, and articles, with a specific focus on the proposed Faculty Qualification Analysis System and the application of the Naïve Bayes algorithm, as outlined by Putra et al. in 2020. To assess the effectiveness and user acceptance of the system, a survey method was chosen as the primary data collection tool. The researcher designed survey questions aligned with the standards set by ISO/IEC 25010:2011, reflecting various software quality characteristics. Subsequently, the researcher personally administered the questionnaires to the identified participants. The survey responses were systematically collected and documented for further analysis. The collected data underwent statistical analysis to draw meaningful conclusions and insights. This analytical process allowed the researcher to evaluate the system's effectiveness and user acceptance, providing a foundation for formulating conclusions and offering recommendations for the study. The combination of online source analysis and survey data collection ensured a well-rounded approach to gathering information for the research study.

Testing Procedures
Black box testing was employed for the Faculty Qualification Analysis System. It aimed to evaluate the software's functionality from an end-user perspective without delving into the internal code of the system. The testing process adhered to the system's requirements and expected functionality, focusing on diverse aspects such as the user interface, APIs, database, security, client/server applications, and overall system functionality. Further testing measures were implemented, including install/uninstall testing. This comprehensive approach ensured a thorough evaluation of all system components.
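The five-point scale in Table 2 lends itself to a small lookup that turns a mean survey score into its verbal interpretation. The sketch below transcribes the ranges from Table 2; the function name and the decision to round means to two decimals before comparison are assumptions, not details given in the study.

```python
# (lower bound, assigned point, categorical response, verbal interpretation),
# transcribed from Table 2 and ordered from the highest range to the lowest.
ACCEPTANCE_SCALE = [
    (4.51, 5, "Strongly Agree", "Strongly Acceptable"),
    (3.51, 4, "Agree", "Acceptable"),
    (2.51, 3, "Neutral", "Neutral"),
    (1.51, 2, "Disagree", "Unacceptable"),
    (1.00, 1, "Strongly Disagree", "Strongly Unacceptable"),
]

def interpret_mean(mean_score):
    """Return (point, response, interpretation) for a mean rating in [1, 5]."""
    if not 1.0 <= mean_score <= 5.0:
        raise ValueError("mean score must lie between 1.00 and 5.00")
    rounded = round(mean_score, 2)  # assumption: means are reported to two decimals
    for lower, point, response, verbal in ACCEPTANCE_SCALE:
        if rounded >= lower:
            return point, response, verbal
    raise AssertionError("unreachable for a valid mean score")

result = interpret_mean(4.75)
```

Keeping the scale as data rather than as a chain of conditions makes it easy to check the code against the table row by row.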
For a more concrete illustration, let's consider an example from javatpoint.com. Suppose we have a training dataset comprising weather conditions and a corresponding target variable "Play," aiming to predict whether a player should play or not. The dataset is presented in the table below.

outcome. This helps us determine the most likely class for the given data based on the calculated probabilities. The training data set is to be used in predicting qualified faculty applicants. To come up with a training data set, the researcher went through the process of cleaning and interpreting the data. Table 7 is the result.
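The weather illustration above can be worked through in code. The following is a minimal sketch of a categorical Naïve Bayes classifier built from frequency (look-up) tables. The rows are hypothetical and are not the table from the cited example, the `windy` attribute is added purely for illustration, and no smoothing is applied, so an attribute value never seen with a class contributes a probability of zero.

```python
from collections import Counter, defaultdict

# Hypothetical rows for illustration; not the table from the cited example.
ROWS = [
    ({"outlook": "sunny", "windy": "no"}, "yes"),
    ({"outlook": "sunny", "windy": "yes"}, "no"),
    ({"outlook": "overcast", "windy": "no"}, "yes"),
    ({"outlook": "rainy", "windy": "no"}, "yes"),
    ({"outlook": "rainy", "windy": "yes"}, "no"),
    ({"outlook": "overcast", "windy": "yes"}, "yes"),
]

def fit(rows):
    """Build the frequency tables: class counts and per-class value counts."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (attribute, label) -> counts of values
    for features, label in rows:
        for attribute, value in features.items():
            value_counts[(attribute, label)][value] += 1
    return class_counts, value_counts

def posteriors(model, features):
    """Return P(class | features) for every class, normalised by the evidence."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for attribute, value in features.items():
            score *= value_counts[(attribute, label)][value] / count  # P(value | class)
        scores[label] = score
    evidence = sum(scores.values())
    if evidence == 0:
        raise ValueError("attribute combination never observed for any class")
    return {label: score / evidence for label, score in scores.items()}

model = fit(ROWS)
result = posteriors(model, {"outlook": "sunny", "windy": "yes"})
prediction = max(result, key=result.get)
```

For the query shown, the unnormalised scores are (4/6)(1/4)(1/4) for "yes" and (2/6)(1/2)(2/2) for "no", so the predicted class is "no" with posterior 0.8.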
Table 8 Sources of Data

Upon evaluating the historical data, the researcher must undertake a crucial step: data conversion, depicted in Table 9. This process is essential to enable the Naïve Bayes algorithm to analyze and process the data, ensuring accurate predictions by appropriately transforming each attribute for the algorithm. Once the data conversion is complete, the training data is ready for the next step, the prediction of probabilities using the Naïve Bayes algorithm.
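One common form of data conversion is to replace each categorical attribute value with an integer code. The sketch below shows that idea only as an assumption: the conversion rules actually used in the study are those of Table 9, which this sketch does not reproduce, and the attribute names and values are hypothetical.

```python
# Hypothetical cleaned records; attribute names and values are illustrative only.
RECORDS = [
    {"degree": "masters", "license": "yes"},
    {"degree": "bachelors", "license": "no"},
    {"degree": "masters", "license": "no"},
]

def build_encoders(rows):
    """Map each attribute's distinct values to integer codes in first-seen order."""
    encoders = {}
    for row in rows:
        for attribute, value in row.items():
            codes = encoders.setdefault(attribute, {})
            codes.setdefault(value, len(codes))
    return encoders

def encode(rows, encoders):
    """Convert every record into a list of integer codes, one per attribute."""
    return [[encoders[attribute][row[attribute]] for attribute in encoders] for row in rows]

encoders = build_encoders(RECORDS)
encoded = encode(RECORDS, encoders)
```

The `encoders` dictionary should be kept alongside the model so that new applicant data is converted with the same codes as the training data.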
Table 9 Attributes
Table 12 shows the computed probabilities for sample Evidence 1, P(x), where x denotes the attributes that serve as the predictors. Using the Naïve Bayes formula and the look-up table in Figure 11 with the given attribute conditions, the probabilities of qualified and not qualified are computed.

To provide another test, this time for the not qualified result, the probability of Evidence 2 is presented in Table 13.

Table 13 Probability of Evidence 2
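For reference, the computation described for Evidence 1 and Evidence 2 follows the standard Naïve Bayes form of Bayes' theorem. The notation below is introduced here for illustration and may differ from the symbols used in the study's tables: c is a class (Qualified or Not Qualified), x = (x_1, ..., x_n) are the attribute values of one applicant, and the conditional probabilities P(x_i | c) are read from the look-up table.

```latex
P(c \mid x) = \frac{P(c)\,\prod_{i=1}^{n} P(x_i \mid c)}{P(x)},
\qquad
P(x) = \sum_{c' \in \{\text{Qualified},\ \text{Not Qualified}\}} P(c')\,\prod_{i=1}^{n} P(x_i \mid c')
```

The applicant is assigned to whichever class has the larger posterior P(c | x). Because P(x) is the same for both classes, comparing the numerators is enough to decide.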