Capstone Project Final Report - 19BCE2549

The document presents a thesis titled 'A Smart Conversational AI Chatbot to Guide College Freshers' submitted by Nishit Shah and Abhishek Mukherjee for their Bachelor of Technology in Computer Science and Engineering at VIT University. The chatbot aims to assist new students in navigating academic and administrative queries using a BERT-based model for accurate information retrieval. The project highlights the importance of providing a centralized resource for freshers to streamline their information-seeking process.


A Smart Conversational AI Chatbot to

Guide College Freshers

Submitted in partial fulfillment of the requirements for the degree of

Bachelor of Technology

in

Computer Science and Engineering

by
Nishit Shah
19BCE2549
Abhishek Mukherjee
19BCE0598

Under the guidance of

Dr. Ilayaraja V

School of Computer Science & Engineering (SCOPE)

VIT , Vellore

May, 2023
DECLARATION

I hereby declare that the thesis entitled “A Smart Conversational AI Chatbot to Guide
College Freshers” submitted by me, for the award of the degree of
Bachelor of Technology in Computer Science and Engineering to VIT, is a record of
bonafide work carried out by me under the supervision of Dr. Ilayaraja V.
I further declare that the work reported in this thesis has not been submitted and
will not be submitted, either in part or in full, for the award of any other degree or
diploma in this institute or any other institute or university.

Place: Vellore

Date : 13th May 2023

Signature of the Candidate


CERTIFICATE

This is to certify that the thesis entitled “A Smart Conversational AI Chatbot to Guide
College Freshers” submitted by Nishit Shah (19BCE2549), School of Computer Science
and Engineering, VIT, for the award of the degree of Bachelor of Technology in Computer
Science and Engineering, is a record of bonafide work carried out by him / her under my
supervision during the period, 01.07.2022 to 30.04.2023, as per the VIT code of academic
and research ethics.

The contents of this report have not been submitted and will not be submitted either
in part or in full, for the award of any other degree or diploma in this institute or any other
institute or university. The thesis fulfills the requirements and regulations of the University
and in my opinion meets the necessary standards for submission.

Place : Vellore
Date : Signature of the Guide

Internal Examiner External Examiner

Dr Vairamuthu S

B.Tech Computer Science and Engineering


ACKNOWLEDGEMENTS

We would like to express our sincere gratitude to VIT University for providing us with the
opportunity to work on this capstone project. The resources and support provided by the
university have been instrumental in the successful completion of this project.
We would also like to extend our heartfelt thanks to the Chancellor of VIT University, Dr.
G. Viswanathan, for his guidance and support throughout this project. His vision and
leadership have been a constant source of inspiration for us.
We would also like to express our appreciation to our project guide, Dr. Ilayaraja V, who
has been instrumental in guiding us through the project. His expertise and insights have
been invaluable in shaping the direction of this project.
Finally, we would like to thank all the faculty members and staff of VIT University for
their support and encouragement. Their contributions have been essential to the success of
this project, and we are grateful for their assistance.

Abhishek Mukherjee
(19BCE0598)
Nishit Shah
(19BCE2549)

Executive Summary
The project is a Question-Answering (QA) chatbot for freshers at VIT University. The
chatbot helps students find information related to academics, events, fests, the
FFCS course registration system, etc. The chatbot is based on the BERT (Bidirectional
Encoder Representations from Transformers) model and the SQuAD (Stanford Question
Answering Dataset) dataset. The chatbot works on the principle of 'Reading
Comprehension' to extract the required information from the dataset and provide accurate
answers to users' queries.
This chatbot uses retrieval-based methods, which are effective in providing specific
answers according to the user's needs. The SQuAD dataset is a rich source of training and
evaluation data, which can be used to fine-tune the BERT model to generate contextual
embeddings for the question-answer pairs.
The chatbot takes user queries, retrieves relevant passages from the dataset, and provides
accurate answers in the English language. This system streamlines the process of resolving
students' questions by providing a single point of contact. The dataset utilized is the official
information extracted from the university's academic regulation document and website.
Overall, the project has the potential to be a useful tool for new students at VIT University.

CONTENTS

Acknowledgement
Executive Summary
Table of Contents
List of Figures
List of Tables
Abbreviations
Symbols and Notations

1. Introduction
1.1. Theoretical Background
1.2. Motivation
1.3. Aim of the Proposed Work
1.4. Objective(s) of the Proposed Work
2. Literature Survey
2.1. Survey of the Existing Models/Work
2.2. Summary/Gaps identified in the Survey
3. Overview of the Proposed System
3.1. Introduction and Related Concepts
3.2. Framework, Architecture or Module for the Proposed System
3.3. Proposed System Model
4. Proposed System Analysis and Design
4.1. Introduction
4.2. Requirement Analysis
4.2.1. Functional Requirements
4.2.1.1. Product Perspective
4.2.1.2. Product features
4.2.1.3. User characteristics
4.2.1.4. Assumption & Dependencies
4.2.1.5. Domain Requirements
4.2.1.6. User Requirements
4.2.2. Non Functional Requirements
4.2.2.1. Product Requirements
4.2.2.1.1. Efficiency (in terms of Time and Space)
4.2.2.1.2. Reliability
4.2.2.1.3. Portability
4.2.2.1.4. Usability
4.2.2.2. Organizational Requirements
4.2.2.2.1. Implementation Requirements
4.2.2.2.2. Engineering Standard Requirements
4.2.2.3. Operational Requirements
● Economic
● Environmental
● Social
● Political
● Ethical
● Health and Safety
● Sustainability
● Legality
● Inspectability
4.2.3. System Requirements
4.2.3.1. H/W Requirements
4.2.3.2. S/W Requirements
5. Results and Discussion
6. References
APPENDIX A

List of Figures

Figure No. Title

1 System Design
2 ER Diagram
3.1 Instructions
3.2 Sample Questions
3.3 Categories
3.4 Question Answering
3.5 Chat Continuation Prompt
3.6 Ending the chat

List of Tables

Table No. Title

1.1 Literature Survey
1.2 Gaps and Summary of Papers

List of Abbreviations
NLP Natural Language Processing
BERT Bidirectional Encoder Representations from Transformers
SQuAD Stanford Question Answering Dataset
FFCS Fully Flexible Credit System
VIT Vellore Institute of Technology
LSTM Long Short Term Memory
DNN Deep Neural Network
API Application Programming Interface
AI Artificial Intelligence
ELMo Embeddings from Language Models
MRC Machine Reading Comprehension
UI User Interface
LLM Large Language Model
AWS Amazon Web Services
CI/CD Continuous Integration / Continuous Deployment

Symbols and Notations


1. INTRODUCTION

1.1. Theoretical Background

A chatbot, or chatterbot, is a software programme that conducts an online chat conversation
using text or text-to-speech instead of offering direct contact with a real human agent.
Many websites have installed a chatbot so that users have a dependable place to ask questions,
and chatbots are heavily utilized to answer user inquiries. They are used in e-commerce,
banking and many other websites where user queries need to be resolved, as an interactive
way to help users without manual intervention. Chatbots are used in dialogue systems for a
variety of tasks, including information collecting, request routing, and customer service.
While some chatbot solutions employ advanced natural language processing, word
categorization, and AI, others only scan for broad keywords and construct responses using
standard expressions taken from a connected library or database.

1.2. Motivation

When freshers join college, they need to figure out a lot of things in VIT, such as the location
of the academic buildings, events and fests, and FFCS, the unique course registration system in
VIT. The students may have to approach different departments or authorities for answers to
their queries, which wastes time and causes considerable inconvenience. Even though the
resources may be available in the student portal or on the university’s official website,
finding the correct context in a long handbook is a tedious and inefficient
process. In the worst case, no resources are available or the concerned authority is
absent during urgent cases. Question answering (QA) chatbots have become increasingly
popular as a means of providing personalized, timely and accurate information to users. In
recent years, there has been significant progress in the development of QA systems using
deep learning techniques, particularly the BERT (Bidirectional Encoder Representations from
Transformers) model.

1.3. Aim of the Proposed Work

The aim of the work is to build a chatbot that serves as a one-stop solution for college
freshers to get their queries answered. The chatbot will be integrated with VTOP and accessible
to all students, with the primary target users being college freshers. The goal is to provide a
smooth, reliable and accurate process for students to get their questions answered without
having to approach specific authorities or seniors.

1.4. Objective(s) of the Proposed Work

Our main objective in building this product is to resolve the queries of confused freshers. As
seniors, we observed many freshers asking questions regarding various aspects of college. We
thought of making a one-stop solution for all freshers' queries so that they don't have to
approach a specific authority or seniors to get their inquiries answered. Whatever the question
may be, they can rely on the chatbot, which would be easily accessible to all freshers. Our
target users are college freshers; however, we do not intend to exclude seniors. Since we aim to
integrate with VTOP, all students must be able to access it. At the end of building the product,
we expect a smooth process for students to get their questions answered, and a product that is
usable and highly reliable. Additionally, we expect the chatbot to give the most accurate answer.

2. Literature Survey

2.1. Survey of Existing Models/Work

Title: Implementation of a chatbot system using AI and NLP
Author and Year: Lalwani, T., Bhalotia, S., Pal, A., Rathod, V., & Bisen, S. (2018)
Description: The researchers have implemented a college enquiry chatbot that is one
source for all types of information available on the college website. It has an effective
user interface and answers queries related to examination cell, admission, academics,
users’ attendance and grade point average, placement cell.

Title: Feasibility Study of a BERT-based Question Answering Chatbot for Information
Retrieval from Construction Specifications
Author and Year: Kim, J., Chung, S., Moon, S., & Chi, S. (2022, December)
Description: The researchers have developed a chatbot to extract specific information
from construction specifications as a user wants. The model is built on top of the BERT
language model. By taking advantage of the pre-trained BERT, user-wanted information
was successfully extracted from construction specifications.

Title: A Proposed Chatbot Framework for COVID-19
Author and Year: Amer, E., Hazem, A., Farouk, O., Louca, A., Mohamed, Y., & Ashraf, M.
(2021, May)
Description: The researchers have developed a chatbot for answering questions related
to COVID-19. The model uses a pre-trained Google BERT language model. BERT is being
used for text classification to categorize text input to various categories. The actual
application is to query the domain for answers. The model is trained using SQUAD 2.0, a
famous question answering dataset.

Title: Intelligent College Enquiry Bot using NLP and Deep Learning based techniques
Author and Year: Nikhath, A. K., Rab, M. A., Bharadwaja, N. V., Reddy, L. G.,
Saicharan, K., & Reddy, C. V. M. (2022, January)
Description: The researchers have developed an ‘Intelligent Enquiry Bot’ to answer their
user’s queries with valid responses. The model used for chatbots used Long Short Term
Memory (LSTM) networks, a special kind of Deep Neural Networks (DNN). The paper goes
through various stages of development of the chatbot.

Title: Chatbot for college website
Author and Year: Shivam, K., Saud, K., Sharma, M., Vashishth, S., & Patil, S. (2018)
Description: The authors propose a chatbot application, developed with Facebook API and
webhook, that makes college related information easily accessible to the students. This
saves time and reduces the work of staff and administration. Moreover, it can be used by
students as well as parents to get their queries answered.

Title: Survey of various AI Chatbot based on technology used
Author and Year: Singh, S., & Thakur, H. K. (2020, June).
Description: They discussed different techniques and approaches to developing a chatbot.
They have briefly described these techniques and approaches and given a suitable
real-time example for each. They have compared all these, based on different criteria
such as technical workflow and type.

Title: A Survey on Conversational Agents/Chatbots Classification and Design Techniques
Author and Year: Hussain, S., Ameri Sianaki, O., & Ababneh, N. (2019, March)
Description: They have examined different design techniques and approaches for task
oriented and non task oriented chatbots. They have further categorized these approaches
into three parts and each of these parts follow multiple design techniques.

Title: A Question Answering and Quiz Generation Chatbot for Education
Author and Year: Sreelakshmi, A. S., Abhinaya, S. B., Nair, A., & Nirmala, S. J.
(2019, November)
Description: The researchers primarily focused on developing a chatbot in the education
domain especially for primary and middle school students. They divided the study into
two modules; Question Answering module and Quiz generation module.

Title: BERT for question answering on SQuAD 2.0. Stanford University Report.
Author and Year: Zhang, Y., & Xu, Z. (2019).
Description: The authors have provided a series of experimentations using the
Huggingface Pytorch BERT implementations for question answering on Stanford Question
and Answering dataset.

Title: Towards interpreting BERT for reading comprehension based QA.
Author and Year: Ramnath, S., Nema, P., Sahni, D., & Khapra, M. M.
Description: In this paper, the authors define a BERT layer’s role or functionality
using Integrated Gradients and perform preliminary analyses on all layers.

Table 1.1 Literature Survey

2.2. Summary/Gaps Identified in the Survey

Title: Implementation of a chatbot system using AI and NLP
Summary: This paper summarizes the fast and reliable college chatbot developed to fetch
information regarding examinations and placements in the university and display answers
to the students.
Gaps Identified: The chatbot could not answer queries from all the domains of college
such as sports and medical facilities, hence reducing the benefits of this chatbot.

Title: A literature survey of recent advances in chatbots.
Summary: This paper mentions limitations and challenges present in current technology
and provides advice to improve future chatbots.
Gaps Identified: Gap in terms of application between industry models and current
advancement in the sector.

Title: A Proposed Chatbot Framework for COVID-19
Summary: Using the BERT algorithm, they have developed a chatbot that can answer queries
related to COVID-19. Their proposed system has been trained and tested with the Stanford
Question Answering Dataset.
Gaps Identified: The chatbot needs to be improved in terms of robustness and accuracy.
It can answer only common questions related to COVID-19.

Title: Intelligent College Enquiry Bot using NLP and Deep Learning based techniques
Summary: Developed a smart chatbot with a web interface to clarify queries asked by
students, teachers and parents. It is based on Natural Language Processing and Long
Short Term Memory.
Gaps Identified: Simple algorithm with medium level of accuracy.

Title: Chatbot for college website
Summary: A web application based chatbot to answer queries related to college in order
to reduce the workload of students and increase time and energy efficiency.
Gaps Identified: Used a database to store answers, questions, keywords and feedback
messages. This makes the chatbot a restricted product: only a prescribed set of
questions could be answered.

Title: Survey of various AI Chatbot based on technology used
Summary: Created a survey of chatbot technologies and approaches up to the latest model
to make developers understand them easily.
Gaps Identified: Limited to generative models of chatbots.

Title: A Survey on Conversational Agents/Chatbots Classification and Design Techniques
Summary: The paper focuses on chatbot classification with their design techniques used
in earlier and modern chatbots and how two main categories handle conversation context.
Gaps Identified: Does not provide technical results to prove the changes from earlier
chatbots to modern chatbots. The discussion is only theoretical and does not focus on
algorithms.

Title: A Question Answering and Quiz Generation Chatbot for Education
Summary: They have developed a chatbot to conduct quizzes based on the uploaded
document. The questions are generated from the context in the uploaded document and
presented in quiz format.
Gaps Identified: Does not provide a sufficient answer. Cannot generate questions from
consolidated data from all uploaded documents.

Title: BERT for question answering on SQuAD 2.0. Stanford University Report.
Summary: The authors have focused on machine reading comprehension and BERT for question
answering.
Gaps Identified: Lack of understanding of the mechanism to balance the model’s
performance on both has-answer and no-answer predictions.

Title: Towards interpreting BERT for reading comprehension based QA.
Summary: In this paper, the authors have introduced a new preliminary layer which
focuses more on contextual understanding and enhancing the answer prediction.
Gaps Identified: Potential for further research on the purpose and understanding the
working mechanism of BERT relating to accuracy.

Table 1.2 Gaps and summaries of papers

3. Overview of the Proposed System


3.1. Introduction and Related Concepts
In this project, we propose to develop a QA chatbot using BERT and the SQuAD dataset. The
chatbot works on the principle of ‘Reading Comprehension’ to get the required
answer. It is a retrieval-based chatbot that can extract specific information as per what the
user wants. The chatbot takes user queries, retrieves relevant passages from our dataset of
information regarding VIT, and uses our trained model to provide accurate answers in the
English language. With this approach, a variety of questions can be responded to flexibly
without time-consuming manual tasks such as labelling. The SQuAD (Stanford Question
Answering Dataset) project provides a large corpus of over 100,000 annotated
question-answer pairs for training and evaluating such models. We have fine-tuned the BERT model
on the SQuAD dataset to generate contextual embeddings for the question-answer pairs. The
knowledge base that the chatbot answers from is the official information extracted from the
university’s academic regulation document and website.

3.2. Framework, Architecture or Module for the Proposed System

Fig 1 System Design

Firstly, the chatbot displays a list of ten categories related to VIT with a designated number
for each of them. The categories are course registration/FFCS, hostels, examinations/
assessments, Pass/Fail criteria, facilities, credits, semester, summer/weekend/intersession
semesters, curriculum and events. The user enters the number of the category they want to
ask questions about. On entering the desired number, the context related to
that category is sent for processing as the knowledge base. However, hostels and facilities
have subcategories, so there is an additional step of asking for the designated number of the
subcategory. Finally, the user is asked to enter their query related to the chosen category.
The question must be clear and specific with minimum grammatical errors.
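
The category-selection step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the numbering 1 to 10 simply follows the order in which the categories are listed, and the subcategory names, the `select_context` helper and the bracketed context strings are all hypothetical placeholders.

```python
# Illustrative sketch only: the numbering, the subcategory names and the
# context strings are placeholders, not the project's actual data.
CATEGORIES = {
    1: "course registration/FFCS",
    2: "hostels",
    3: "examinations/assessments",
    4: "Pass/Fail criteria",
    5: "facilities",
    6: "credits",
    7: "semester",
    8: "summer/weekend/intersession semesters",
    9: "curriculum",
    10: "events",
}

# Only hostels and facilities have subcategories; their names are hypothetical.
SUBCATEGORIES = {
    2: {1: "hostel subcategory 1", 2: "hostel subcategory 2"},
    5: {1: "facility subcategory 1", 2: "facility subcategory 2"},
}


def build_placeholder_knowledge_base():
    """Map (category, subcategory) keys to placeholder context passages."""
    knowledge_base = {}
    for number, name in CATEGORIES.items():
        if number in SUBCATEGORIES:
            for sub_number, sub_name in SUBCATEGORIES[number].items():
                knowledge_base[(number, sub_number)] = f"[context: {name} / {sub_name}]"
        else:
            knowledge_base[(number, None)] = f"[context: {name}]"
    return knowledge_base


def select_context(knowledge_base, category, subcategory=None):
    """Return the context passage for the user's menu choices."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category number: {category}")
    if category in SUBCATEGORIES:
        # Extra step: these categories also need a subcategory number.
        if subcategory not in SUBCATEGORIES[category]:
            raise ValueError(f"category {category} needs a valid subcategory number")
        return knowledge_base[(category, subcategory)]
    return knowledge_base[(category, None)]


knowledge_base = build_placeholder_knowledge_base()
print(select_context(knowledge_base, 1))     # category without subcategories
print(select_context(knowledge_base, 2, 1))  # hostels plus a subcategory number
```

Keying the knowledge base on the pair (category, subcategory) keeps the two-step menu explicit, so a missing subcategory number is reported as an error instead of silently selecting the wrong passage.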

Based on the category (and subcategory, where applicable) chosen by the user during question
processing, the relevant text is selected as the context. The model builds on pre-trained
Bidirectional Encoder Representations from Transformers (BERT), whose contextual embeddings
help it locate the desired answer. Once the relevant context has been chosen,
the reader (model) reads through the context and extracts the answer by marking the
start and end positions of the answer. The text enclosed within the markers is then used
by answer processing to be displayed to the user.

After the query has been entered and the related knowledge base has been chosen, both
are sent to our model for answer extraction. The extracted answer is
then further refined to display the most accurate answer.
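
As a rough illustration of how a reader can mark the start and end positions of an answer, the sketch below picks the span whose start score plus end score is highest, subject to the end not preceding the start. It assumes the model has already produced one start score and one end score per context token; the function names, the `max_answer_len` limit and the hand-picked scores are assumptions for illustration, not the project's implementation or real model output.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing start score + end score."""
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        # The end may not precede the start, and the span length is capped.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


def extract_answer(tokens: List[str], start_scores: List[float],
                   end_scores: List[float]) -> str:
    """Join the tokens enclosed by the best start/end markers."""
    start, end = best_span(start_scores, end_scores)
    return " ".join(tokens[start:end + 1])


# Toy example with hand-picked scores (not real model output).
tokens = "It has also allowed for the rapid spread of technologies and ideas .".split()
start_scores = [0.0] * len(tokens)
end_scores = [0.0] * len(tokens)
start_scores[9] = 5.0   # "technologies"
end_scores[11] = 4.0    # "ideas"
print(extract_answer(tokens, start_scores, end_scores))
```

A real tokenizer would split the text into word pieces rather than whitespace tokens, so the answer-processing step would also need to merge pieces back into words before display.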

3.3. Proposed System Model

Fig 2 ER Diagram
4. Proposed System Analysis and Design


4.1. Introduction
Bidirectional Encoder Representations from Transformers (BERT) creates contextual
embeddings. The conventional workflow for BERT consists of two stages: pre-training and
fine-tuning. Because deep transformers are pre-trained on carefully designed bidirectional
language modeling tasks, the pre-trained BERT representations can be fine-tuned later with one
additional output layer to perform well in a wide range of tasks without substantial
task-specific architecture. The BERT model architecture is a multi-layer bidirectional transformer
encoder. During the training process, each token is first converted to its input
embedding, which is the sum of the token embedding, the segment embedding, and the position
embedding. BERT is then pre-trained on the masked language modeling and next sentence
prediction tasks. A pre-trained BERT model can easily be adapted to span prediction tasks.
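
The way the three embeddings are combined can be illustrated with a small sketch. It uses tiny random lookup tables in plain Python in place of BERT's learned tables, so the sizes (a vocabulary of 10, 4 dimensions, 8 positions) and the helper names are assumptions chosen only for readability. The real model also applies layer normalization and dropout after the sum, which are omitted here.

```python
import random


def make_table(rows, dim, rng):
    """Create a toy lookup table of `rows` random vectors with `dim` entries each."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(rows)]


def input_embeddings(token_ids, segment_ids, token_table, segment_table, position_table):
    """Sum token, segment and position embeddings for each input position."""
    embedded = []
    for position, (token_id, segment_id) in enumerate(zip(token_ids, segment_ids)):
        vector = [
            t + s + p
            for t, s, p in zip(
                token_table[token_id],
                segment_table[segment_id],
                position_table[position],
            )
        ]
        embedded.append(vector)
    return embedded


rng = random.Random(0)
DIM = 4
token_table = make_table(10, DIM, rng)    # toy vocabulary of 10 token ids
segment_table = make_table(2, DIM, rng)   # segment 0 = question, 1 = context
position_table = make_table(8, DIM, rng)  # up to 8 positions

token_ids = [3, 7, 1, 5]
segment_ids = [0, 0, 1, 1]
vectors = input_embeddings(token_ids, segment_ids,
                           token_table, segment_table, position_table)
print(len(vectors), len(vectors[0]))
```

Because the position embedding is added to every token, the same token id at two different positions ends up with two different input vectors.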

One of the main reasons for the good performance of BERT on different NLP tasks is
semi-supervised learning: the model is first trained on a task that enables it
to understand the patterns of the language. After this training, the model (BERT) has language
processing capabilities that can be used to empower other models that we build and train
using supervised learning. A word embedding is the projection of a word to
a vector of numerical values based on its meaning. ELMo differed from earlier
embeddings because it gives a word an embedding based on its context, i.e. contextualized
word embeddings. To generate the embedding of a word, ELMo looks at the entire sentence
instead of using a fixed embedding for the word. ELMo uses a bidirectional LSTM trained on a
language modeling task to create those embeddings. Such a model is trained on a massive
dataset in the language of our dataset, and we can then use it as a component in other
architectures that are required to perform specific language tasks.

We have ten categories in our chatbot from which the user can ask questions. The categories
are course registration/FFCS, hostels, examinations/ assessments, Pass/Fail criteria,
facilities, credits, semester, summer/weekend/intersession semesters, curriculum and events.
For each of these categories, we have collected data from the official academic regulations
file from VTOP and VIT’s official website: https://vit.ac.in/ . This information has been
obtained from these resources and adjusted to the application requirements, without any
alteration in the meaning. The BERT model has already been trained on the SQuAD 2.0 dataset.
SQuAD 2.0 (Stanford Question Answering Dataset) is a reading comprehension dataset that
consists of over 100,000 examples of question-context-answer triplets.
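
To make the idea of a question-context-answer triplet concrete, the sketch below models one example as a small Python record and checks that the answer text really occurs in the context at the recorded character offset. The class and field names are illustrative assumptions and are not the official SQuAD JSON schema; the sentence and question are taken from the SQuAD sample shown in Section 5. An example with no answer (SQuAD 2.0 includes unanswerable questions) is represented here by leaving both answer fields as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QAExample:
    """One question-context-answer triplet (field names are illustrative)."""
    question: str
    context: str
    answer_text: Optional[str] = None   # None marks an unanswerable question
    answer_start: Optional[int] = None  # character offset of the answer in context

    def is_consistent(self) -> bool:
        """Check that the answer span matches the context at answer_start."""
        if self.answer_text is None:
            return self.answer_start is None
        if self.answer_start is None:
            return False
        end = self.answer_start + len(self.answer_text)
        return self.context[self.answer_start:end] == self.answer_text


context = "It has also allowed for the rapid spread of technologies and ideas."
answer = "technologies and ideas"
example = QAExample(
    question="Imperialism is responsible for the rapid spread of what?",
    context=context,
    answer_text=answer,
    answer_start=context.index(answer),
)
print(example.is_consistent())
```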

Our model works to find the answer using a technology called Machine Reading
Comprehension. Machine reading comprehension (MRC) is a technology that scans
documents and extracts meaning from the text, just like a human reader. MRC uses neural
networks to model complex interactions between the context and the query, and to infer
answers from different parts of the document.

4.2. Requirement Analysis

4.2.1. Functional Requirements

4.2.1.1. Product Perspective


Our chatbot will be hosted on a website that will be integrated with
VTOP. The UI of the website is clean to give the student hassle free
interaction with the chatbot. The website has the same color scheme as
VTOP with a simple chat interface.

4.2.1.2. Product Features


Some of the features of the chatbot are:
● Prompt response by chatbot
● No limit to number of questions to ask
● Clean and simple UI

4.2.1.3. User Characteristics


The user of this chatbot can be any student from VIT. Although any
student can use this chatbot, this chatbot is specially geared towards the
freshers who want to clear their doubts. The chatbot can save them
time by providing accurate answers without requiring them to manually
search through documents or websites.

4.2.1.4. Assumption & Dependencies


This chatbot assumes the student has an overview of various academic
systems and terminologies, hostels and other facilities in VIT. This
chatbot is hosted on a website, so it requires an internet connection to be
used. The chatbot also depends on users framing clear, well-formed
queries in order to give good responses.

4.2.1.5. Domain Requirements


The chatbot applies Natural Language Processing (NLP) concepts on a
BERT based model, which is a pre-trained large language model
(LLM) that is already trained on SQuAD (Stanford Question
Answering Dataset). The chatbot requires knowledge of these
domains. Natural Language Processing (NLP) is a subfield of Artificial
Intelligence (AI) that focuses on enabling machines to understand and
interpret human language. BERT (Bidirectional Encoder
Representations from Transformers) is a pre-trained transformer-based
neural network architecture that can be fine-tuned for various NLP
tasks such as question-answering. SQuAD (Stanford Question
Answering Dataset) is a reading comprehension dataset made up of
questions posed by crowd workers on a collection of Wikipedia
articles, with the response to each question being a text segment, or
span, from the relevant reading passage.

4.2.1.6. User Requirements


The chatbot’s ability to provide accurate answers depends on the
quality of the questions it receives from users. In other words, if users
provide clear and specific prompts, the chatbot will be able to provide
more efficient and accurate responses.

4.2.2. Non Functional Requirements


4.2.2.1. Product Requirements
4.2.2.1.1. Efficiency (in terms of Time and Space)
Time complexity = O(N * L^2)
= O(N * 24^2)
= O(N * 576)
where L is the number of layers in the model and N is the
number of tokens.
Space complexity = O(V * E)
= O(30,522 * 768)
= O(23,440,896)
where V is the vocabulary size and E is the embedding size.
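
The two products in this estimate can be checked by evaluating them directly. The short sketch below uses only the values of L, V and E stated above; the memory figure additionally assumes 4 bytes per entry (32-bit floats), which is an assumption for illustration and not a number taken from the report.

```python
# Values as stated in the efficiency estimate.
num_layers = 24        # L
vocab_size = 30_522    # V
embedding_size = 768   # E

layer_factor = num_layers ** 2                  # the constant multiplying N
embedding_entries = vocab_size * embedding_size  # entries in the V x E table

# Assumption: each entry is stored as a 32-bit float (4 bytes).
embedding_mib = embedding_entries * 4 / (1024 ** 2)

print(f"L^2   = {layer_factor}")
print(f"V * E = {embedding_entries:,}")
print(f"V x E table at 4 bytes per entry: about {embedding_mib:.1f} MiB")
```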
4.2.2.1.2. Reliability
The chatbot gives prompt answers from data collected
from official VIT sources. However, the chatbot can
give wrong or incomplete answers when questions are poorly
framed or when it fetches the wrong answer.
We’ve included a popup to guide users on framing
proper questions to the chatbot.
4.2.2.1.3. Portability
The chatbot is hosted on a website that will be
integrated with VTOP. It can be accessed on any device
with an active internet connection.
4.2.2.1.4. Usability
The UI of the website is clean to give the student
hassle free interaction with the chatbot.

4.2.2.2. Organizational Requirements


4.2.2.2.1. Implementation Requirements
The implementation requirements for deploying this
chatbot are:
- Hosting: We will need a web server to host the Flask
application and make it accessible to the public. We can
use a cloud service provider such as Heroku, AWS,
Google Cloud, etc. or VIT’s own server. We’ll also
need a domain name and SSL certificate for our
website.
- Scaling: We’ll need to ensure that our Flask
application can handle multiple concurrent requests
from thousands of users. For this, we can use tools such
as Gunicorn, Nginx, a load balancer, etc. to improve the
performance and scalability of our application.
- Security: We’ll need to protect our Flask app and
chatbot from unauthorized access, malicious attacks,
data breaches, etc. This can be handled using
techniques such as encryption, authentication,
authorization, logging, firewalls, etc. to secure the
website.
- Maintenance: The chatbot and website need to be
monitored and regularly updated to fix bugs, improve
features, enhance user experience, etc. We can use tools
such as Git, GitHub, CI/CD pipelines, etc. to manage
the development and deployment of our website.

4.2.2.2.2. Engineering Standard Requirement


The engineering standard requirements for this chatbot
are:
- Functionality: The chatbot performs its intended tasks
correctly and efficiently. It understands the user's input,
provides relevant and accurate responses, handles errors
and exceptions, and integrates with other systems or
services as needed.
- Usability: The chatbot is easy to use and interact with.
It has a clear and consistent user interface, provides
feedback and guidance, supports multiple languages
and channels, and adapts to the user's preferences and
context.
- Reliability: The chatbot operates without failures or
interruptions. It has high availability, performance, and
security. It recovers from unexpected situations and
maintains data integrity.
- Maintainability: The chatbot is easy to modify and
update. It has a modular and scalable architecture,
follows coding standards and best practices, uses
version control, and supports testing and debugging.

4.2.2.3. Operational Requirements


● Economic
The server used to host the website will incur some cost when
integrated with VTOP.
● Environmental
The website being accessed online poses no harm to the environment.
● Social
The website will have a positive social impact by helping VIT
students to clear their queries in an easy way. Social awareness about
the website can be helpful to students.
● Political
The chatbot does not exhibit any political bias.
● Ethical
The chatbot complies with standard rules and regulations and does not
provide any illegal information. The answers are provided in a
completely ethical manner.
● Health and Safety
The chatbot does not compromise on any student’s health and safety.
The chatbot is very safe for use.
● Sustainability
The server hosting the chatbot can be run on green and renewable
energy sources to adopt energy saving features. The website also does
not have any adverse effects on the environment.
● Legality
All data accessed by the chatbot is freely available to any VIT student, so
it is all legally sourced. The website also does not store any user data.
● Inspectability
The website does not record previous questions asked by a user or any
such data related to the user. The website has some debugging checks
to handle errors or bad responses. The website does not have any
separate testing environment.

4.2.3. System Requirements

4.2.3.1. H/W Requirements


A computer that meets the requirements for running machine learning
models both online and offline.

4.2.3.2. S/W Requirements


Here are the domains and technologies used in the project:
1. Natural Language Processing
2. Deep Learning
3. Chatbots

The project was developed in the Python programming language. Popular
machine learning frameworks such as torch and pytorch_transformers were
used to build the neural networks that support the aforementioned domains
and technologies.

5. Results and Discussion


An example of how a typical BERT-based QA system, pre-trained on the SQuAD dataset,
works:
Context: Imperialism is a type of advocacy of empire. Its name originated from the Latin
word "imperium", which means to rule over large territories. Imperialism is "a policy of
extending a country's power and influence through colonization, use of military force, or
other means". Imperialism has greatly shaped the contemporary world. It has also allowed
for the rapid spread of technologies and ideas. The term imperialism has been applied to
Western (and Japanese) political and economic dominance especially in Asia and Africa in
the 19th and 20th centuries. Its precise meaning continues to be debated by scholars. Some
writers, such as Edward Said, use the term more broadly to describe any system of
domination and subordination organized with an imperial center and a periphery.
Question: Imperialism is responsible for the rapid spread of what?
Ground Truth Answers: technologies and ideas; technologies and ideas; technologies and
ideas; technologies and ideas; technologies and ideas.
Prediction: technologies and ideas
The ground truth answers are the expected answers from the context. The prediction is the
answer predicted by the chatbot.
The question-answer pairs on which our model was tested with our own dataset are listed by
category below.
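The way an extractive QA model turns its output into a predicted span, as in the SQuAD example above, can be illustrated with a minimal Python sketch. It assumes the model emits one start score and one end score per context token and that the answer is the span whose start and end scores have the highest sum. The function names and the score values are hypothetical and are not taken from the project code.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # The end token may not precede the start token or exceed the length limit.
        for e in range(s, min(s + max_answer_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


def extract_answer(tokens: List[str], start_scores: List[float],
                   end_scores: List[float]) -> str:
    """Join the tokens of the best-scoring span into an answer string."""
    s, e = best_span(start_scores, end_scores)
    return " ".join(tokens[s:e + 1])


# Toy input: one sentence of the context, with made-up scores (not model output).
tokens = "It has also allowed for the rapid spread of technologies and ideas".split()
start_scores = [0.1, 0.0, 0.0, 0.2, 0.0, 0.0, 0.3, 0.4, 0.1, 5.0, 0.2, 0.5]
end_scores = [0.0, 0.1, 0.0, 0.0, 0.1, 0.0, 0.2, 0.3, 0.0, 0.8, 0.1, 4.5]
answer = extract_answer(tokens, start_scores, end_scores)
print(answer)
```

With these illustrative scores the selected span covers the last three tokens of the sentence. A real system works on sub-word tokens and maps the span back to the original text, which this sketch leaves out.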
Our chatbot has been developed as a web application built on HTML and CSS. On accessing
the chatbot, the students are first shown a few instructions and a set of examples on how the
queries must be asked to obtain an accurate answer.
After reading the instructions, they are given a list of categories they can choose from to ask
questions. On choosing the category number they can ask questions related to that chosen
category. The chatbot replies with the most accurate answer.
After answering, the user is given a continuation prompt with three choices. They can ask
another question from the same category, or change the category or end the chat. Based on
the user’s choice, the chatbot repeats a certain set of functions unless the user wants to end
the chat.
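The interaction described above (choose a category, ask a question, then pick a continuation option) behaves like a small state machine. The following Python sketch is one way to model it under stated assumptions: the state names, the `step` function and the `fake_answer` stand-in are hypothetical, and the project itself implements this flow in the browser script listed in Appendix A.

```python
from typing import Callable, Tuple

CATEGORY, QUESTION, CONTINUE, ENDED = "category", "question", "continue", "ended"


def step(state: str, category: int, text: str, n_categories: int,
         answer_fn: Callable[[int, str], str]) -> Tuple[str, int, str]:
    """Process one user message and return (new_state, category, bot_reply)."""
    text = text.strip()
    if state == CATEGORY:
        # Accept only a whole number within the list of categories.
        if text.isdecimal() and 1 <= int(text) <= n_categories:
            return QUESTION, int(text), "Enter your query"
        return CATEGORY, category, "Invalid Input, please enter again"
    if state == QUESTION:
        # Any text is treated as a question for the chosen category.
        return CONTINUE, category, answer_fn(category, text)
    if state == CONTINUE:
        if text == "1":   # another question from the same category
            return QUESTION, category, "Enter your query"
        if text == "2":   # change the category
            return CATEGORY, 0, "Enter Category number for your query."
        if text == "3":   # end the chat
            return ENDED, 0, "Thank you for using chatbot"
        return CONTINUE, category, "Invalid Input, please enter again"
    return ENDED, 0, ""


def fake_answer(category: int, question: str) -> str:
    """Stand-in for the model call; returns a placeholder string."""
    return f"[answer for category {category}]"


state, cat = CATEGORY, 0
replies = []
for msg in ["99", "7", "What is the maximum credits allowed?", "1",
            "Another question?", "3"]:
    state, cat, reply = step(state, cat, msg, 13, fake_answer)
    replies.append(reply)
print(replies)
```

Keeping the flow in a pure function like this makes each transition easy to test without a browser or a loaded model.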

a. Events -
Q: What is the purpose of Riviera?
A: To give our students a platform to showcase and shape their talent in various
technical and managerial areas.

b. Hostels -
Q: Can a student be permitted to stay as a day boarder outside the hostel?
A: No student will be permitted to stay as Day Boarder outside the campus unless a
parent stays with them.

c. Facilities -
Q: When are doctors available?
A: full time and part-time doctors are employed to render 24-hour service

d. Semesters -
Q: When are fall and winter semesters?
A: the fall semester will be from July to November and winter semester from
December to April.

e. Credits -
Q: What is the maximum and minimum credits allowed to register?
A: Students can register for a maximum of 27 credits or a minimum of 16 credits in a
regular semester

f. Summer/Weekend/Intersession Semester -
Q: How many credits are allowed in the summer semester?
A: between 6 and 8

g. Curriculum -
Q: What is the use of program elective courses?
A: courses give an opportunity for students to satisfy their aspirations in other
disciplines

h. Examinations/Assessments -
Q: How is a student's performance evaluated?
A: sequence of continuous assessment tests, assignments that include quizzes,
seminars, group discussions, class/take-home tasks, and Final Assessment Test

i. Pass/Fail criteria -
Q: If a student has cleared theory component can he register next level course?
A: if the student has cleared the theory component, he/she will be permitted to
register the next level course

j. FFCS/Course Registration -
Q: Can a student improve their grades?
A: who wish to improve their grades will be permitted to register the same course
again during a subsequent Course Registration process.
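Question-answer pairs such as those above could be scored automatically when reference answers are available, in the same way the SQuAD example earlier in this section compares a prediction with its ground truth answers. The sketch below assumes the commonly used SQuAD-style exact-match and token-level F1 measures. The function names are illustrative, and no scores for the project's own dataset are implied.

```python
import re
import string
from collections import Counter
from typing import Callable, List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1(prediction: str, truth: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred, gold = normalize(prediction).split(), normalize(truth).split()
    if not pred or not gold:
        # Simplification: empty answers match only each other.
        return float(pred == gold)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def best_over_truths(metric: Callable[[str, str], float],
                     prediction: str, truths: List[str]) -> float:
    """Score against every reference answer and keep the best value."""
    return max(metric(prediction, t) for t in truths)


# Values from the SQuAD example in this section.
truths = ["technologies and ideas"] * 5
em_score = best_over_truths(exact_match, "technologies and ideas", truths)
f1_score = best_over_truths(f1, "technologies and ideas", truths)
partial = f1("the rapid spread of technologies", "technologies and ideas")
print(em_score, f1_score, round(partial, 3))
```

Exact match rewards only identical answers, while F1 gives partial credit when the predicted span overlaps the reference, which suits answers that are extracted from a longer context.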

Fig 3.1 Instructions

Fig 3.2 Sample questions

Fig 3.3 Categories

Fig 3.4 Question Answering

Fig 3.5 Chat continuation prompt

Fig 3.6 Ending the chat

6. References
● Rao, G. M., Tripurari, V. S., Ayila, E., Kummam, R., & Peetala, D. S. (2022,
February). Smart-Bot Assistant for College Information System. In 2022 Second
International Conference on Artificial Intelligence and Smart Energy (ICAIS) (pp.
693-697). IEEE.
● Koundinya, H., Palakurthi, A. K., Putnala, V., & Kumar, A. (2020, July). Smart
College Chatbot using ML and Python. In 2020 International Conference on System,
Computation, Automation and Networking (ICSCAN) (pp. 1-5). IEEE.
● Shivam, K., Saud, K., Sharma, M., Vashishth, S., & Patil, S. (2018). Chatbot for
college website. Int J Comp Technol, 5(6), 74-77.
● Vikas, G. S. S., Kumar, I. D., Shareef, S. A., Roy, B. R., & Geetha, G. (2021,
October). Information Chatbot for College Management System Using Multinomial
Naive Bayes. In 2021 2nd International Conference on Smart Electronics and
Communication (ICOSEC) (pp. 1149-1153). IEEE.
● Rawat, B., Bist, A. S., Rahardja, U., Aini, Q., & Sanjaya, Y. P. A. (2022, September).
Recent Deep Learning Based NLP Techniques for Chatbot Development: An
Exhaustive Survey. In 2022 10th International Conference on Cyber and IT Service
Management (CITSM) (pp. 1-4). IEEE.
● Adamopoulou, E., & Moussiades, L. (2020, June). An overview of chatbot
technology. In IFIP International Conference on Artificial Intelligence Applications
and Innovations (pp. 373-383). Springer, Cham.
● Singh, S., & Thakur, H. K. (2020, June). Survey of various AI Chatbots based on
technology used. In 2020 8th International Conference on Reliability, Infocom
Technologies and Optimization (Trends and Future Directions) (ICRITO)
(pp. 1074-1079). IEEE.
● Maroengsit, W., Piyakulpinyo, T., Phonyiam, K., Pongnumkul, S., Chaovalit, P., &
Theeramunkong, T. (2019, March). A survey on evaluation methods for chatbots. In
Proceedings of the 2019 7th International conference on information and education
technology (pp. 111-119).
● Sreelakshmi, A. S., Abhinaya, S. B., Nair, A., & Nirmala, S. J. (2019, November). A
question answering and quiz generation chatbot for education. In 2019 Grace Hopper
Celebration India (GHCI) (pp. 1-6). IEEE.
● Hussain, S., Ameri Sianaki, O., & Ababneh, N. (2019, March). A survey on
conversational agents/chatbots classification and design techniques. In Workshops of
the International Conference on Advanced Information Networking and Applications
(pp. 946-956). Springer, Cham.
● Lalwani, T., Bhalotia, S., Pal, A., Rathod, V., & Bisen, S. (2018). Implementation of
a Chatbot System using AI and NLP. International Journal of Innovative Research in
Computer Science & Technology (IJIRCST) Volume-6, Issue-3.
● Rahman, A. M., Al Mamun, A., & Islam, A. (2017, December). Programming
challenges of chatbot: Current and future prospective. In 2017 IEEE Region 10
Humanitarian Technology Conference (R10-HTC) (pp. 75-78). IEEE.
● Caldarini, G., Jaf, S., & McGarry, K. (2022). A literature survey of recent advances
in chatbots. Information, 13(1), 41.
● Jimoh, K. O., Adebayo, O. Y., Akinfenwa, T. O., & Abimbola, I. B.
DEVELOPMENT OF A CLOUD BASED STUDENT INFORMATION CHATBOT
SYSTEM.
● Nikhath, A. K., Rab, M. A., Bharadwaja, N. V., Reddy, L. G., Saicharan, K., &
Reddy, C. V. M. (2022, January). An Intelligent College Enquiry Bot using NLP and
Deep Learning based techniques. In 2022 International Conference for Advancement
in Technology (ICONAT) (pp. 1-6). IEEE.
● Kim, J., Chung, S., Moon, S., & Chi, S. (2022, December). Feasibility Study of a
BERT-based Question Answering Chatbot for Information Retrieval from Construction
Specifications.
● Amer, E., Hazem, A., Farouk, O., Louca, A., Mohamed, Y., & Ashraf, M. (2021, May).
A proposed chatbot framework for COVID-19. In 2021 International Mobile, Intelligent,
and Ubiquitous Computing Conference (MIUCC) (pp. 263-268). IEEE.
● Zhang, Y., & Xu, Z. (2019). BERT for question answering on SQuAD 2.0. Stanford
University Report
● Ramnath, S., Nema, P., Sahni, D., & Khapra, M. M. (2020). Towards interpreting
BERT for reading comprehension based QA. arXiv preprint arXiv:2010.08983.

APPENDIX A

File: index.html

<script src='https://use.fontawesome.com/releases/v5.0.13/js/all.js'></script>
<script>

// Instructions modal: shown on page load, reopened with #ins, closed with #closeins.
var insmodal = get('#insmodal')

window.addEventListener('load', function () {
  insmodal.style.display = 'block';
});
get('#ins').addEventListener('click', function () {
  insmodal.style.display = 'block';
})
get('#closeins').addEventListener('click', function () {
  insmodal.style.display = 'none';
})

// Questions modal: opened with #qns, closed with #closeqns.
var qnsmodal = get('#qnsmodal')

get('#qns').addEventListener('click', function () {
  qnsmodal.style.display = 'block';
})
get('#closeqns').addEventListener('click', function () {
  qnsmodal.style.display = 'none';
})

var csrf_token = "{{ csrf_token() }}";

const msgerForm = get(".msger-inputarea");
const msgerInput = get(".msger-input");
const msgerChat = get(".msger-chat");
window.addEventListener('load', function () {
  msgerInput.value = ''
});

// Follow-up prompt shown after an answer, keyed by category name.
const prompts = {
  'Events/Clubs and Chapters': 'Do you have any more questions regarding riviera, gravitas, pro shows or clubs/chapter?',
  'Hostel Counselling': 'Do you have any more questions regarding counselling?',
  'Paid Facilities': 'Do you have any more questions regarding the timings or prices of AC gyms, swimming, karate, snooker, squash or synthetic tennis?',
  'Free Facilities': 'Do you have any more questions regarding unpaid/free facilities available?',
  'Chettinad Hospital': 'Do you want to know about the location or services provided by the hospital?',
  'Semesters': 'Do you have any more questions regarding semesters?',
  'Credit System': 'Do you want to know more about the credits?',
  'Summer/Weekend/Intersession semester': 'Do you have any questions regarding intersession, summer or weekend semesters?',
  'Curriculum': 'Do you have any more questions regarding categories in curriculum or LTPJC components?',
  'Examinations/Assessments': 'Do you have any more questions regarding cats, fats, internals, marks calculation or re-evaluation?',
  'Pass/Fail Criteria': 'Do you want to know more about passing criteria, F grade, N grade, W grade, failing UE/PE courses?',
  'FFCS/Course Registration': 'Do you have any more questions regarding grade improvement course, add/drop or course withdrawal?',
  'Attendance/Leave': 'Do you have any more questions about attendance, on duty or leave?'
}
// Category names in menu order; the menu number minus one indexes this array.
const categories = ['Events/Clubs and Chapters', 'Hostel Counselling', 'Paid Facilities',
  'Free Facilities', 'Chettinad Hospital', 'Semesters', 'Credit System',
  'Summer/Weekend/Intersession semester', 'Curriculum', 'Examinations/Assessments',
  'Pass/Fail Criteria', 'FFCS/Course Registration', 'Attendance/Leave']

// Icons made by Freepik from www.flaticon.com
// const BOT_IMG = "https://image.flaticon.com/icons/svg/327/327779.svg";
// const PERSON_IMG = "https://image.flaticon.com/icons/svg/145/145867.svg";
const BOT_IMG = "./../static/styles/img/bot-svgrepo-com.svg"
const PERSON_IMG = "./../static/styles/img/student-fill-svgrepo-com.svg"
const BOT_NAME = "ChatBot";
const PERSON_NAME = "You";
var texts = new Array();
msgerForm.addEventListener("submit", event => {
  event.preventDefault();
  const msgText = msgerInput.value;
  if (!msgText) return;

  // texts holds the current turn: [category number, question, continuation choice].
  texts.push(msgText);
  console.log(texts);
  appendMessage(PERSON_NAME, PERSON_IMG, "right", msgText);

  if (texts.length == 1) {
    // Step 1: accept only a whole category number within the menu range.
    const num = parseInt(msgText, 10);
    if (/^\s*\d+\s*$/.test(msgText) && num >= 1 && num <= categories.length) {
      appendMessage("Chatbot", BOT_IMG, "left", "Enter your query");
    }
    else {
      appendMessage("Chatbot", BOT_IMG, "left", "Invalid Input, please enter again")
      texts.pop()
    }
  }
  else if (texts.length == 2) {
    // Step 2: send the zero-based category index and the question to the server.
    $.ajax({
      url: '/',
      type: 'POST',
      headers: { 'X-CSRFToken': csrf_token },
      contentType: 'application/json',
      data: JSON.stringify({ choice: texts[0] - 1, question: texts[1] }),
      success: function (response) {
        console.log(response.result.answer);
        // Show the answer first, then the continuation menu for the chosen category.
        appendMessage("Chatbot", BOT_IMG, "left", response.result.answer)
        appendMessage("Chatbot", BOT_IMG, "left", `1. ${prompts[categories[parseInt(texts[0]) - 1]]} <br> 2. Ask from another category <br> 3. End the chat`)
      }
    });
  }
  else if (texts.length == 3) {
    // Step 3: handle the continuation choice.
    if (msgText == "1") {
      // Same category: keep only the category number and ask for a new query.
      appendMessage("Chatbot", BOT_IMG, "left", "Enter your query");
      const choice = texts[0]
      texts = []
      texts.push(choice);
    }
    else if (msgText == "2") {
      // New category: clear the turn and show the category menu again.
      texts = []
      const cat = "The categories are :- <br> 1. Events/Clubs and Chapters <br> 2. Hostel Counselling <br> 3. Paid Facilities <br> 4. Free Facilities <br> 5. Chettinad Hospital <br> 6. Semesters <br> 7. Credit System <br> 8. Summer/Weekend/Intersession semester <br> 9. Curriculum <br> 10. Examinations/Assessments <br> 11. Pass/Fail Criteria <br> 12. FFCS/Course Registration <br> 13. Attendance/Leave <br> Enter Category number for your query."
      appendMessage("Chatbot", BOT_IMG, "left", cat);
    }
    else if (msgText == "3") {
      appendMessage("Chatbot", BOT_IMG, "left", "Thank you for using chatbot");
      texts = []
    }
    else {
      appendMessage("Chatbot", BOT_IMG, "left", "Invalid Input, please enter again");
      texts.pop()
    }
  }
  msgerInput.value = "";
});

// Render one chat bubble and scroll the chat window down.
function appendMessage(name, img, side, text) {
  const msgHTML = `
    <div class="msg ${side}-msg">
      <div class="msg-img" style="background-image: url(${img})"></div>

      <div class="msg-bubble">
        <div class="msg-info">
          <div class="msg-info-name">${name}</div>
          <div class="msg-info-time">${formatDate(new Date())}</div>
        </div>

        <div class="msg-text">${text}</div>
      </div>
    </div>
  `;

  msgerChat.insertAdjacentHTML("beforeend", msgHTML);
  msgerChat.scrollTop += 500;
}

// Utils
function get(selector, root = document) {
  return root.querySelector(selector);
}

// Format a Date as zero-padded HH:MM.
function formatDate(date) {
  const h = "0" + date.getHours();
  const m = "0" + date.getMinutes();

  return `${h.slice(-2)}:${m.slice(-2)}`;
}

</script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js"
  integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM"
  crossorigin="anonymous"></script>
</body>

</html>

File: context.py

facilities_paid = 'The swimming facility costs Rs 4720 and its opening time is 6:30 am- 8:30
am and 4:30 pm-7:30 pm. Indoor AC gym and Trendset AC gym costs Rs 4720 and Fitty AC
gym costs Rs 5900. Its opening time is 6 am - 9 am and 3 pm - 8 pm. The squash facility costs
Rs 3836 and its opening time is 6 am - 9 am and 3 pm - 8 pm. The karate facility costs Rs 2950
and its opening time is every Tuesday, Thursday and Friday from 6.30 pm to 7.30 pm. The
synthetic tennis facility costs Rs 3836 and its opening time is 6 am - 9 am and 3 pm - 8 pm.
The snooker facility costs Rs 3836 and its opening time is 2 pm - 8 pm. The shuttle service
costs Rs 15 from one point to any point within campus.'
passing = "A student receives a N grade if they have not passed any of the course's individual
components. Courses having F grade are considered as backlog. A course with 'F' grade has
to be re-registered in the subsequent semesters or special terms to clear the backlog. Students
with a F grade are eligible to register the next level course. With N grade, if the student has
cleared the theory component, he/she will be permitted to register the next level course
(pre-requisite is met). Both F and N have to be cleared by re-registering the same course in the
subsequent semester/inter semester/ summer semester / subsequent intra-semester. If the
student has cleared in one or more components but got N grade in that course, then the cleared
components are exempted from re-registration. Re-registration fee will be as per the university
norms existing at the time of reregistration (whole or a component of a course). Such a
re-registration fee is also applicable for project-only courses. When a course is re-registered
wholly, all earlier course evaluation marks shall be treated as cancelled/ reset. If a student fails
in a course due to lack of marks in the lab/project component of an embedded course, the
student has to re-register the lab/project component alone. Courses having ‘W’ grade (course
withdraw) will not be considered as backlog. Students who are debarred from FAT will be
given a N grade, for that course, in the grade sheet. If a student has secured a minimum of 40%
marks in the theory FAT alone has secured a minimum of 50% marks out of the total marks
awarded to the laboratory and/or project components and has secured a minimum of 50%
marks out of the grand total marks awarded to the course then he/she is declared passed. If a
student receives a F grade in a program elective(PE) course, and if the student wishes, he/she
is permitted to take another program elective(PE) course from the same basket, in lieu of
program elective(PE) course, the student had failed to clear, in subsequent semesters and clear
the new program elective(PE) course. If a student receives an ‘F’ grade in a university
elective(UE) course, and if the student wishes, he/she is permitted to take another university
elective(UE) course instead of the university elective(UE) course the student had failed to
qualify, in subsequent semesters and clear the new course. The student is permitted to choose
a program elective(PE) course from his/her curriculum instead of a failed university
elective(UE) course. An 'F' grade is given for a course if all of the course's individual
components are passed but the overall grade is below the passing range."

File : api.py
from flask import Flask, request, jsonify, render_template
from flask_cors import CORS
from flask_wtf.csrf import CSRFProtect

from bert import QA
import context

app = Flask(__name__)
app.config['SECRET_KEY'] = 'any secret string'
# Enable CSRF checks; index.html sends the token in the X-CSRFToken header.
csrf = CSRFProtect(app)
CORS(app)

# Load the BERT question-answering model (QA is defined in bert.py).
model = QA("model")


@app.route('/', methods=['POST'])
def send_request():
    # The page posts JSON with a zero-based category index and the question.
    choice = request.json["choice"]
    q = request.json["question"]
    if not q.endswith('?'):
        q += '?'
    print({'choice': choice, 'question': q})
    # Answer the question from the context text of the chosen category.
    answer = model.predict(context.contexts[int(choice)], q)
    print(answer)
    # A lone '.' is treated as "no answer found".
    if answer['answer'] == '.':
        answer['answer'] = "Sorry, answer not found. Please rephrase the question and make sure you have followed the instructions."

    return jsonify({"result": answer})


@app.route("/")
def home():
    return render_template('index.html')


if __name__ == "__main__":
    app.run('0.0.0.0', port=8000)
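
The request handling in api.py can also be written as a pure function that is testable without Flask or a loaded model. The sketch below is only an illustration under stated assumptions: `answer_query`, `stub_predict` and the placeholder context strings are hypothetical names, and the range check on `choice` and the handling of an empty answer are additions that the original listing does not contain.

```python
from typing import Callable, Dict, List

FALLBACK = ("Sorry, answer not found. Please rephrase the question and make "
            "sure you have followed the instructions.")


def answer_query(choice: int, question: str, contexts: List[str],
                 predict: Callable[[str, str], Dict[str, str]]) -> Dict[str, str]:
    """Pick the context for a category, normalize the question and call the model."""
    # Assumed extra check: reject category indices outside the context list.
    if not 0 <= choice < len(contexts):
        raise ValueError(f"choice must be between 0 and {len(contexts) - 1}")
    question = question.strip()
    if not question.endswith("?"):
        question += "?"
    result = dict(predict(contexts[choice], question))
    # A lone '.' (or, as an assumption, an empty answer) means nothing was found.
    if result.get("answer", "").strip() in ("", "."):
        result["answer"] = FALLBACK
    return result


def stub_predict(context_text: str, question: str) -> Dict[str, str]:
    """Stand-in for model.predict: answers only questions that mention credits."""
    found = "credits" in question.lower()
    return {"answer": "between 6 and 8" if found else "."}


# Placeholder contexts; the real ones are the long strings in context.py.
contexts = ["(events context)", "(summer semester context)"]
found = answer_query(1, "How many credits are allowed in the summer semester",
                     contexts, stub_predict)
missing = answer_query(0, "Who organises the event?", contexts, stub_predict)
print(found["answer"])
print(missing["answer"])
```

Passing the predictor in as an argument lets the same function run with the real model in the web application and with a stub in unit tests.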
