
AIRPRED: An Airline Ticket Price Prediction Web App

Submitted in partial fulfillment of the requirements for the degree of

Bachelor of Technology

in

Computer Science and Engineering

by
Satkar Acharya
19BCE2521

Under the guidance of

Dr. Narayanan Prasanth N

School of Computer Science & Engineering (SCOPE)

VIT, Vellore

May, 2023
DECLARATION

I hereby declare that the thesis entitled “AIRPRED: An Airline Ticket Price Prediction
Web App” submitted by me, for the award of the degree of
Bachelor of Technology in Computer Science and Engineering to VIT is a record of
bonafide work carried out by me under the supervision of Dr. Narayanan Prasanth N.
I further declare that the work reported in this thesis has not been submitted and
will not be submitted, either in part or in full, for the award of any other degree or
diploma in this institute or any other institute or university.

Place: Vellore

Date: 20th May 2023


Signature of the Candidate
CERTIFICATE

This is to certify that the thesis entitled “AIRPRED: An Airline Ticket Price Prediction
Web App” submitted by Satkar Acharya (19BCE2521), School of Computer Science and
Engineering, VIT, for the award of the degree of Bachelor of Technology in Computer
Science and Engineering, is a record of bonafide work carried out by him under my
supervision during the period 01.07.2022 to 30.04.2023, as per the VIT code of academic
and research ethics.

The contents of this report have not been submitted and will not be submitted either
in part or in full, for the award of any other degree or diploma in this institute or any other
institute or university. The thesis fulfills the requirements and regulations of the University
and in my opinion meets the necessary standards for submission.

Place: Vellore
Date: 20th May 2023
Signature of the Guide

Internal Examiner External Examiner

Dr. Vairamuthu S
(Head of Department)
School of Computer Science and Engineering
ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to VIT University for providing me with the
opportunity to work on this capstone project. The resources and support provided by the
university have been instrumental in the successful completion of this project.
I would also like to extend my heartfelt thanks to my project guide, Dr.
Narayanan Prasanth N, who has been instrumental in guiding me through the
project. His expertise and insights have been invaluable in shaping the direction of
this project. I would also like to thank the Chancellor of VIT University, Dr. G.
Viswanathan. His vision and leadership have been a constant source of inspiration for
me.
Finally, I would like to thank all the faculty members and staff of VIT University for their
support and encouragement. Their contributions have been essential to the success of this
project, and I am grateful for their assistance.

Satkar Acharya
(19BCE2521)
Executive Summary
This project is a web application designed to help airline travellers save money by
providing them with a predicted price for the date on which they want to travel. The web
app uses machine learning algorithms to train a model that predicts the prices. Three
algorithms are compared on train and test data, and the algorithm with the best score
(the Random Forest algorithm) is used.

The dataset, obtained from Kaggle and containing airline data from 2019, served as both
training and evaluation data and was used to fine-tune the ML model for more accurate
predictions.

The web app uses user input to provide information specific to the user's needs. For
example, the user selects the airline, the date and time, and the source and destination,
and the price is predicted from the given data.
CONTENTS

Acknowledgement
Executive Summary
Table of Contents
List of Figures
List of Tables
Abbreviations
Symbols and Notations

1. Introduction
1.1. Theoretical Background
1.2. Motivation
1.3. Aim of the Proposed Work
1.4. Objective(s) of the Proposed Work
2. Literature Survey
2.1. Survey of the Existing Models/Work
2.2. Summary/Gaps identified in the Survey
3. Overview of the Proposed System
3.1. Introduction and Related Concepts
3.2. Framework, Architecture or Module for the Proposed System
3.3. Proposed System Model
4. Proposed System Analysis and Design
4.1. Introduction
4.2. Requirement Analysis
4.2.1. Functional Requirements
4.2.1.1. Product Perspective
4.2.1.2. Product features
4.2.1.3. User characteristics
4.2.1.4. Assumption & Dependencies
4.2.1.5. Domain Requirements
4.2.1.6. User Requirements
4.2.2. Non Functional Requirements
4.2.2.1. Product Requirements
4.2.2.1.1. Efficiency (in terms of Time and Space)
4.2.2.1.2. Reliability
4.2.2.1.3. Portability
4.2.2.1.4. Usability
4.2.2.2. Organizational Requirements
4.2.2.2.1. Implementation Requirements
4.2.2.2.2. Engineering Standard Requirements
4.2.2.3. Operational Requirements
● Economic
● Environmental
● Social
● Political
● Ethical
● Health and Safety
● Sustainability
● Legality
● Inspectability
4.2.3. System Requirements
4.2.3.1. H/W Requirements
4.2.3.2. S/W Requirements
5. Results and Discussion
6. Conclusion and Future Developments
7. References
APPENDIX A
List of Figures

Figure No.  Title

1   System Design of proposed system
2   ER Diagram of proposed system
3   KNN algorithm
4   Decision tree regressor
5   Random Forest
6   Frontend of web application
7   Source options
8   Choosing Destination option
9   Selecting departure date
10  Choosing the Airlines
11  Final Predicted Price


List of Tables

Table No.  Title

1   Literature Survey
2   Gaps and Summary of Papers
3   Time and Space Complexity
List of Abbreviations

ML Machine Learning
RF Random Forest
DT Decision Tree
KNN K-Nearest Neighbours
MAE Mean Absolute Error
MSE Mean Squared Error
RMSE Root Mean Squared Error
R2 R-squared
Symbols and Notations

∑ Summation
1. INTRODUCTION

1.1. Theoretical Background

The standard way of making reservations has been to check ticket costs on booking
websites, reserve as far in advance as possible, and then cancel reservations that fall
on public holidays or festivals. The price of a ticket for the same flight fluctuates
throughout the day on booking websites, and the typical user has no idea when the price
will drop or rise. Machine learning (ML) is now used for prediction in many different
areas of technology, as the idea of building predictive models with ML has grown in
popularity.

1.2. Motivation

The number of travelers is increasing dramatically each year as a result of the tourism
industry's rapid expansion. Within this sector, the airline industry is known for its
complicated pricing schemes. As flight costs have grown more unpredictable, travelers look
for the best bargains while airlines work to maintain high revenue. Technology, especially
machine learning methods, can help lower the uncertainty associated with flight prices.

The need to address the current issues with the system for purchasing airline tickets is what
motivated this project. Customers are currently advised to buy tickets far in advance in order
to avoid expensive fares. However, airlines use sophisticated pricing strategies that modify
prices based on a number of variables, leading to fluctuating ticket costs. Because of this,
customers who plan ahead often pay more than they should have.

1.3. Aim of the Proposed Work

The aim of this project is to create a web application that gives air travellers a one-stop
solution for choosing the date of travel so as to maximize their financial savings. The
objective is to offer a simple and reliable predictive model so that users do not have to
search manually across various dates when they want to book a ticket.

1.4. Objective(s) of the Proposed Work

The first objective of this project is to create a useful website that combines machine
learning with a fully functional web application for predicting the cost of airline tickets.
The second objective is to assess and compare different machine learning algorithms in order
to determine the most reliable and accurate model for price prediction. The dataset will also
be analysed thoroughly and prepared using techniques for data cleaning and encoding. The web
application will give users a simple, easy-to-use interface so that they can compare predicted
prices and pick cheaper ticket options, improving their decision-making. The front-end
elements will communicate with the machine learning model in real time, so that each user
request returns a prediction immediately. The project's final goal is to develop a functional
platform that combines machine learning's capabilities with user-centric design, enabling
travelers to make smart decisions and enhancing their travel experience.

2. Literature Survey

2.1. Survey of Existing Models/Work

| Title | Author and Year | Description |
| --- | --- | --- |
| Airfare prices prediction using machine learning techniques [1] | Authors: K. Tziridis, Th. Kalampokas, G. A. Papakostas, K. I. Diamantaras. Date: 2017. Conference: 25th European Signal Processing Conference | This research paper details a set of features by which a typical flight ticket may be affected. These features are applied to eight ML models and the accuracy of each was taken. The Bagging Regression Tree model showed the best accuracy. |
| Predicting The Price Of A Flight Ticket With The Use Of Machine Learning Algorithms [2] | Authors: Supriya Rajankar, Neha Sakharkar, Omprakash Rajankar. Date: December 2019. Journal: International Journal of Scientific & Technology Research, Volume 8, Issue 12 | In this study, data for a certain air route, comprising features like departure time, arrival time, and airways for a specific period, have been collected to determine the minimum airfare. Machine learning models have been applied. The paper concludes by saying that the results would be more accurate if the availability of seats was also there in the data. |
| Optimal purchase timing in the airline market [3] | Authors: J. Santos Domínguez Menchero, Javier Rivera, Emilio Torres. Journal: Journal of Air Transport Management. Date: August 2014 | In this study, the authors examined airfares for flights from Madrid to London, Frankfurt, New York, and Paris during a two-month period, accounting for ticket purchases up to 30 days in advance. They discovered that the consumer had up to 18 days before departure to buy a ticket without suffering a major financial loss. |
| Flight Fare Prediction System Using Machine Learning [4] | Authors: Neel Bhosale, Pranav Gole, Hrutuja Handore, Priti Lakde, Gajanan Arsalwad. Journal: International Journal for Research. Date: May 2022 | The researchers found that prices of flights are highly sensitive to the date and time of departure and to whether the departure date is a holiday. |
| An airfare prediction model for developing markets [5] | Authors: Viet Hoang Vu, Quang Tran Minh, Phu H. Phung. Conference: 2018 International Conference on Information Networking | The authors of this paper have put forward a new method to assist buyers in forecasting airfare prices without relying on information from airlines. The research shows that this proposed model can predict the changes in airfare prices, even up to the day of departure, by using publicly available airfare data found online. However, it should be noted that this model may not be able to take into account certain important factors, such as the number of unsold seats on flights. |
| A Comparative Study of Supervised Machine Learning Algorithms for Stock Market Trend Prediction [6] | Authors: Kumar I., Dogra K., Utreja C. Conference: 2018 Second International Conference on Inventive Communication and Computational Technologies | The authors of this work examined the stock market's behavior and chose the best model for stock market prediction from a variety of classical machine learning techniques, including Random Forest (RF), Support Vector Machine (SVM), Naive Bayes, and Softmax. |
| Study on Machine Learning Techniques In Financial Markets [7] | Authors: Vats P., Samdani K. Conference: 2019 IEEE International Conference on System, Computation, Automation and Networking | The authors of this paper propose to make use of text mining and machine learning methods to track public interaction on digital trading platforms. |
| Stock Price Prediction Using Machine Learning Techniques [8] | Authors: Sarode S., Tolani HG., Kak P., Life CS. Conference: 2019 International Conference on Intelligent Sustainable Systems | The authors of this paper propose a system that recommends stock purchases to users. The approach used gets the prediction from historical as well as real-time data. |
| Prediction of Student’s Performance Using Random Forest Classifier [9] | Authors: Sourav Kumar Ghosh and Farhatul Janan. Conference: Proceedings of the 11th Annual International Conference on Industrial Engineering and Operations Management, Singapore, March 7-11, 2021 | The paper investigates first-year student performance in a renowned university, using random forests to predict outcomes based on significant factors. Results show 96.88% accuracy in classifying performance levels and suggest potential improvements by including additional factors. |
| Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research [10] | Authors: Lingjun He, Richard A. Levine, Juanjuan Fan, Joshua Beemer, Jeanne Stronach. Journal: Practical Assessment, Research & Evaluation (peer-reviewed electronic journal). Publication Date: January 2018. ISSN: 1531-7714 | This paper recommends using random forest, a machine learning algorithm based on trees, for institutional research. It highlights the benefits of random forests, including their simplicity, low computational expense, high accuracy, and adaptability, and backs this up with simulation experiments and real data analyses. |

Table 1. Literature Survey

2.2. Summary/Gaps Identified in the Survey

| Title | Summary | Gaps Identified |
| --- | --- | --- |
| Airfare prices prediction using machine learning techniques | This research paper details a set of features by which a typical flight ticket may be affected. These features are applied to eight ML models and the accuracy of each was taken. The Bagging Regression Tree model showed the best accuracy. | Limited feature selection methodology, lack of dataset details, absence of comparison with existing models, and a lack of analysis on interpretability and feature importance. |
| Predicting The Price Of A Flight Ticket With The Use Of Machine Learning Algorithms | In this study, data for a certain air route, comprising features like departure time, arrival time, and airways for a specific period, have been collected to determine the minimum airfare. Machine learning models have been applied. The paper concludes by saying that the results would be more accurate if the availability of seats was also there in the data. | While the study considers features like departure time, arrival time, and airways to determine minimum airfare, the paper concludes that incorporating seat availability information would enhance the accuracy of the results. |
| Optimal purchase timing in the airline market | In this study, the authors examined airfares for flights from Madrid to London, Frankfurt, New York, and Paris during a two-month period, accounting for ticket purchases up to 30 days in advance. They discovered that the consumer had up to 18 days before departure to buy a ticket without suffering a major financial loss. | The study does not consider fluctuations in airfares within the 18-day window before departure. While it establishes a timeframe for ticket purchases, it does not account for potential price variations during that period. |
| Flight Fare Prediction System Using Machine Learning | The researchers found that prices of flights are highly sensitive to the date and time of departure and to whether the departure date is a holiday. | The study does not consider factors beyond the departure date and time (including whether that date is a holiday). While it recognizes the sensitivity of prices to these, it overlooks other potentially influential factors that could affect flight prices. |
| An airfare prediction model for developing markets | The authors of this paper have put forward a new method to assist buyers in forecasting airfare prices without relying on information from airlines. The research shows that this proposed model can predict the changes in airfare prices, even up to the day of departure, by using publicly available airfare data found online. However, it should be noted that this model may not be able to take into account certain important factors, such as the number of unsold seats on flights. | The proposed model cannot account for certain important factors, such as the number of unsold seats on flights. While the model successfully predicts airfare price changes using publicly available data, it overlooks the impact of seat availability, which can significantly influence ticket prices. |
| A Comparative Study of Supervised Machine Learning Algorithms for Stock Market Trend Prediction | The authors of this work examined the stock market's behavior and chose the best model for stock market prediction from a variety of classical machine learning techniques, including Random Forest (RF), Support Vector Machine (SVM), Naive Bayes, and Softmax. | The need for determining the best model among various classical machine learning techniques for accurate stock market trend prediction. |
| Study on Machine Learning Techniques In Financial Markets | The authors of this paper propose to make use of text mining and machine learning methods to track public interaction on digital trading platforms. | The lack of use of text mining and machine learning methods to track public interaction on digital trading platforms for a better understanding of stock market dynamics. |
| Stock Price Prediction Using Machine Learning Techniques | The authors of this paper propose a system that recommends stock purchases to users. The approach used gets the prediction from historical as well as real-time data. | The absence of a system that combines historical and real-time data to provide personalized stock purchase recommendations to users. |
| Prediction of Student’s Performance Using Random Forest Classifier | The paper investigates first-year student performance in a renowned university, using random forests to predict outcomes based on significant factors. Results show 96.88% accuracy in classifying performance levels and suggest potential improvements by including additional factors. | There is still room for more research because it is unclear whether the identified eleven significant factors fully address all pertinent factors affecting student performance. Additional study might look into whether the suggested model can be applied to other colleges or learning environments. |
| Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research | This paper recommends using random forest, a machine learning algorithm based on trees, for institutional research. It highlights the benefits of random forests, including their simplicity, low computational expense, high accuracy, and adaptability, and backs this up with simulation experiments and real data analyses. | Lack of exploration of contemporary data mining approaches in institutional research, limited consideration of tree-based machine learning algorithms, and insufficient knowledge of the benefits and applicability of random forest. |

Table 2. Gaps and summaries of papers

3. Overview of the Proposed System
3.1. Introduction and Related Concepts
In this project, I propose to develop a web app that uses HTML and CSS on the frontend and
Flask on the backend, where the machine learning model is stored using the pickle module.
The model is trained using the random forest algorithm on a dataset that contains a large
amount of flight data such as source, destination, time, date and airline name. The web app
takes input from the user and uses it to return a predicted price as the result. The web
application provides a user-friendly interface for entering the flight data: the user can
select the source, destination, airline, duration, time and so on.

To ensure the accuracy and reliability of the system, it will be continuously monitored and
evaluated. The performance of the model will be assessed using evaluation metrics such as
Mean Absolute Error, Mean Squared Error, Root Mean Squared Error and R-squared.

The three machine learning algorithms compared to select the final model are:

• Random Forest Algorithm:

This machine learning algorithm uses a collection of decision trees. It combines the
predictions of the individual trees to reach a final prediction: a majority vote for
classification, or an average for a regression task such as price prediction. Each decision
tree in the random forest is constructed from a randomly selected portion of the training
data and a random subset of the input features. Each tree is grown by repeatedly splitting
the data on feature thresholds so as to decrease impurity or increase information gain.

• Decision Tree Algorithm:

The decision tree algorithm is a machine learning technique that bases decisions or
predictions on input features using a tree-like model. By separating the data into
different branches according to the values of different attributes, it creates a tree
structure. In order to maximize information gain or minimize impurity, the algorithm
selects the best feature on which to partition the data at each node. The process repeats
until a stopping condition is met. The resulting tree provides clear and
understandable decision rules that can be applied to classification or regression problems.

• K-Nearest Neighbours Algorithm:

The K-nearest neighbours (KNN) technique predicts the value for a data point from a
predetermined number (k) of its nearest labeled data points: the majority class for
classification, or the average target value for regression.
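
As a minimal sketch of the random forest idea for a numeric target such as price, the following pure-Python example grows each tree as a single-split "stump" on a bootstrap sample and a random feature subset, and then averages the trees. The toy data, the function names (fit_stump, fit_forest, predict_forest) and the one-split trees are assumptions made for illustration only; a practical model would normally use a library implementation with deeper trees.

```python
import random

def fit_stump(X, y, feature_indices):
    """Find the one threshold split (on the allowed features) with the lowest squared error."""
    best = None
    for j in feature_indices:
        for threshold in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            if not left or not right:
                continue  # a split must leave rows on both sides
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    if best is None:
        # No valid split in this sample: always predict the sample mean
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_forest(X, y, n_trees=25, max_features=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # rows drawn with replacement
        features = rng.sample(range(n_features), max_features)   # random feature subset
        forest.append(fit_stump([X[i] for i in sample], [y[i] for i in sample], features))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees (the regression form of the ensemble)."""
    predictions = [left if row[j] <= threshold else right
                   for j, threshold, left, right in forest]
    return sum(predictions) / len(predictions)

# Hypothetical toy data: [duration in hours, number of stops] -> price
train_X = [[1.0, 0], [2.0, 0], [2.5, 1], [6.0, 2], [7.0, 2]]
train_y = [3000.0, 3500.0, 4200.0, 9000.0, 9800.0]
forest = fit_forest(train_X, train_y)
short_direct = predict_forest(forest, [1.0, 0])
long_two_stops = predict_forest(forest, [7.0, 2])
```

Because every tree sees a different sample and a different feature, no single tree dominates the result; averaging them is what makes the ensemble more stable than one decision tree.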

3.2. Framework, Architecture or Module for the Proposed System

Fig 1 System Design of proposed system

Data processing steps were applied to both the train and test data. The main steps were:
• Dropping all rows with NaN values
• Converting the journey date into the format Day - Month - Year
• Adding new columns for month and year derived from the journey date, then dropping the
original date column
• Extracting hours and minutes individually from the arrival/departure time
• Checking the counts of stops, removing the route column, and replacing the stop labels
with numbers ('none' with 0, 'one' with 1, and so on).
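
The steps above can be sketched with the standard library alone. The field names (journey_date, dep_time, arrival_time, total_stops), the date and time formats, the stop labels and the sample rows are all assumptions for illustration; the actual dataset columns may be named and formatted differently.

```python
from datetime import datetime

# Assumed label-to-count mapping; the real dataset's labels may differ.
STOP_COUNTS = {"non-stop": 0, "1 stop": 1, "2 stops": 2, "3 stops": 3, "4 stops": 4}

REQUIRED_FIELDS = ("journey_date", "dep_time", "arrival_time", "total_stops")

def preprocess_row(row):
    """Turn one raw record (a dict of strings) into numeric features.

    Returns None when a required field is missing or empty, which mirrors
    the step of dropping rows with NaN values.
    """
    if any(not row.get(field) for field in REQUIRED_FIELDS):
        return None
    journey = datetime.strptime(row["journey_date"], "%d/%m/%Y")
    departure = datetime.strptime(row["dep_time"], "%H:%M")
    arrival = datetime.strptime(row["arrival_time"], "%H:%M")
    # The raw date, time and route columns are not copied into the result.
    return {
        "journey_day": journey.day,
        "journey_month": journey.month,
        "journey_year": journey.year,
        "dep_hour": departure.hour,
        "dep_min": departure.minute,
        "arrival_hour": arrival.hour,
        "arrival_min": arrival.minute,
        "total_stops": STOP_COUNTS[row["total_stops"]],
    }

# Hypothetical sample rows; the third has no arrival time and is dropped.
raw_rows = [
    {"journey_date": "24/03/2019", "dep_time": "22:20", "arrival_time": "01:10",
     "total_stops": "non-stop", "route": "A -> B"},
    {"journey_date": "01/05/2019", "dep_time": "05:50", "arrival_time": "13:15",
     "total_stops": "2 stops", "route": "A -> C -> D -> B"},
    {"journey_date": "09/06/2019", "dep_time": "09:25", "arrival_time": "",
     "total_stops": "1 stop", "route": "A -> C -> B"},
]
clean_rows = [features for features in map(preprocess_row, raw_rows) if features is not None]
```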

Afterwards, three different algorithms, namely K-Nearest Neighbours, Decision Tree and
Random Forest, were used to create models, and the one with the best accuracy was
selected. The model was stored using the pickle module in Python.
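
A minimal sketch of storing and reloading a model with pickle is shown below. To keep the example free of ML libraries, a plain dictionary stands in for the fitted model; the file name and the dictionary contents are hypothetical. A fitted estimator object would be saved and loaded with the same two calls.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model (hypothetical contents)
model = {
    "algorithm": "random_forest",
    "n_trees": 100,
    "feature_names": ["dep_hour", "total_stops"],
}

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "flight_price_model.pkl")

# Serialise once, after training
with open(path, "wb") as fh:
    pickle.dump(model, fh)

# Deserialise when the web server starts
with open(path, "rb") as fh:
    restored = pickle.load(fh)
```

Since unpickling can execute arbitrary code, the server should only load pickle files that it created itself or that come from a trusted source.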

Next, the model was evaluated using Mean Absolute Error, Mean Squared Error, Root Mean
Squared Error and R-squared values, and tuned using cross-validated hyperparameter search
techniques such as randomized search (RandomizedSearchCV).
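
For reference, the four metrics named here can be computed as in the following sketch. The function name regression_metrics and the sample values are illustrative assumptions, not results from this project.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired true/predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot  # undefined if all true values are identical
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Illustrative values only
metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Lower MAE, MSE and RMSE indicate smaller errors, while an R-squared closer to 1 indicates that the model explains more of the variance in the prices.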

Finally, I hosted the model on a server using Flask, a Python framework. This allows our
frontend website to obtain the necessary predictions.
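
The core of such a backend can be sketched independently of the web framework: a function that validates the submitted form fields and passes them to the model in a fixed column order. The field names in FEATURE_ORDER, the helper names and the ConstantModel stand-in are hypothetical; in a Flask view, the dict-like `request.form` would be passed in as `form` and the unpickled model as `model`.

```python
# Assumed column order used when the model was trained (hypothetical names)
FEATURE_ORDER = ["total_stops", "journey_day", "journey_month", "dep_hour", "dep_min"]

def form_to_features(form):
    """Validate submitted form fields and return them in the model's column order."""
    missing = [name for name in FEATURE_ORDER if name not in form or form[name] == ""]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return [int(form[name]) for name in FEATURE_ORDER]

def predict_price(form, model):
    """Return the predicted price for one submitted form, rounded to two decimals."""
    features = form_to_features(form)
    # Estimators with a predict() method usually expect one row per sample
    return round(float(model.predict([features])[0]), 2)

class ConstantModel:
    """Hypothetical stand-in that exposes the same predict() interface."""
    def predict(self, rows):
        return [4500.0 for _ in rows]

submitted = {"total_stops": "1", "journey_day": "24", "journey_month": "3",
             "dep_hour": "22", "dep_min": "20"}
price = predict_price(submitted, ConstantModel())
```

Keeping validation separate from the route handler means an incomplete form produces a clear error message instead of a failed prediction.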

3.3. Proposed System Model

Fig 2 ER Diagram of proposed system

4. Proposed System Analysis and Design
4.1. Introduction
The proposed system analysis aims to address the challenges in the airline ticket pricing
system by applying data processing steps and utilizing machine learning techniques. The
system focuses on predicting flight prices accurately using the Random Forest algorithm and
uses the pickle module in Python to store the trained model. Additionally, the system
incorporates methodologies such as data extraction, data preprocessing, exploratory data
analysis (EDA) and visualization, feature extraction, feature selection, model selection and
implementation.

The data extraction process involves obtaining the dataset from its source, in this case
Kaggle, and transforming the raw data into structured and informative data. Data
preprocessing is performed to clean and transform the data, eliminating outliers, managing
missing values and standardizing the dataset for model creation. EDA and visualization
techniques are applied to analyze and summarize the dataset, providing insights and
identifying relationships between different features.

Feature selection is carried out using a heat map to assess collinearity and determine the
variables with strong correlations to the target variable, together with the
feature_importances_ attribute of an ExtraTreesRegressor to find the most important
features. The selected model, Random Forest, is chosen based on its superior performance
compared to the other algorithms, which is also observed in previous research papers.
Finally, the system's implementation involves developing a web application with a frontend
for user input and a backend that hosts the trained ML model. The frontend collects user
inputs such as departure date, arrival date, and destination, while the Flask backend uses
the ML model to predict the flight ticket price from the provided data.
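
The correlation values shown in such a heat map can be illustrated with a short sketch that computes the Pearson correlation of each feature with the target and ranks the features by its absolute value. The function names and the toy columns are assumptions for illustration; this sketch covers only the correlation part, not tree-based feature importance.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)  # undefined for a constant column

def rank_features(columns, target):
    """Sort features by the strength (absolute value) of their correlation with the target."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy columns
columns = {
    "total_stops": [0, 1, 1, 2, 2],
    "dep_hour": [22, 5, 9, 18, 6],
}
price = [3000.0, 4500.0, 4700.0, 9000.0, 9800.0]
ranking = rank_features(columns, price)
```

In this toy data the number of stops is far more strongly correlated with price than the departure hour, so it is ranked first.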

By integrating these methodologies, the proposed system aims to provide accurate and
efficient predictions of flight ticket prices, enhancing the overall customer experience and
assisting travelers in making informed decisions.

4.2. Requirement Analysis

4.2.1. Functional Requirements

4.2.1.1. Product Perspective


The trained model will be hosted on a Flask server that will be integrated
with the frontend. The interface of the website is kept clean to give the
user a simple UI experience. The website has a form with simple CSS styling
and an area where the final result is displayed.

4.2.1.2. Product Features


Some of the features of the web app are:-
● Take input from User
● Simple User Interface
● Different Input boxes for different types of data

4.2.1.3. User Characteristics


The user of this web application can be anybody who wants to book a
ticket or wants to know about the trends in flight ticket prices. The web
app can save them money by showing the predicted price and so helping
them decide when to book tickets.

4.2.1.4. Assumption & Dependencies


This web app assumes that the user is looking for a predicted ticket
price. It is hosted on a website, so it requires an internet connection
to be used. It also depends on the user input to show the final result.

4.2.1.5. Domain Requirements


The project is predicated on the use of a Python environment with the
required interpreter and libraries, such as Pandas, NumPy, and
Matplotlib, installed. It expects a dataset in a compatible format with the
necessary columns and data types to be accessible through an API or at
a specific file directory. Steps including handling null values, converting
date/time columns, duration extraction, and categorical data encoding
should be performed on the dataset as part of preprocessing and
encoding. The code takes for granted that there is a separate test set for
assessing the effectiveness of the trained model.
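
As an illustration of the categorical encoding step, the sketch below applies one-hot encoding to a single column. One-hot encoding is an assumption here, since the encoding scheme is not named; the function name, column names and sample rows are also hypothetical.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded

# Hypothetical sample rows
rows = [
    {"airline": "IndiGo", "total_stops": 0},
    {"airline": "Air India", "total_stops": 1},
    {"airline": "IndiGo", "total_stops": 2},
]
encoded = one_hot_encode(rows, "airline")
```

The same set of categories must be used for the train and test sets so that both end up with identical columns.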

The project uses machine learning models. In order to tune the
hyperparameters and improve the performance of the model, a search
space of candidate hyperparameter values must be defined. The code takes
for granted that it has read and write access to the local file system or
the designated file paths. Depending on the size of the dataset and the
complexity of the algorithms, adequate computational resources, including
memory and processing capacity, are required.
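
The idea behind a randomized hyperparameter search can be sketched as follows: sample a fixed number of random combinations from the search space and keep the best-scoring one. The search space, the toy_score function and the helper name randomized_search are hypothetical; a library routine such as RandomizedSearchCV additionally cross-validates each candidate, which this sketch does not do.

```python
import random

def randomized_search(param_space, score_fn, n_iter=10, seed=0):
    """Sample n_iter random parameter combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(options) for name, options in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space for a tree ensemble
param_space = {
    "n_estimators": [100, 300, 500, 700],
    "max_depth": [5, 10, 15, 20],
}

def toy_score(params):
    """Stand-in for a validation score; higher is better."""
    return -abs(params["n_estimators"] - 500) - 10 * abs(params["max_depth"] - 15)

best_params, best_score = randomized_search(param_space, toy_score, n_iter=20, seed=42)
```

Sampling a fixed number of candidates keeps the cost of tuning bounded even when the full grid of combinations is large.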

4.2.1.6. User Requirements
The price predicted by the website depends on the input it receives
from users. In other words, the website can calculate a result only if
the user provides all the inputs properly.

4.2.2. Non Functional Requirements


4.2.2.1. Product Requirements

4.2.2.1.1. Efficiency (in terms of Time and Space)

| Step | Time complexity | Space complexity |
| --- | --- | --- |
| 1. Importing libraries and dataset | O(1) | O(1) |
| 2. Reading and inspecting the dataset | O(n) | O(n) |
| 3. Dropping null values | O(n) | O(1) |
| 4. Converting date and time columns | O(n) | O(1) |
| 5. Extracting duration hours and minutes | O(n) | O(1) |
| 6. Encoding categorical data | O(n) | O(n) |
| 7. Dropping unnecessary columns | O(n) | O(1) |
| 8. Reading and preprocessing the test set | O(m) | O(m) |
| 9. Feature selection | At least O(n); depends on the method | Depends on the method |
| 10. Building machine learning models | At least O(n); depends on the algorithm | Depends on the algorithm |
| 11. Hyperparameter tuning | At least O(n) per candidate setting | Depends on the algorithm |
| 12. Saving the model | O(1) | O(1) |

Table 3 Time and Space Complexity

'n' is the number of rows in the training set and 'm' is the number of rows in the test
set.

4.2.2.1.2. Reliability
The web app gives fairly accurate answers. However,
since the dataset used for this project contains airline
data from 2019, it may not be completely reliable for
predicting the prices of current flights.

4.2.2.1.3. Usability
The interface of the website is simple, and users
can use the app directly through the frontend
form.

4.2.2.2. Organizational Requirements


4.2.2.2.1. Implementation Requirements
The implementation requirements for deploying this
chatbot are:

- Hardware Infrastructure: Ensure that the web


app can be deployed with the support of an
appropriate hardware infrastructure. This comprises
of networking hardware, servers, storage devices,
GPUs for smooth working.

- Software Frameworks and Libraries: Python
must be installed on the computer, along with
libraries such as pandas, matplotlib, and NumPy.
Make sure that the frameworks are compatible with
the hardware infrastructure and, where possible, use
frameworks that offer deployment optimizations.
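
A quick way to confirm that the libraries named above are available is to ask Python whether each module can be found. The sketch below does this with the standard library only; the helper name `missing_modules` and the exact module list are assumptions for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Libraries named in the text, plus Flask, which the backend uses.
REQUIRED = ["pandas", "numpy", "matplotlib", "flask"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

Running the script before deployment reports any library that still needs to be installed.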

4.2.2.2.2. Engineering Standard Requirement


Using machine learning techniques, the web app
uses user input to estimate flight fares with accuracy. It
has a simple user interface that enables users to enter
flight information and get precise price predictions. The
application effectively manages several parallel user
requests and provides quick price results. It is built to
be expandable in the future to handle growing user
traffic because of its scalability. The codebase adheres
to the principles of modular architecture and is well
documented for simple maintenance. The software
interfaces with external services without any problems
and is rigorously tested to guarantee accuracy and
dependability. It also complies with legal and moral
constraints, such as data protection laws and ethical
data handling standards.

4.2.2.3. Operational Requirements


● Economic
Hosting the web app incurs some cost, either for the server or for the
hardware used.
● Environmental
The web application being accessed online poses no problems
to nature or the environment.
● Social
The web application will have a positive social impact by helping
airline travellers save money by knowing when to book tickets.
Social awareness of the web application can be helpful.
● Political
The web application does not hold any political beliefs or bias.
● Ethical
The web application does not provide any illegal information. The
predicted result is provided in an ethical manner.
● Health and Safety
The web application does not compromise any user's health or
safety. The web application is safe to use.
● Sustainability
The web application does not require a large amount of power or any
powerful hardware, so it can run on very little energy and is therefore
sustainable in the long run.
● Legality
All data accessed by the web application is freely available on the internet so
it's all legally sourced. The web application also does not store any user data.

● Inspectability
The website does not record previous input or past predictions. The
website has some debugging checks to handle input that is not given
properly. The web application does not have a separate testing
environment.
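
The input checks mentioned under Inspectability are not listed in the text. A minimal sketch of what such checks could look like follows; the function name `validate_form`, the error messages, and the specific rules are assumptions, while the field names are those used by the form in Appendix A.

```python
from datetime import datetime

REQUIRED_FIELDS = ["Source", "Destination", "Dep_Time", "Arrival_Time", "stops", "airline"]
DATETIME_FORMAT = "%Y-%m-%dT%H:%M"  # same format string that app.py parses


def validate_form(form):
    """Return a list of error messages; an empty list means the input looks usable."""
    errors = [f"Missing field: {name}" for name in REQUIRED_FIELDS if not form.get(name)]
    if errors:
        return errors

    try:
        departure = datetime.strptime(form["Dep_Time"], DATETIME_FORMAT)
        arrival = datetime.strptime(form["Arrival_Time"], DATETIME_FORMAT)
    except ValueError:
        return ["Dates must look like 2019-03-24T22:20"]

    if arrival <= departure:
        errors.append("Arrival must be after departure")
    if form["Source"] == form["Destination"]:
        errors.append("Source and destination must differ")
    if not form["stops"].isdigit():
        errors.append("Stops must be a non-negative whole number")
    return errors
```

A route handler could call such a function first and show the messages to the user instead of passing incomplete input to the model.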

4.2.3. System Requirements

4.2.3.1. H/W Requirements


• Processor: Minimum i3 Dual Core

• Ethernet connection (LAN) OR Wi-Fi

• Hard Drive: Recommended 100 GB or more

• Memory (RAM): Minimum 8 GB

4.2.3.2. S/W Requirements

• Python
• Anaconda
• Jupyter Notebook
• VScode (or any code editor)

The project is made using HTML and CSS on the frontend and Flask
(a Python framework) on the backend. Multiple Python modules were
also used.

5. Results and Discussion
The result of this project is a fully functional web app that takes inputs such as source
city, destination city, departure date, and preferred airline from users and returns a predicted
price for the given data. The prediction comes from the machine learning model, which is hosted
as an API (Application Programming Interface) in the backend of the web application using the
Flask framework.
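
Because the model sits behind a Flask route, the same prediction can also be requested without a browser. The sketch below builds the POST request that the HTML form would send, using the field names from index.html in Appendix A. The base URL (Flask's default development address) and the helper name `build_predict_request` are assumptions, and the sample values are made up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the Flask development server is running locally on its default port.
BASE_URL = "http://127.0.0.1:5000"


def build_predict_request(source, destination, dep_time, arrival_time, stops, airline):
    """Build the POST request that the HTML form would send to /predict."""
    fields = {
        "Source": source,
        "Destination": destination,
        "Dep_Time": dep_time,
        "Arrival_Time": arrival_time,
        "stops": str(stops),
        "airline": airline,
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        BASE_URL + "/predict",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_predict_request(
    "Delhi", "Cochin", "2019-03-24T09:00", "2019-03-24T12:15", 0, "IndiGo"
)
# Sending the request needs the app to be running, so it is left as a comment:
# html = urllib.request.urlopen(request).read().decode("utf-8")
```

The response is the rendered index.html page with the predicted price filled in, not a JSON document.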

After hyperparameter tuning, the machine learning model gives very accurate results.
The following screenshots show the final results of the project:

Fig 3. K nearest Neighbours Algorithm

Fig 4. Decision Tree Regressor

Fig 5. Random Forest Algorithm

We see that the Random Forest algorithm gives the best scores among the three algorithms
used, so the final model was created with the Random Forest algorithm.
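
The text does not state which score the figures report. As an illustration of how regression models are commonly compared, the sketch below computes the coefficient of determination (R²) and the mean absolute error in plain Python. The choice of these two metrics and the sample numbers are assumptions, not values from the project.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up prices (in Rs) used only to exercise the functions.
actual = [4000, 6000, 8000, 10000]
model_a = [4500, 5500, 8500, 9500]  # hypothetical predictions of one model
model_b = [5000, 5000, 9000, 9000]  # hypothetical predictions of another

for name, predicted in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(r2_score(actual, predicted), 3), mean_absolute_error(actual, predicted))
```

A higher R² and a lower mean absolute error both indicate predictions that are closer to the actual prices.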

Fig 6. Frontend of the Web Application

Fig 7. source options

Fig 8. Destination options

Fig 9. Selecting Departure and arrival date and time

Fig 10. Choosing the Airlines

Fig 11. Final Predicted Price

6. Conclusion and Future Developments
The goal of this project was to identify a machine learning model that accurately predicts
airline ticket prices and to create a fully working web application that uses the model. After
experimenting with three algorithms, it was found that the Random Forest algorithm
provided the most accurate predictions. With proper implementation, the project could help
users save money by giving them knowledge of the patterns that airline ticket prices follow
and the predicted price, so that they can choose when to buy tickets. More data
points and historical airline data can be included for better accuracy in the future. There
is a need for a service like this in the ticket booking domain, so it can be scaled to a
great degree with some more research and a better user interface.


DATASET: https://www.kaggle.com/datasets/nikhilmittal/flight-fare-prediction-mh

APPENDIX A
File: index.html

<!DOCTYPE html>
<html lang="en">
<head>

<meta charset="UTF-8" />


<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>AirPred</title>

<link rel="preconnect" href="https://fonts.gstatic.com" />
<link
href="https://fonts.googleapis.com/css2?family=Bebas+Neue&family=Roboto:wght@400;500;700&display=swap"
rel="stylesheet"
/>

<link rel="stylesheet" href="./static/css/style.css">


</head>

<body>

<header class="container">
<nav>
<ul>
<li><a href="#">About </a></li>

</ul>
</nav>
</header>

<main>

<section class="main-content container">


<form action="/predict" method="post">

<div class="booking-form">
<h3>Enter the Data</h3>
<div class="input-group">
<label>Flying From</label>

<select name="Source" id="Source" class="inp-style" type="text" required="required" >


<option value="Delhi">Delhi</option>
<option value="Kolkata">Kolkata</option>
<option value="Mumbai">Mumbai</option>
<option value="Chennai">Chennai</option>
</select>
</div>
<div class="input-group">
<label>Flying To</label>
<select name="Destination" id="Destination" class="inp-style" type="text"
required="required" >
<option value="Cochin">Cochin</option>
<option value="Delhi">Delhi</option>
<option value="New Delhi">New Delhi</option>
<option value="Hyderabad">Hyderabad</option>
<option value="Kolkata">Kolkata</option>
</select>
</div>
<div class="inputs">
<div class="input-group">
<label>Departure Date</label>
<input name="Dep_Time" id="Dep_Time" required="required" class="inp-style"
type="datetime-local" />
</div>
<div class="input-group">
<label>Arrival Date</label>
<input name="Arrival_Time" id="Arrival_Time" required="required" class="inp-style"
type="datetime-local" />
</div>
</div>
<div class="input-group">
<div>
<label>stops</label>
<select name="stops" class="inp-style" type="text" required="required" >
<option value="0">Non-Stop</option>
<option value="1">1</option>
<option value="2">2</option>
<option value="3">3</option>
<option value="4">4</option>
</select>

</div>
</div>

<div class="input-group">
<div>
<label>Airline</label>
<select name="airline" id="airline" class="inp-style" type="text" required="required" >
<option value="Jet Airways">Jet Airways</option>
<option value="IndiGo">IndiGo</option>
<option value="Air India">Air India</option>
<option value="Multiple carriers">Multiple carriers</option>
<option value="SpiceJet">SpiceJet</option>
<option value="Vistara">Vistara</option>
<option value="Air Asia">Air Asia</option>
<option value="GoAir">GoAir</option>
<option value="Multiple carriers Premium economy">Multiple carriers Premium economy
</option>
<option value="Jet Airways Business">Jet Airways Business</option>
<option value="Vistara Premium economy">Vistara Premium economy</option>
<option value="Trujet">Trujet</option>
</select>
</div>

</div>
<div class="calculations">

<div class="amount">
<div class="left">
<h4>Predicted Price</h4>
</div>
<div class="right">
<p>Rs <span id="total">{{ prediction_text }}</span></p>
</div>
</div>
</div>
<button type="Submit" value="submit" onclick="" class="btn-style">Show price</button>

</div>

</form>

<div class="booking-content">
<h1>
AirPred :<br />
An Airline Ticket price prediction app
</h1>
<p>
To get the predicted price amount please input
all the information
</p>
</div>
</section>

</main>

</body>
</html>

File: style.css

*{
padding: 0;
margin: 0;
box-sizing: border-box;
text-decoration: none;
list-style-type: none;
}
body {
font-family: 'Roboto', sans-serif;
background: url('images/bg.png');
background-repeat: no-repeat;
background-position: center center;
background-attachment: scroll;
background-size: cover;
position: relative;
}

nav {
height: 50px;
width: 100%;
padding: 20px 0;
}
.container {
width: 1200px;
max-width: 85%;
margin: 0 auto;
}
nav ul {
display: flex;
justify-content: flex-end;
margin: 0px 6px;
}
nav ul li a {
color: #fff;
padding: 10px;
text-transform: uppercase;
}
nav ul li a:hover,
a.active {
background-color: #3352f2;
border-radius: 5px;
}
.main-content {
display: grid;
grid-template-columns: 2fr 3fr;
gap: 70px;
align-items: center;
}

.booking-form {
background-color: #fff;
border: 1px solid #ddd;
padding: 25px;
border-radius: 5px;
float: left;
}
.booking-form h3 {
margin: 10px 0px;
}
.booking-form .input-group {
margin: 10px 0;
}
.f-class {
display: flex;
justify-content: space-between;
align-items: center;
}
.inp-width {
min-width: 300px;
}
.plus-minus-btn {
padding: 2px;
margin-top: 22px;
word-spacing: 20px;
display: flex;
}
.plus-btn,
.minus-btn {
color: #ffffff;
font-size: 28px;
width: 34px;
height: 34px;
margin: 0 2px;
outline: 0px;
border: 0px;
background: #3352f2;
cursor: pointer;
}
.booking-form .calculations {
margin: 10px 0;
}
.booking-form .calculations p {
font-weight: 500;
margin-bottom: 10px;
}
.booking-form input {
width: 100%;
display: block;
padding: 10px;
border-radius: 5px;
border: none;
background: #f3f5ff;
border-radius: 4px;
margin: 10px 0;
}
.booking-form select {
width: 100%;
display: block;
padding: 10px;
border-radius: 5px;
border: none;
background: #f3f5ff;
border-radius: 4px;
margin: 10px 0;
}

.booking-form .inputs {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 10px;
}
.booking-content {
color: #fff;
width: 80%;
}
.booking-content h1 {
font-size: 70px;
text-transform: uppercase;
font-family: 'Bebas Neue', cursive;
font-weight: 400;
margin-bottom: 25px;
}

.btn-style {
display: block;
width: 100%;
padding: 10px;
border-radius: 5px;
border: none;
color: #fff;
font-weight: 500;
background-color: #3352f2;
}
.amount {
display: flex;
justify-content: space-between;
}

.full-amount-area {
display: none;
position: absolute;
width: 100%;
height: 100vh;
min-height: 100%;
top: 0;
left: 0;
background-color: #00000067;
}
.full-amount-content {
color: #ffffff;
background-color: #3352f2;
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
padding: 30px;
box-shadow: 0px 0px 22px #0000008f;
font-size: 20px;
letter-spacing: 1px;
}
.full-amount-content tr td {
padding: 5px 2px;
}
.full-amount-content .hide-btn {
text-align: center;
}
.full-amount-content .hide-btn button {
font-size: 18px;
color: #ffffff;
background: transparent;
padding: 8px;
border: 1px solid #2a4bf5f7;
outline: 0;
cursor: pointer;
margin-top: 15px;
}

File: app.py

from flask import Flask, request, render_template
from flask_cors import cross_origin
import pandas as pd
import pickle

app = Flask(__name__)

# Assumption: the listing does not show how the trained model is loaded.
# "model.pkl" is a placeholder name, and loading with pickle assumes the
# model was saved with pickle in the "Saving the model" step.
with open("model.pkl", "rb") as model_file:
    model = pickle.load(model_file)

FORM_DATETIME_FORMAT = "%Y-%m-%dT%H:%M"  # format of the datetime-local form fields

# One-hot columns in the order the model expects. A value that is not listed
# (for example "Air Asia") is encoded as all zeros.
AIRLINES = [
    "Air India", "GoAir", "IndiGo", "Jet Airways", "Jet Airways Business",
    "Multiple carriers", "Multiple carriers Premium economy", "SpiceJet",
    "Trujet", "Vistara", "Vistara Premium economy",
]
SOURCES = ["Chennai", "Delhi", "Kolkata", "Mumbai"]
# The form sends "New Delhi" with a space, so the category uses the same spelling.
DESTINATIONS = ["Cochin", "Delhi", "Hyderabad", "Kolkata", "New Delhi"]


def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    return [int(value == category) for category in categories]


@app.route("/")
@cross_origin()
def home():
    return render_template("index.html")


@app.route("/predict", methods=["GET", "POST"])
@cross_origin()
def predict():
    if request.method == "POST":
        # Departure and arrival timestamps from the form
        departure = pd.to_datetime(request.form["Dep_Time"], format=FORM_DATETIME_FORMAT)
        arrival = pd.to_datetime(request.form["Arrival_Time"], format=FORM_DATETIME_FORMAT)

        # Duration from the full elapsed time, so that flights crossing an hour
        # or a day boundary are handled; abs() keeps the values non-negative.
        total_minutes = int(abs(arrival - departure).total_seconds() // 60)
        dur_hour, dur_min = divmod(total_minutes, 60)

        total_stops = int(request.form["stops"])

        # Feature order: stops, journey day and month, departure time,
        # arrival time, duration, then airline, source and destination columns.
        features = [
            total_stops,
            departure.day, departure.month,
            departure.hour, departure.minute,
            arrival.hour, arrival.minute,
            dur_hour, dur_min,
        ]
        features += one_hot(request.form["airline"], AIRLINES)
        features += one_hot(request.form["Source"], SOURCES)
        features += one_hot(request.form["Destination"], DESTINATIONS)

        prediction = model.predict([features])
        output = round(prediction[0], 2)

        return render_template("index.html", prediction_text=" {}".format(output))

    return render_template("index.html")


if __name__ == "__main__":
    app.run(debug=True)
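
The duration features can be checked by hand with a small worked example. The sketch below derives hours and minutes from the full elapsed time between the two form timestamps, which also covers flights that cross an hour or day boundary. The helper name `duration_parts` is hypothetical and the sample times are made up.

```python
from datetime import datetime

FORM_FORMAT = "%Y-%m-%dT%H:%M"  # same format string that app.py parses


def duration_parts(dep_time, arrival_time):
    """Return (hours, minutes) elapsed between two datetime-local strings."""
    departure = datetime.strptime(dep_time, FORM_FORMAT)
    arrival = datetime.strptime(arrival_time, FORM_FORMAT)
    total_minutes = int((arrival - departure).total_seconds() // 60)
    return divmod(total_minutes, 60)


# Departing 22:50 and arriving 01:10 the next day is 2 h 20 min of travel.
print(duration_parts("2019-03-24T22:50", "2019-03-25T01:10"))
```

Subtracting the hour and minute fields separately would not give this result for the sample above, which is why the sketch works from the whole time difference.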
