
Flight Price Checker Using Python and Selenium
Web scraping is a useful technique for extracting data from websites, and checking airline ticket prices is a common use case. In this article, we build a flight price checker using Selenium, a popular browser automation and testing tool. With Selenium, we can automate the collection and comparison of flight prices across different airlines, saving time and effort for users.
Setup
Firefox Executable
Download the Firefox browser installer from the official Mozilla website.
Once downloaded, install the browser. The executable is placed automatically at C:\Program Files\Mozilla Firefox\firefox.exe. We will need this path later.
Gecko Driver
Windows users can download the Gecko driver from the geckodriver releases page on GitHub, which also lists builds for other platforms.
Extract the zip and place the "geckodriver.exe" file in the C:\ directory. We will reference it later in our code.
Selenium Python Package
We will work with the latest version of Selenium WebDriver, so install the following packages with pip:
pip3 install -U selenium
pip3 install -U webdriver-manager
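Before running anything, it can help to confirm that the three pieces above are in place. The following is a minimal sketch, not part of the original script: the function name check_setup is hypothetical, and the paths are the ones assumed in this setup section.

```python
from importlib.util import find_spec
from pathlib import Path


def check_setup(firefox_path, geckodriver_path):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if not Path(firefox_path).is_file():
        problems.append(f"Firefox not found at {firefox_path}")
    if not Path(geckodriver_path).is_file():
        problems.append(f"geckodriver not found at {geckodriver_path}")
    # find_spec returns None when the package cannot be imported
    if find_spec("selenium") is None:
        problems.append("selenium package is not installed")
    return problems


# Paths assumed from the setup steps; adjust them to your own installation.
for problem in check_setup(
    r"C:\Program Files\Mozilla Firefox\firefox.exe",
    r"C:\geckodriver.exe",
):
    print(problem)
```

If nothing is printed, the browser, the driver, and the package were all found.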
Algorithm
Import the necessary libraries - Selenium and time
Set up the Firefox Gecko driver path
Open the website to be scraped
Identify the necessary elements to be scraped
Input the departure and arrival locations and the departure and return dates
Click the search button
Wait for the search results to load
Scrape the prices for the different airlines
Store the data in a format that's easy to read and analyze
Compare the prices and identify the cheapest option
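The last two steps, storing the data and picking the cheapest option, can be sketched with the standard library alone. This is an illustration under assumptions: the airline names and the helper names write_prices_csv and cheapest are hypothetical, and the example script in this article collects only prices, not airline names.

```python
import csv
import io


def write_prices_csv(rows, fileobj):
    """Write (airline, price) rows to fileobj as CSV with a header line."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["airline", "price"])
    writer.writerows(rows)


def cheapest(rows):
    """Return the (airline, price) row with the lowest price."""
    return min(rows, key=lambda row: row[1])


# Hypothetical sample data; a real run would fill this from the scraped page.
sample_rows = [("Airline A", 4544), ("Airline B", 4471), ("Airline C", 5497)]

# StringIO stands in for a real file such as open("prices.csv", "w", newline="")
buffer = io.StringIO()
write_prices_csv(sample_rows, buffer)
print(buffer.getvalue())
print("Cheapest:", cheapest(sample_rows))
```

Writing to a file object rather than a fixed filename keeps the helper easy to reuse: the same function works with a file on disk or an in-memory buffer.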
Example
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service
import time

# Set Firefox options
firefox_options = Options()
firefox_options.binary_location = r'C:\Program Files\Mozilla Firefox\firefox.exe'

# Initialize webdriver with Firefox
# (current Selenium releases take the driver path through a Service object)
service = Service(executable_path=r'C:\geckodriver.exe')
driver = webdriver.Firefox(service=service, options=firefox_options)

# Set URL and date of travel
url = 'https://paytm.com/flights/flightSearch/BBI-Bhubaneshwar/DEL-Delhi/1/0/0/E/2023-04-22'
date_of_travel = "2023-04-22"

# Print URL
print(f"URL: {url}")

# Load webpage
driver.get(url)

# Wait for 5 seconds
time.sleep(5)

# Find search button and click it
search_button = driver.find_element(By.CLASS_NAME, "_3LRd")
search_button.click()

# Find all elements with class name '_2gMo'
prices_elements = driver.find_elements(By.CLASS_NAME, "_2gMo")

# Get text of all elements
prices_text = [price.text for price in prices_elements]

# Convert text to integers
prices = [int(p.replace(',', '')) for p in prices_text]

# Display the minimum airfare price
print(f"Minimum Airfare Price: {min(prices)}")

# Display all prices
print(f"All prices:\n {prices}")
Output
Minimum Airfare Price: 4471
All prices:
 [4471, 4472, 4544, 4544, 4679, 4838, 5497, 5497, 5866, 6991, 7969, 8393, 8393, 8393, 8393, 8393, 8445, 8445, 8445, 8445, 8445, 8498, 8498, 8498, 8540, 8898, 8898, 8898, 8898, 8898, 9203, 9207, 9385, 10396, 10554, 10896, 11390, 11433, 11766, 11838, 11838, 11838, 12518, 12678, 12678, 12678, 12735, 12735, 12735, 12735, 12767, 12767, 12787, 12787, 12787, 12787, 12840, 12945, 12966, 12981, 13069, 13145, 13145, 13145, 13145, 13152, 13525, 13537, 13537, 13571, 13610, 13633, 13828, 13956, 14358, 14630, 14630, 14828, 14838, 15198, 15528, 15849, 15954, 16479, 17748, 17748, 18506, 20818, 20818, 20818, 20818, 21992, 23590, 24468, 25483, 25483, 26628, 75271]
Explanation
First, the necessary libraries are imported, including webdriver from selenium, By from selenium.webdriver.common.by, Options from selenium.webdriver.firefox.options, and time.
Next, Firefox options are set using Options() and the binary location for Firefox is set to C:\Program Files\Mozilla Firefox\firefox.exe.
A webdriver instance is then created with Firefox using the webdriver.Firefox() function, passing in the path to the Gecko driver executable and the Firefox options.
The webpage is loaded into the browser using driver.get(url).
The script then waits for 5 seconds using time.sleep(5).
The search button on the webpage is found using driver.find_element(By.CLASS_NAME, "_3LRd") and stored in the search_button variable. The click() method is then called on the search_button variable to simulate a click on the button.
All elements on the web page with class name _2gMo are found using driver.find_elements(By.CLASS_NAME, "_2gMo") and stored in the prices_elements list.
The text of all elements in the prices_elements list is extracted using a list comprehension and stored in the prices_text list.
The replace() method is used to remove commas from each element in prices_text and the resulting string is converted to an integer using int(). This is done using another list comprehension and the resulting list of integers is stored in the prices list.
The minimum value in prices is found using the min() function and printed to the console.
Finally, all values in prices are printed to the console.
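Two steps in this walk-through are fragile: the fixed five-second sleep may be too short or needlessly long, and int() fails on an empty string or a price that carries a currency symbol. The sketch below shows one way to harden both, assuming whole-number prices like those in the output above. The helpers wait_until and parse_prices are hypothetical names, and the polling loop is a generic stand-in written in plain Python; Selenium also provides WebDriverWait in selenium.webdriver.support.ui for this purpose.

```python
import re
import time


def wait_until(condition, timeout=15.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once timeout seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


def parse_prices(texts):
    """Convert strings such as '4,471' to ints, skipping entries with no digits."""
    prices = []
    for text in texts:
        digits = re.sub(r"[^0-9]", "", text)  # drop commas, symbols, spaces
        if digits:
            prices.append(int(digits))
    return prices


# Stand-in for a page whose results appear on the third poll.
# With a real driver, the condition could instead collect the element texts, e.g.
#   lambda: [e.text for e in driver.find_elements(By.CLASS_NAME, "_2gMo")]
polls = {"count": 0}


def fake_price_texts():
    polls["count"] += 1
    return ["4,471", "", "₹ 4,544"] if polls["count"] >= 3 else []


texts = wait_until(fake_price_texts, timeout=5.0, poll_interval=0.01)
print(parse_prices(texts))
```

Polling returns as soon as the results are present, so the script neither waits longer than necessary nor reads the page too early.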
Application
This code is a starting point for scraping airfare prices from Paytm's flight search website with Python and Selenium. From here, you can modify it to meet specific needs and add features such as storing the scraped data in a file or sending an email notification with the price.
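As an illustration of the email idea, the sketch below builds a notification message when the lowest fare falls to or below a chosen threshold. It only constructs the message and does not send it. The function name build_price_alert, the threshold, and the example.com addresses are all hypothetical.

```python
from email.message import EmailMessage


def build_price_alert(prices, threshold, sender, recipient):
    """Return an EmailMessage if the lowest price is at or below threshold, else None."""
    if not prices:
        return None
    lowest = min(prices)
    if lowest > threshold:
        return None
    message = EmailMessage()
    message["Subject"] = f"Flight price alert: {lowest}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(
        f"The lowest fare found was {lowest} (threshold {threshold})."
    )
    return message


# Placeholder addresses and threshold, for illustration only.
alert = build_price_alert(
    [4471, 4544, 5497], 4500, "alerts@example.com", "me@example.com"
)
print(alert["Subject"] if alert else "No alert")
```

To actually deliver the message, it could be passed to smtplib.SMTP(...).send_message(alert) with the host and credentials of your own mail server.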
Conclusion
Selenium is a powerful web automation and scraping tool that can collect information from websites that do not offer an API. Python's versatility, ease of use, and rich ecosystem of libraries make it well suited to scraping. This script shows how to automate browser actions and retrieve data from a webpage with just a few lines of code.