WEB SCRAPING ON REAL ESTATE
Katikam Radha Krishnaveni
Computer Science
Business Statement
Suggest the best buildings based on customer requirements.
OBJECTIVE OF THE PROJECT
To analyze buildings with respect to their:
I. Property Type
II. Price
III. Area
IV. Rate per sqft
V. Status
VI. Locality
VII. Builder Name
Website of our project
HOW TO SCRAPE THE DATA
I. Import the required libraries.
II. Assign the website URL to a variable.
III. Request the URL to access the data.
IV. Convert the response into text and assign it to a variable.
V. Apply Beautiful Soup.
VI. Extract the data and convert it to CSV.
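The scraping steps can be sketched roughly as follows. This is only an illustration, not the project's actual script: the class names (`listing`, `title`, `price`, `area`, `locality`), the sample HTML, and the helper names are hypothetical, and `urllib.request` from the standard library is assumed for the request step because the slides do not name the library used.

```python
import csv
import io
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# Hypothetical field names; a real site will use its own markup.
FIELDS = ("title", "price", "area", "locality")


def fetch_html(url: str) -> str:
    """Request the URL and return the response body converted to text."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_listings(html: str) -> list[dict[str, str]]:
    """Apply Beautiful Soup and extract one record per listing card."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for card in soup.find_all("div", class_="listing"):
        record = {}
        for field in FIELDS:
            tag = card.find("span", class_=field)
            # Missing fields become empty strings and are handled during cleaning.
            record[field] = tag.get_text(strip=True) if tag else ""
        records.append(record)
    return records


def to_csv(records: list[dict[str, str]]) -> str:
    """Convert the extracted records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up HTML standing in for a downloaded page.
SAMPLE_HTML = """
<div class="listing">
  <span class="title">3 BHK Apartment</span>
  <span class="price">85 Lac</span>
  <span class="area">1500 sqft</span>
  <span class="locality">Manikonda</span>
</div>
<div class="listing">
  <span class="title">5 BHK Villa</span>
  <span class="price">7 Cr</span>
  <span class="locality">Tellapur</span>
</div>
"""

records = parse_listings(SAMPLE_HTML)
csv_text = to_csv(records)
```

For a live page, `parse_listings(fetch_html(url))` would replace the sample HTML, with `url` set to the website being scraped.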
DATA FRAME BEFORE CLEANING
DATA CLEANING
Check whether there are null values.
Check for empty or missing values.
Null values in numerical data would normally be replaced with the mean or mode, but here the null values occur in the rating column, so they are replaced with 0.
Null values in categorical data are filled with the mode.
Convert the data to int or float data type as required.
After deriving the required columns, drop the unwanted columns.
Export the data frame into .csv format.
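The cleaning steps could look roughly like this in pandas. This is a minimal sketch, not the project's code: the column names and the sample rows are made up, and splitting a `title` column into `bhk` and `property_type` is an assumption about how the required columns were derived.

```python
import pandas as pd

# Made-up raw rows standing in for the scraped data.
raw = pd.DataFrame({
    "title": ["3 BHK Apartment", "2 BHK Apartment", "5 BHK Villa", "3 BHK Apartment"],
    "price_lakhs": ["85", "60", "700", "170"],
    "area_sqft": ["1500", "1100", "6000", "2200"],
    "status": ["Under Construction", None, "Ready to Move", "Under Construction"],
    "rating": [4.0, None, None, 3.5],
})

# Check for null values in each column.
null_counts = raw.isnull().sum()

clean = raw.copy()

# Numerical nulls in the rating column are replaced with 0.
clean["rating"] = clean["rating"].fillna(0)

# Categorical nulls are filled with the mode (most frequent value).
clean["status"] = clean["status"].fillna(clean["status"].mode()[0])

# Convert text columns to numeric types as required.
clean["price_lakhs"] = clean["price_lakhs"].astype(float)
clean["area_sqft"] = clean["area_sqft"].astype(int)

# Derive the required columns, then drop the unwanted one.
clean["bhk"] = clean["title"].str.extract(r"(\d+)\s*BHK", expand=False).astype(int)
clean["property_type"] = clean["title"].str.extract(r"BHK\s+(\w+)", expand=False)
clean = clean.drop(columns=["title"])

# Export to CSV text; passing a file path to to_csv writes a file instead.
csv_text = clean.to_csv(index=False)
```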
DATA FRAME AFTER SPLITTING & CLEANING
UNIVARIATE ANALYSIS
COUNT PLOT
DISTRIBUTION PLOT
BOX PLOT
HISTOGRAM
PIE CHART
Status of houses
This plot clearly shows that the largest number of houses are under construction.
Count of BHKs
This plot shows that 3 BHK houses are the most common.
Comparing BHKs
This plot shows that 5 BHK and 4 BHK houses account for less than 10%.
Count by property type
This plot clearly shows that apartments are the most common property type.
Visualization of prices
This plot shows the distribution of price (in lakhs).
The largest number of apartments are priced at around 170 lakhs.
Visualization of rate per sqft (box plot)
In this box plot, the rate per sqft is observed to range from 10000 to 12500.
The box plot shows a mean value of about 5500.
The outliers lie between 9900 and 20000.
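Box plots conventionally mark as outliers the points that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule in pandas to made-up rate values; the numbers are illustrative only and are not the project's data.

```python
import pandas as pd

# Made-up rates per sqft, for illustration only.
rates = pd.Series(
    [4200, 4800, 5200, 5500, 5600, 6000, 6400, 7600, 12500, 20000],
    name="rate_per_sqft",
)

q1 = rates.quantile(0.25)   # lower edge of the box
q3 = rates.quantile(0.75)   # upper edge of the box
iqr = q3 - q1               # height of the box

# Whisker limits under the usual 1.5 * IQR rule.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Points outside the limits are drawn as individual outlier markers.
outliers = rates[(rates < lower_limit) | (rates > upper_limit)]

print(f"median={rates.median()}, mean={rates.mean()}")
print(f"limits=({lower_limit}, {upper_limit})")
print(outliers.tolist())
```

Printing both the median and the mean is useful here because the line inside a box plot is the median, which can differ noticeably from the mean when large outliers are present.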
Visualization of rate per sqft (histogram)
This histogram shows that most of the houses have a rate of around 7600 per sqft.
7600 rate per sqft houses
are around .
BIVARIATE ANALYSIS
• BAR PLOT
• LINE PLOT
• SCATTER PLOT
• PAIR PLOT
Comparing price and property type
This bar plot is drawn between property type and the price of the houses.
It shows that 5 BHK villas have the highest price, around 700.
Comparing price and BHK
This line plot is drawn between BHK and price (in lakhs).
The line starts at 1 BHK and rises slightly up to 3 BHK.
The price increases with the number of rooms.
Comparing prices by sellers
This bar plot shows the prices by seller.
M Kartheek group has the highest prices.
Correlation between area and price
The graph is drawn between price and area.
This scatter plot shows that price and area are highly positively correlated.
Positive correlation: a relationship between two variables in which both variables move in the same direction.
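To put a number on "move in the same direction", one common measure is the Pearson correlation coefficient, which ranges from -1 to +1. The slides do not say which measure was used, so this is an assumed choice, and the area and price values below are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Made-up values: area in sqft and price in lakhs.
area = [1000, 1500, 2000, 3000, 6000]
price = [50, 80, 110, 170, 700]

r = pearson_r(area, price)
print(round(r, 3))
```

A value close to +1 indicates a strong positive correlation, a value close to -1 a strong negative one, and a value near 0 little linear relationship.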
MULTIVARIATE ANALYSIS
• PAIR PLOT
• BAR PLOT
• BOX PLOT
• RELATION PLOT
• HEAT MAP
Comparing the property type
This plot shows the comparison of property type across all columns.
It also shows the differences between apartments, villas, etc.
BHKs at different locations
This bar plot shows that Manikonda and Tellapur have the highest rate per sqft.
The graph is drawn between location and rate per sqft.
Prices at different locations
This box plot shows the maximum and minimum prices of the houses in each location.
The graph is drawn between location and price.
Relation plot
This plot shows the relation between area and rate per sqft of the houses.
Heat map
This heat map compares the values of price, BHK, area, and rate per sqft.
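A heat map over numeric columns is often built from their pairwise correlation matrix. Assuming that is what the heat map here shows (the slides do not state it), the matrix could be computed as in this sketch. The column names and values are made up, and the rate is derived using 1 lakh = 100,000.

```python
import pandas as pd

# Made-up rows, for illustration only.
houses = pd.DataFrame({
    "price_lakhs": [50, 80, 110, 170, 700],
    "bhk": [1, 2, 3, 3, 5],
    "area_sqft": [1000, 1500, 2000, 3000, 6000],
})

# Rate per sqft derived from price and area.
houses["rate_per_sqft"] = houses["price_lakhs"] * 100_000 / houses["area_sqft"]

# Pairwise Pearson correlations between all numeric columns.
corr = houses.corr()
print(corr.round(2))
```

The resulting matrix is symmetric with 1.0 on the diagonal. One option for drawing it is `seaborn.heatmap(corr, annot=True)`, which colors each cell by its correlation value.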
CONCLUSION
From the above analysis we can conclude that:
The largest number of houses are under construction.
Manikonda and Tellapur have the highest rate per sqft.
5 BHK villas have the highest price, around 700.
3 BHK houses and apartments are more common than the other BHK counts and property types.
Price and area are highly positively correlated.
The largest number of apartments are priced at around 170 lakhs.
M Kartheek group has the highest prices.