Ita5007 Da2

Uploaded by

Mśď ŃàŃdy

Download an unprocessed dataset and perform some data cleaning
processes using the Python / R techniques that you have learnt in the subject.

Dataset: House Price Dataset of Bengaluru City
1. Import Libraries: In Python, you'll typically use libraries like pandas, numpy,
and matplotlib/seaborn for data analysis and visualization. Start by importing
these libraries.
2. Load the Dataset: Use pandas to read the dataset into a DataFrame.
3. Explore the Data: Use functions like df.head(), df.info(), and df.describe() to
get an overview of the dataset, including missing values.
4. Remove unwanted attributes: Some attributes are not needed here, so they can
be removed.
5. Handle Missing Values: Identify and handle missing values using methods like
df.isnull(), df.dropna(), or df.fillna().
6. Feature Engineering: Create new features or modify existing ones to improve
model performance.
 Create a new column named bhk.
 Convert total_sqft to integer, since some of its values are stored as strings.

 Add another column named price per sqft.
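
The missing-value and feature-engineering steps above can be sketched roughly as follows. This is only an illustration under assumptions: the sample rows are made up, the column names size, bath and price are not named in the text, price is assumed to be in lakhs, and ranges such as "1200 - 1400" are replaced by their midpoint. The sketch also converts total_sqft to float rather than integer, so that midpoints are not truncated.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; "size", "bath" and "price" are assumed column names.
df = pd.DataFrame({
    "location": ["A", "A", "B", "B", None],
    "size": ["2 BHK", "3 Bedroom", "4 BHK", None, "1 BHK"],
    "total_sqft": ["1056", "1200 - 1400", "34.46Sq. Meter", "900", "600"],
    "bath": [2.0, 3.0, 4.0, 1.0, 1.0],
    "price": [39.07, 95.0, 120.0, 40.0, 25.0],
})

# Step 5: drop rows that contain any missing value.
df = df.dropna().copy()

# Step 6: bhk is taken as the leading integer of the size string.
df["bhk"] = df["size"].str.split(" ").str[0].astype(int)


def sqft_to_number(value):
    """Convert a total_sqft string to float; a range becomes its midpoint."""
    parts = str(value).split(" - ")
    try:
        if len(parts) == 2:
            return (float(parts[0]) + float(parts[1])) / 2
        return float(value)
    except ValueError:
        # Values that cannot be parsed (for example other units) become NaN.
        return np.nan


df["total_sqft"] = df["total_sqft"].apply(sqft_to_number)
df = df.dropna(subset=["total_sqft"]).copy()

# Assumption: price is in lakhs, so multiply by 100000 to get rupees.
df["price_per_sqft"] = df["price"] * 100000 / df["total_sqft"]
```

Rows whose total_sqft cannot be parsed are dropped here; converting units instead would be another reasonable choice.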

7. Handling Outliers: Identify and handle outliers using statistical methods or
visualization.
 Remove the locations whose sample count (location stats) is less than 10.
 Remove the samples that are considered outliers: we assume approximately 300
square ft per bedroom and remove the samples that do not satisfy this rule.
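
A minimal sketch of these two filters, assuming the DataFrame has columns named location, total_sqft and bhk. The function names and the demo rows are hypothetical, and "less than 10" is read here as a location having fewer than 10 rows.

```python
import pandas as pd


def drop_rare_locations(df, min_count=10):
    """Keep only rows whose location appears at least min_count times."""
    counts = df["location"].value_counts()
    keep = counts[counts >= min_count].index
    return df[df["location"].isin(keep)]


def drop_small_rooms(df, min_sqft_per_bhk=300):
    """Keep only rows with at least min_sqft_per_bhk square feet per bedroom."""
    return df[df["total_sqft"] / df["bhk"] >= min_sqft_per_bhk]


# Hypothetical demo data: location "Y" is rare, and one "X" row is too small.
demo = pd.DataFrame({
    "location": ["X"] * 10 + ["Y"] * 2,
    "total_sqft": [1000.0] * 9 + [500.0] + [1200.0] * 2,
    "bhk": [2] * 9 + [3] + [2] * 2,
})
cleaned = drop_small_rooms(drop_rare_locations(demo))
```

Grouping rare locations under a single label instead of dropping them is an alternative that keeps more rows.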

 For every location, we find the mean and standard deviation and remove
anything that lies more than one standard deviation below the mean or more than
one standard deviation above it. In this dataset, there are many samples
in which the price of a 3 bedroom apartment is lower than that of a 2 bedroom
apartment with similar square feet. There may be many reasons for that, but such
outliers can affect the accuracy of the prediction. So we calculate the mean,
standard deviation and count for one and two bedroom apartments and filter out the
two bedroom apartments whose price is less than the mean of the one bedroom apartments.

Before and after outlier removal in RAJAJI NAGAR.
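
The two statistical filters described above might look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the comparison is done on a price_per_sqft column by default (the text does not name the column), the bedroom rule is generalized to compare each bhk value with the one below it, and the count check mentioned in the text is left out for brevity.

```python
import pandas as pd


def remove_pps_outliers(df, column="price_per_sqft"):
    """Keep rows within one standard deviation of their location's mean."""
    grouped = df.groupby("location")[column]
    mean = grouped.transform("mean")
    std = grouped.transform("std")  # sample standard deviation; NaN for one-row groups
    mask = (df[column] >= mean - std) & (df[column] <= mean + std)
    return df[mask]


def remove_bhk_outliers(df, column="price_per_sqft"):
    """Drop n-bedroom rows priced below the mean of (n-1)-bedroom rows
    in the same location."""
    means = (
        df.groupby(["location", "bhk"])[column]
        .mean()
        .rename("prev_mean")
        .reset_index()
    )
    # Shift bhk up by one so each row is matched with the mean of bhk - 1.
    means["bhk"] = means["bhk"] + 1
    merged = df.merge(means, on=["location", "bhk"], how="left")
    keep = merged["prev_mean"].isna() | (merged[column] >= merged["prev_mean"])
    return merged[keep].drop(columns="prev_mean")


# Hypothetical demo: the 2 bhk row at 4000 is below the 1 bhk mean of 5100.
demo = pd.DataFrame({
    "location": ["R"] * 4,
    "bhk": [1, 1, 2, 2],
    "price_per_sqft": [5000.0, 5200.0, 4000.0, 6000.0],
})
cleaned = remove_bhk_outliers(demo)
```

Note that remove_pps_outliers drops locations that have a single row, because their standard deviation is undefined.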

 The majority of the prices lie between 0 and 10000, and the distribution is
approximately normal.
In this dataset, there are some samples which have more than 10 bathrooms. If a sample
has a matching number of bedrooms, the bathroom count is plausible and it is not an outlier.
So, if the number of bathrooms exceeds the number of bedrooms by 2 or more, the sample is
considered an outlier and removed from the dataset.
Outliers of bathroom features
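
The bathroom rule above can be expressed as a one-line filter. This is a minimal sketch; the column name bath and the demo rows are assumptions.

```python
import pandas as pd


def remove_bath_outliers(df):
    """Drop rows where bathrooms exceed bedrooms by 2 or more."""
    return df[df["bath"] < df["bhk"] + 2]


# Hypothetical demo: the middle row has 5 bathrooms for 3 bedrooms.
demo = pd.DataFrame({"bhk": [2, 3, 4], "bath": [2, 5, 5]})
cleaned = remove_bath_outliers(demo)
```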
8. Visualize the cleaned data: Here is one example of a visualization of
the data.

9. After that, build a machine learning model that is suited to this kind of data.
For this data, random forest or linear regression can be used.
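
As a rough illustration of this last step, the sketch below fits both model types with scikit-learn. The data is synthetic and generated only for the example, and the feature names are assumptions. On the real dataset the location column would first need to be encoded as numbers.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned dataset (not real Bengaluru data).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "total_sqft": rng.uniform(500, 3000, n),
    "bath": rng.integers(1, 5, n),
    "bhk": rng.integers(1, 5, n),
})
y = 0.05 * X["total_sqft"] + 5 * X["bhk"] + rng.normal(0, 1, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit each model and record its R^2 score on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
```

Comparing the held-out scores is one simple way to choose between the two models.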
