International Journal of Informatics and Information System ISSN 2579-7069
vol. 3, No. 2, September 2020, pp. 54-59
Property Rental Price Prediction Using the Extreme Gradient Boosting
Algorithm
Marco Febriadi Kokasih 1,*, Adi Suryaputra Paramita 2
1,2 Program Studi Sistem Informasi, Universitas Ciputra, Indonesia
1 [email protected], 2 [email protected]
* corresponding author
(Received August 28, 2020; Revised September 16, 2020; Accepted September 19, 2020; Available online September 30, 2020)
Abstract
Online marketplaces in the field of property rental, such as Airbnb, are growing, and many property owners have begun
renting out their properties to meet this demand. Determining a price that is fair to both property owners and tourists is a
challenge. Therefore, this study aims to create software that builds a prediction model for property rental prices. The
variables used in this study are listing features, neighbourhood, reviews, date and host information. The prediction model
is created from the dataset supplied by the user, processed with the Extreme Gradient Boosting algorithm, and then stored
in the system. The result of this study is expected to give property owners and tourists a price reference when considering
renting a property. In conclusion, the Extreme Gradient Boosting algorithm is able to predict property rental prices with
an average RMSE of 10.86, or 13.30%.
Keywords: Rental Price; Prediction Model; Extreme Gradient Boosting; XGBoost.
1. Introduction
The online marketplace in the property rental sector is growing. One such platform is Airbnb, which as of 2019 had more
than 150 million users, 650 thousand property owners and more than 6 million registered properties [1]. With these figures,
Airbnb has managed to attract attention not only from tourists, as an alternative lodging place, but also from property
owners, as a source of additional income [12].
The more users there are on sites like Airbnb, the more factors there are to consider when pricing the properties on offer
[12]. Therefore, determining competitive rental rates is a challenging problem [9]. This study aims to create software that
predicts property rental prices based on a given dataset. In this study, the data used to build the model came from the
Inside Airbnb project, a project that collects data from the Airbnb site. The variables used in this research are house
features, neighbourhood, reviews, date and property owner information, following the variables in the data source. The
prediction model is then built with the Extreme Gradient Boosting algorithm. The authors chose this algorithm because it
has proven its ability to win various competitions [3].
2. Theoretical basis
2.1. Data Mining
Data mining is the step used to extract knowledge from a database, a process also known as Knowledge Discovery in
Databases [5]. Knowledge in databases is found after going through data cleaning, data integration, data selection, data
transformation and data mining [4].
2.2. Data Collection
Data collection is the process of gathering data and measuring information about the variables in question in a
systematic, well-established way that allows one to answer the stated questions, test hypotheses and evaluate the
results [11].
2.3. Data Preprocessing
Data preprocessing is needed to resolve problems in the data before processing, such as missing data, data type
errors and inconsistent data [10].
a. Correlation Analysis
Correlation analysis is a statistical analysis technique used to find the relationship between two variables [6]. In this
study, correlation analysis is used for the feature selection process. Feature selection is a technique used to reduce data
dimensions by selecting relevant features for better learning performance [13].
There are several ways to analyze correlation, namely the Pearson, Spearman and Kendall methods. Pearson
correlation measures the correlation between two continuous variables. Spearman correlation assesses the monotonic
relationship between two variables of ordinal data [15], while Kendall's correlation likewise measures the correlation
between two ordinal variables, based on concordant and discordant pairs [6].
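As an illustration, all three measures are available in pandas through the corr method; the following is a minimal sketch with made-up values (the column names are hypothetical, not taken from the study's dataset):

import pandas as pd

# Toy data; in the study these would be listing features and price
df = pd.DataFrame({"price": [105, 66, 45, 61, 70],
                   "accommodates": [4, 2, 1, 2, 3]})

# Pearson for continuous variables, Spearman/Kendall for ordinal data
for method in ("pearson", "spearman", "kendall"):
    print(method, df["price"].corr(df["accommodates"], method=method))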
b. Data Cleaning
The definition of data cleaning is the process of preparing and selecting existing data through analysis and
processing steps that can affect the results [10]. The data cleaning process carried out in this study is as follows
(a minimal pandas sketch is given after the list).
1. Change data types on features.
2. Fill in empty values with an appropriate value.
3. Delete data that differs too much from the other data (outliers).
4. Remove features that are not relevant to the machine learning process.
5. Convert boolean values into binary numbers.
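The sketch below illustrates the five steps; the column names, the fill values and the outlier rule (keeping prices within the 1st-99th percentile range) are illustrative assumptions, since the paper does not fix them:

import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    # 1. Change data types on features
    df["price"] = df["price"].astype(float)
    # 2. Fill empty values with an appropriate value (here: the column mean)
    df["bathrooms"] = df["bathrooms"].fillna(df["bathrooms"].mean())
    # 3. Delete rows whose price is too different from the rest (outliers)
    low, high = df["price"].quantile([0.01, 0.99])
    df = df[df["price"].between(low, high)]
    # 4. Remove features that are not relevant to machine learning
    df = df.drop(columns=["id", "listing_url"], errors="ignore")
    # 5. Convert boolean values into binary numbers
    df["host_is_superhost"] = df["host_is_superhost"].astype(int)
    return df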
c. Data Aggregation
Data aggregation is a process in which raw data is collected and summarized in the form of statistical analysis
[8].
d. Data Standardization
Data standardization is the process of giving standard values to features or attributes so that the data does not
interfere with the machine learning process [7].
2.4. Extreme Gradient Boosting
Extreme Gradient Boosting, commonly known as XGBoost, is a development of the Gradient Boosting algorithm
that is more efficient and scalable [2]. Gradient boosting, or gradient boosted trees, is one of the algorithms used
to solve supervised learning problems, where training data is used to predict a target variable [3].
XGBoost provides linear model algorithms and tree learning which are efficient and capable of producing
predictive models [2].
The basic model of XGBoost is a decision tree ensemble, an algorithm that combines a number of regression
trees. The algorithm works by summing the output of each tree for every sample, which gives the mathematical model

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F},$$

where $K$ is the number of trees, $f_k$ is a function in the functional space $\mathcal{F}$, and $\mathcal{F}$ is the
set of all possible regression trees [3].
This yields a general objective function that needs to be optimized:

$$\mathcal{L}(\phi) = \sum_i l(\hat{y}_i, y_i) + \sum_k \Omega(f_k),$$

$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2,$$

where $l$ is the training loss, $T$ is the number of leaves in a tree and $w$ are the leaf weights, so that $\Omega$
penalizes model complexity [3].
In the Extreme Gradient Boosting algorithm, feature importance terms are obtained from the importance value of
each feature. The more often a feature is used to make decisions in a decision tree, the higher its value. The main
feature importance calculation is the weight, which indicates how important a feature is in creating a new branch.
The importance of a feature also changes based on how far the predictions change if the feature is replaced by
another feature [14].
3. Method
The research begins with the data collection stage, in which the data that will be used for the research is gathered.
The data used came from the Inside Airbnb project, namely the Singapore Airbnb data collected on September 25,
2019. Based on the variables used in this study, the initial data was taken from the listings, listings-details and
reviews-details datasets. The data is combined into one dataset after a first feature selection pass. The variables
contained in the dataset are property features, neighborhood, reviews, date and property owner information.
The data goes through a data preprocessing stage in which it is prepared for the machine learning process. The
first step is to combine the datasets into one through a merge process. The merged data has 107 features and
104,725 rows. Several features that have no effect on the prediction, such as id, listing_id and reviewer_id, are
deleted; after deletion the data has 53 features. In addition, one feature can be divided into several features: from
a random selection of the amenities values, which total 85,000, new features are created, namely
Laptop_friendly_workspace, TV, Microwave, Dishes_and_Silverware, Hot_water and Family_kid_friendly.
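A sketch of this merge-and-expand step, assuming the Inside Airbnb file names and join keys (the exact keys are not stated in the paper):

import pandas as pd

# Combine the three Inside Airbnb files into a single dataset
listings = pd.read_csv("listings.csv")
details = pd.read_csv("listings-details.csv")
reviews = pd.read_csv("reviews-details.csv")
df = (listings.merge(details, on="id")
              .merge(reviews, left_on="id", right_on="listing_id"))

# Identifier columns carry no predictive signal
df = df.drop(columns=["id", "listing_id", "reviewer_id"])

# Expand selected amenities into their own binary features
for amenity in ["TV", "Microwave", "Hot water"]:
    df[amenity.replace(" ", "_")] = (
        df["amenities"].str.contains(amenity, regex=False).astype(int)
    )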
The feature selection stage selects the features from the data that will be used in the study. Features are selected
based on correlation: features with a correlation above 0.9 with another feature are removed, because they carry
nearly the same information as that feature (a minimal sketch of this step follows). The final result of the dataset
is 47 features.
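One way to implement this threshold rule in pandas (a sketch; the paper does not give the exact procedure):

import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop one feature from each pair correlated above the threshold."""
    corr = df.corr(numeric_only=True).abs()
    # Keep only the upper triangle so each feature pair is inspected once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)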
Data cleaning is the process of removing or replacing values from data that are null or outliers. In the dataset, null
values come from features that describe reviews as well as prices; these nulls are filled with a value of 0, and
other features with null values are filled with the feature's average. In addition to filling in null data, the
comments feature is replaced with sentiment polarity values, all data types are converted to float, and the date
feature is separated into day, month and year.
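A sketch of these cleaning rules; the sentiment library (TextBlob here) and the exact columns are assumptions, as the paper does not name them:

import pandas as pd
from textblob import TextBlob  # one possible polarity scorer; not named in the paper

# Review- and price-related nulls become 0; other nulls get the feature mean
df["review_scores_rating"] = df["review_scores_rating"].fillna(0)
df["bathrooms"] = df["bathrooms"].fillna(df["bathrooms"].mean())

# Replace free-text comments with a polarity score in [-1, 1]
df["comments"] = df["comments"].fillna("").map(
    lambda t: TextBlob(t).sentiment.polarity)

# Separate the date feature into day, month and year
date = pd.to_datetime(df["date"])
df["day"], df["month"], df["year"] = date.dt.day, date.dt.month, date.dt.year
df = df.drop(columns=["date"])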
Data standardization is the process of assigning a standard value to each feature of the dataset. In the dataset, some
data must first be converted. Categorical data is converted into numbers using the Label Encoding method, and the
values are then separated into new features using the One Hot Encoding method. Label Encoding is a method used
to convert categorical data into integers, while One Hot Encoding is used to separate categorical data with nominal
properties into distinct features based on their values.
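A minimal illustration of the two encodings (pd.get_dummies stands in for One Hot Encoding; the room_type values are examples, not the study's categories):

import pandas as pd
from sklearn.preprocessing import LabelEncoder

rooms = pd.DataFrame({"room_type": ["Entire home", "Private room", "Shared room"]})

# Label Encoding: every category becomes an integer
rooms["room_type_label"] = LabelEncoder().fit_transform(rooms["room_type"])

# One Hot Encoding: every nominal value becomes its own binary feature
rooms = pd.get_dummies(rooms, columns=["room_type"], dtype=int)
print(rooms)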
Data aggregation is the process of converting raw data into a summary form that can be used for further data
analysis. The data is analyzed to understand its contents in more detail. Then the feature importance function of
XGBoost is used to find out which features contribute most to the rental price prediction based on the data
provided.
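In the xgboost library this can be done with plot_importance (a sketch, assuming a fitted regressor named model as built later in this section):

import matplotlib.pyplot as plt
from xgboost import plot_importance

# Rank features by "weight": how often each one is used to split the data
plot_importance(model, max_num_features=10, importance_type="weight")
plt.show()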
The prepared data is divided into two dataframes, X and Y, which contain the independent variables and the
dependent variable respectively. The two dataframes are then divided into train data and test data with a ratio of
80:20 using the train_test_split function.
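The corresponding scikit-learn call (the random_state seed is our choice, added only for reproducibility):

from sklearn.model_selection import train_test_split

X = df.drop(columns=["price"])  # independent variables
y = df["price"]                 # dependent variable

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # 80:20 split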
The next step is to create an XGBoost instance and prepare the fixed parameters as well as the parameters that will
go through the hyper-parameter tuning process. The fixed parameters are early_stopping_rounds, set to 10, the
eval_metric, set to RMSE, and the test dataset used for evaluation. The parameters searched through hyper-parameter
tuning are learning_rate, max_depth, gamma, colsample_bytree and n_estimators. Hyper-parameter tuning is
performed with the RandomizedSearchCV function, and the results replace the originally created model. The model
filled with the tuned parameters then goes through the fitting and training process.
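A sketch of this tuning step in the xgboost 1.x API; the search ranges and n_iter are assumptions, since the paper lists only the parameter names:

from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBRegressor

param_distributions = {            # ranges are illustrative, not the paper's
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [3, 5, 7, 9],
    "gamma": [0, 0.1, 0.5, 1],
    "colsample_bytree": [0.5, 0.7, 1.0],
    "n_estimators": [100, 300, 500, 1000],
}
search = RandomizedSearchCV(XGBRegressor(objective="reg:squarederror"),
                            param_distributions, n_iter=25,
                            scoring="neg_root_mean_squared_error", cv=3)
search.fit(X_train, y_train)

# Refit the best configuration with the fixed parameters from the paper
model = XGBRegressor(objective="reg:squarederror", **search.best_params_)
model.fit(X_train, y_train, eval_set=[(X_test, y_test)],
          eval_metric="rmse", early_stopping_rounds=10, verbose=False)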
To generate a score for the model, 10-fold cross validation is performed using the KFold function. The resulting
model is then saved as a pickle file along with the resulting decision tree. RMSE is used to assess the model; in
this study, the average RMSE value is expected not to exceed 25.
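A sketch of the scoring and persistence steps (scikit-learn reports RMSE as a negative score, hence the sign flip):

import pickle
from sklearn.model_selection import KFold, cross_val_score

kfold = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kfold,
                         scoring="neg_root_mean_squared_error")
print("average RMSE:", -scores.mean())

# Persist the trained model as a pickle file
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)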
Based on the results of implementing the XGBoost algorithm on the Airbnb Singapore dataset, a score of 0.94 was
obtained. This value is obtained after the model goes through the cross-validation process between the train and
test data. From this model, the 10 features with the highest Feature Importance values under the weight calculation
can be visualized, as shown in Figure 1.
[Figure 1 is a bar chart of F score per feature. The top 10 features by weight are number_of_reviews,
reviews_per_month, availability_365, minimum_nights, accommodates, availability_30,
calculated_host_listings_count, cleaning_fee, extra_people and review_scores_rating.]
Fig. 1. Feature Importance (Top 10)
The model that has been created is implemented in a website with two features, Predict and Add Data. The Predict
feature predicts property rental prices based on a predetermined set of inputs; the prediction model used is chosen
based on the location of the property. The Add Data feature gives users the ability to upload their own dataset and
obtain a prediction model built from it. The flows of the Predict and Add Data features can be seen in Figures 2
and 3.
[Figure 2 flow: select the property location, input the property information and press submit, then the prediction
results are shown.]
Fig. 2. Predict feature usage flow
[Figure 3 flow: enter the dataset and the name of the city in the dataset, press submit, then the model results are
shown.]
Fig. 3. Add Data feature usage flow
4. Results
Based on the model that has been created, tests were carried out to predict rental prices using data taken from the
testing dataset. The test results can be seen in Table 1.
Table 1. Test Results

Test Number   Actual Price   Predicted Price   RMSE
Test 1        105            101               3.62  (3.45%)
Test 2        66             73                6.99  (10.59%)
Test 3        45             46                1.37  (3.04%)
Test 4        61             66                4.77  (7.82%)
Test 5        70             99                28.84 (41.20%)
Test 6        200            205               5.08  (2.54%)
Test 7        217            215               2.15  (0.99%)
Test 8        85             118               32.72 (38.49%)
Test 9        184            189               5.59  (3.04%)
Test 10       80             63                17.43 (21.79%)
5. Conclusion
In this paper, we have presented a property rental price prediction model. The Extreme Gradient Boosting
algorithm is able to predict property rental prices with an average RMSE of 10.86, or 13.30%. The highest RMSE
is 38.49%, on test 8, and the lowest is 0.99%, on test 7.
References
[1] G. Zervas, D. Proserpio, and J. Byers, “A First Look at Online Reputation on Airbnb, Where Every Stay is Above
Average,” SSRN Electron. J., pp. 1-22, 2018, doi: 10.2139/ssrn.2554500.
[2] T. Chen and T. He, “xgboost: Extreme Gradient Boosting,” R package vignette, pp. 1-84, 2014.
[3] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” Proc. ACM SIGKDD Int. Conf. Knowl.
Discov. Data Min., vol. 13-17-August-2016, pp. 785-794, 2016, doi: 10.1145/2939672.2939785.
[4] L. Markusheski, I. Zdravkoski, and M. Andonovski, “Data Mining Process,” Ibaness Congr. Ser. Econ. Bus.
Manag., pp. 71-79, 2019. [Online]. Available: https://www.researchgate.net/publication/332876172.
[5] T. Hendrickx, B. Cule, P. Meysman, S. Naulaerts, K. Laukens, and B. Goethals, “Mining association rules in graphs
based on frequent cohesive itemsets,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect.
Notes Bioinformatics), vol. 9078, pp. 637-648, 2015, doi: 10.1007/978-3-319-18032-8_50.
[6] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, “Canonical correlation analysis: An overview with application to
learning methods,” Neural Comput., vol. 16, no. 12, pp. 2639-2664, 2004. [Online]. Available:
http://www.ncbi.nlm.nih.gov/pubmed/15516276.
[7] M.S. Gal and D. L. Rubinfeld, “Data standardization,” New York Univ. Law Rev., vol. 94, no. 4, pp. 737-770, 2019,
doi: 10.2139/ssrn.3326377.
[8] P. Jesus, C. Baquero, and P. S. Almeida, “A Survey of Distributed Data Aggregation Algorithms,” IEEE Commun.
Surv. Tutorials, vol. 17, no. 1, pp. 381-404, 2015, doi: 10.1109/COMST.2014.2354398.
[9] P. R. Kalehbasti, L. Nikolenko, and H. Rezaei, “Airbnb Price Prediction Using Machine Learning and Sentiment
Analysis,” 2019, [Online]. Available: http://arxiv.org/abs/1907.12665.
[10] S. B. Kotsiantis and D. Kanellopoulos, “Data preprocessing for supervised leaning,” Int. J. Comput. Sci., vol. 1,
no. 2, pp. 1-7, 2006.
[11] R. B. Davis, S. Ounpuu, D. Tyburski, and J. R. Gage, “A gait analysis data collection and reduction technique,”
Human Movement Science, vol. 10, pp. 575-597, 1991.
[12]E. Tang and K. Sangani, “Neighborhood and Price Prediction for San Francisco Airbnb Listings,” 2015.
[13] C. C. Aggarwal, X. Kong, Q. Gu, J. Han, and P. S. Yu, “Active learning: A survey,” Data Classification:
Algorithms and Applications, pp. 571-605, 2014, doi: 10.1201/b17320.
[14] H. Zheng, J. Yuan, and L. Chen, “Short-Term Load Forecasting Using EMD-LSTM neural networks with a xgboost
algorithm for feature importance evaluation,” Energies, vol. 10, no. 8, 2017, doi: 10.3390/en10081168.
[15]N. H. Trang, “Limitations of Big Data Partitions Technology,” J. Appl. Data Sci., vol. 1, no. 1, pp. 11-19, 2020.