
Catena 162 (2018) 177–192

Contents lists available at ScienceDirect

Catena
journal homepage: www.elsevier.com/locate/catena

Prediction of the landslide susceptibility: Which algorithm, which precision?

Hamid Reza Pourghasemi a,⁎, Omid Rahmati b
a Department of Natural Resources and Environmental Engineering, College of Agriculture, Shiraz University, Shiraz, Iran
b Department of Watershed Management, Faculty of Agriculture and Natural Resources, Lorestan University, Lorestan, Iran

A R T I C L E I N F O

Keywords:
Machine learning
Landslide spatial modeling
Variables importance
GIS
Iran

A B S T R A C T

Coupling machine learning algorithms with spatial analytical techniques for landslide susceptibility modeling is an issue worth considering. The current research therefore intends to present the first comprehensive comparison of the performance of ten advanced machine learning techniques (MLTs), namely artificial neural networks (ANNs), boosted regression tree (BRT), classification and regression trees (CART), generalized linear model (GLM), generalized additive model (GAM), multivariate adaptive regression splines (MARS), naïve Bayes (NB), quadratic discriminant analysis (QDA), random forest (RF), and support vector machines (SVM), for modeling landslide susceptibility and evaluating the importance of variables in GIS and R open source software. This study was carried out in the Ghaemshahr Region, Iran. The performance of the MLTs was evaluated using the area under the ROC curve (AUC-ROC) approach. The results showed that the AUC values for the ten MLTs vary from 62.4 to 83.7%. RF (AUC = 83.7%) and BRT (AUC = 80.7%) performed best in comparison to the other MLTs.

1. Introduction

Landslides are natural earth surface processes and an example of land degradation. They change landscape features, reduce the physical extent of the soil of ecosystems (http://www.ccma.vic.gov.au/soilhealth/resource/definitions.htm), cause erosion, sediment yield, and loss of soil resources (Montanarella, 2003; Keesstra et al., 2015), and subsequently damage houses, basic infrastructure, agricultural land, the economy, and human welfare (Corominas et al., 2014; Papathoma-Köhle et al., 2015; Schilirò et al., 2016). Because of the destructive impacts of landslides and their consequences, research institutions and governments have long attempted to delineate landslide-susceptible areas to improve disaster preparedness and damage prevention (An et al., 2016; Betts et al., 2017). Hence, identifying landslide-prone areas not only provides insight into the control of land degradation but can also form a basis for safer strategic planning of future development activities in the region (Atkinson and Massari, 1998; Mertens et al., 2016). This process aims to highlight the spatial distribution of potentially unstable slopes by linking past landslide events to the landslide-causing variables responsible for the occurrence of landslides in a region (Hong et al., 2015).

In landslide susceptibility modeling, quantitative approaches are grouped into physically based models, statistics-based correlation analysis, and soft computing techniques. Physically based models (Thanh and De Smedt, 2014) require detailed, site-specific geotechnical and geological data on slope failure at the regional scale (Tien Bui et al., 2016). These models are quite expensive and not practical for large areas (van Westen and Terlien, 1996). Traditional statistical models, which assume an appropriate structural model and then focus on parameterizing it, are widely used for analyzing natural hazards such as landslides (Yesilnacar and Topal, 2005; Pourghasemi and Kerle, 2016). The classification of each landslide conditioning factor in traditional statistical models is a key point that affects the quality of the landslide susceptibility map (Costanzo et al., 2012) and has been discussed in depth in Chacon et al. (2006). In contrast, machine learning techniques, a powerful group of data-driven tools, use algorithms to learn the relationship between landslide occurrence and landslide-related predictors and avoid starting with an assumed structural model (Elith et al., 2008; Dickson and Perry, 2016). Romer and Ferentinou (2016) stated that statistical methods require large amounts of data to obtain more reliable results, whereas ML-based models can effectively overcome this limitation of data-dependent bivariate and multivariate statistical methods (Pham et al., 2016). Further advantages of these models are that no statistical assumptions are made and that the nonlinear character of landslides is taken into account (Ferentinou and Chalkias, 2013; Zare et al., 2013; Kornejady et al., 2017a). These MLTs can handle data from various measurement scales and any type of independent variable (i.e. ratio, interval, nominal, or ordinal), without needing to define


⁎ Corresponding author.
E-mail address: [email protected] (H.R. Pourghasemi).

https://doi.org/10.1016/j.catena.2017.11.022
Received 30 May 2017; Received in revised form 31 October 2017; Accepted 27 November 2017
Available online 06 December 2017
0341-8162/ © 2017 Elsevier B.V. All rights reserved.
H.R. Pourghasemi, O. Rahmati Catena 162 (2018) 177–192

Table 1
MLTs adopted in several previous studies on landslide susceptibility mapping.

Technique | Examples of previous studies
Decision tree (DT) | Saito et al. (2009), Nefeslioglu et al. (2010), Pradhan (2013), Tien Bui et al. (2016), Kavzoglu et al. (2014a, 2014b), Wu et al. (2014)
AdaBoost (AB) | Micheletti et al. (2014)
Generalized additive model (GAM) | Brenning (2008), Goetz et al. (2011), Vorpah et al. (2012), Goetz et al. (2015)
Artificial neural networks (ANN) | Lee et al. (2004), Yesilnacar and Topal (2005), Pradhan and Lee (2010), Yilmaz (2010), Zare et al. (2013), Thai Pham et al. (2015), Arnone et al. (2016)
Adaptive Neuro-Fuzzy Inference System (ANFIS) | Pradhan et al. (2010), Oh and Pradhan (2011), Sezer et al. (2011), Nasiri Aghdam et al. (2016)
Random forest (RF) | Trigila et al. (2015), Pourghasemi and Kerle (2016), Hong et al. (2016b), Youssef et al. (2016)
Classification and regression trees (CART) | Felicísimo et al. (2013), Vorpah et al. (2012)
Multivariate adaptive regression splines (MARS) | Vorpah et al. (2012), Felicísimo et al. (2013), Conoscenti et al. (2015)
Boosted regression tree (BRT) | Dickson and Perry (2016), Youssef et al. (2016)
Maximum entropy (MaxEnt) | Felicísimo et al. (2013), Park (2015), Hong et al. (2016a), Kornejady et al. (2017)
Naive Bayes (NB) | Tien Bui et al. (2012), Thai Pham et al. (2015), Tsangaratos and Ilia (2016)
Support vector machine (SVM) | Yao et al. (2008), Yilmaz (2010), Marjanović et al. (2011), Micheletti (2011), Tien Bui et al. (2012), Pourghasemi et al. (2013), Hong et al. (2015)
Generalized linear model (GLM) | Ayalew et al. (2004), Brenning (2005), Vorpah et al. (2012), Youssef et al. (2016)
Quadratic discriminant analysis (QDA) | Rossi et al. (2010)

normally distributed transformed variables (Ferentinou and Chalkias, 2013; Zare et al., 2013; Hong et al., 2016b). Table 1 summarizes prior literature on the machine learning techniques most used for landslide susceptibility modeling. However, a comparative study of landslide susceptibility maps produced by ANN, BRT, CART, GLM, MARS, NB, QDA, RF, SVM, and GAM has not been commonly encountered in the literature. For this reason, a comparison among these relatively new approaches is needed to estimate spatial landslide susceptibility and to select the best model for regional analyses.

Landslides, as a driving force of land degradation, are one of the major geo-environmental issues in humid areas, including Mazandaran Province in northern Iran (Zare et al., 2013). Mazandaran Province faces significant landslide hazard and the highest landslide activity because of its steep topography, wet climate, and high weathering rates (Choobbasti et al., 2009; Vahidnia et al., 2010; Pourghasemi and Kerle, 2016). This study gives a comprehensive comparison of the performance of ten state-of-the-art machine learning models, focusing on their main distinctive characteristics. For this purpose, the Ghaemshahr Region (in Mazandaran Province) was selected as the study area. According to the literature, machine learning models are still new in landslide susceptibility assessment compared to other methods. The literature review showed that a further comparative study among different MLTs is needed to better understand the issue, develop innovative technologies, assess the effectiveness of models, and otherwise help improve the prediction of landslide susceptibility and the mitigation of land degradation (Rossi et al., 2010; Goetz et al., 2015; Youssef et al., 2016). Our main objectives are to: (1) investigate the predictive performance of ten machine learning (ML)-based techniques, namely ANN, BRT, CART, GLM, GAM, MARS, NB, QDA, RF, and SVM; (2) compare the accuracy of the models using the receiver operating characteristic (ROC) curve method to select the best technique for regional landslide susceptibility modeling in the study area; and (3) analyze the relative importance of landslide conditioning factors using the LVQ method. This understanding is fundamental for producing accurate and reliable landslide susceptibility maps and allows more robust conclusions to be drawn about the capability of different MLTs. In addition, the results of this study are a valuable aid to local authorities in sustainable land management and in combating land degradation, and they add to the understanding of the relation between geo-environmental variables and the distribution of landslides.

2. Material and methods

2.1. Study area

Mazandaran Province is situated on the southern coast of the Caspian Sea (i.e. the largest lake in the world). The study area (Ghaemshahr Region) is located in the eastern part of Mazandaran Province, Iran, between latitudes 35° 54′ and 36° 26′ N and longitudes 52° 36′ and 53° 15′ E (Fig. 1). The climate of the study area is humid, characterized by warm summers and mild winters (Rodionov, 1994), with rainfall varying between 600 and 940 mm and a mean annual rainfall of 729 mm (Water Resources Company of Mazandaran (WRCM), 2015). It covers an area of about 2241 km² and its altitude ranges from 30 to 3810 m above sea level (a.s.l.), with an average of 1218.7 m a.s.l. In winter, the temperature ranges from −1.4 °C to 15.5 °C, while in summer it varies from 20.9 °C to 32.6 °C (Water Resources Company of Mazandaran (WRCM), 2015).

The population of the study area is about 300,000 inhabitants, based on population census data from the Iranian Statistical Institute (ISI) (2016). In 2016, the total area of cultivated land in the Ghaemshahr Region was around 264.5 km² (11.87%). The period of major deforestation was between 2006 and 2011, owing to illegal wood cutting, cultivation of crops, and residential use of fuel wood. These forests are highly susceptible to fire (especially during exceptionally dry years), which can strongly affect runoff, erosion, and landslide processes (Shafiei et al., 2010).

From a morphological viewpoint, the southern coasts of the Caspian Sea have been classified into five zones: Western Gilan, Central Gilan, Central Mazandaran, Western Mazandaran, and Golestan (Khoshravan, 1998, 2007). The study area has been identified as “Central Mazandaran”, most of whose stratigraphic units are of Miocene and Quaternary age and include marl, swamp and marsh deposits, and dark shale, which are susceptible to landslide occurrence (Choobbasti et al., 2009). Heavy rainfall often triggers landslides in the region, which are undoubtedly among the mightiest and most devastating forces of nature. Landslides are therefore a major geo-hazard in the study area, where numerous slope failures have occurred in the past decade and are highly likely to occur in the future owing to natural or anthropogenic causes. Most of them are shallow rotational landslides that occurred in the last 15 years along river incisions and/or failure surfaces. They have damaged infrastructure, buildings, and sources of livelihood. Meanwhile, the population, economic, and ecological pressures of the Ghaemshahr Region bring informal settlements and land degradation, due to overgrazing and deforestation, onto potentially dangerous slopes in the area. To give a better visual understanding of landslides in the study area, some field photographs are presented in Fig. 2.

2.2. Methodology

The methodological approach of this study consisted


Fig. 1. Location of the study area (Ghaemshahr Township, Northern Iran), and landslide training and validation groups.
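The random partition behind the training and validation groups shown in Fig. 1 (276 landslides split 70%/30% into 193 and 83 locations, with equal numbers of non-landslide points; see Section 2.2.1) can be illustrated with a short sketch. This is a minimal Python illustration only, not the authors' workflow, which relied on GIS and R. The function name `split_inventory`, the seeds, and the placeholder IDs are assumptions introduced here, and drawing the non-landslide points from a pool of 276 candidates is likewise assumed.

```python
import random


def split_inventory(ids, train_fraction=0.7, seed=0):
    """Shuffle ids reproducibly and cut them into (training, validation) lists."""
    rng = random.Random(seed)  # local generator keeps the split repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 276 mapped landslides (Section 2.2.1); the ID strings are placeholders.
landslide_ids = [f"LS{i:03d}" for i in range(1, 277)]
# Assumed pool of non-landslide points sampled from landslide-free terrain.
non_landslide_ids = [f"NL{i:03d}" for i in range(1, 277)]

train_ls, valid_ls = split_inventory(landslide_ids, seed=1)
train_nl, valid_nl = split_inventory(non_landslide_ids, seed=2)

# Label landslides 1 and non-landslides 0, as a binary classifier expects.
training_set = [(i, 1) for i in train_ls] + [(i, 0) for i in train_nl]
validation_set = [(i, 1) for i in valid_ls] + [(i, 0) for i in valid_nl]

print(len(train_ls), len(valid_ls), len(training_set), len(validation_set))
# prints: 193 83 386 166
```

Fixing the seed makes the partition reproducible, so that all ten models can be trained and validated on exactly the same points.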

of five main steps (Fig. 3):

i. Landslide inventory mapping and training samples;
ii. Preparation of landslide conditioning factor maps;
iii. Application of ten advanced MLTs to create landslide susceptibility maps;
iv. Determination of variable importance using the LVQ method;
v. Validation of the landslide susceptibility maps and selection of the best model in the study area.

These steps are explained in the following sections:

2.2.1. Landslide inventory mapping and training samples

The study area is affected by severe landslide events over a broad area. The landslide dataset provides useful information for the comparative analysis of the different susceptibility models. A landslide inventory map is therefore necessary for understanding the relationship between landslide events and the controlling parameters of landslides (Ercanoglu and Gokceoglu, 2004).

A landslide inventory database can be prepared using different


Fig. 2. Methodological flowchart of the study.

methods. The selection of a specific method depends on the purpose of the study, the skills and experience of the investigators, the scale of the base maps, the extent of the study area, the resolution and characteristics of the available imagery (e.g., satellite images, aerial photographs), and the resources available to complete the work. In this research, a detailed and reliable landslide inventory map was compiled from existing reports, 1:20,000-scale aerial photographs, and Google Earth images, and then verified by extensive field surveys. A total of 276 landslide events were identified in the study area. The identified landslides were of the shallow rotational type according to the Varnes classification system (1978). The minimum and maximum landslide areas were 150 and 285,500 m², and the landslides were triggered by rainfall events. For this analysis, the landslide inventory map was randomly divided (Lee et al., 2004) into two datasets: 70% (193 landslide locations) for training and 30% (83 landslide locations) for validation (Nefeslioglu et al., 2008b). Likewise, matching the training and validation datasets, 193 and 83 non-landslide locations, respectively, were randomly selected from landslide-free areas. The locations of the landslide datasets are shown in Fig. 1.

2.2.2. Preparation of landslide conditioning factor maps

For accurate prediction of landslide-prone areas, a landslide-related spatial database should be prepared. The landslide conditioning factors were chosen according to knowledge gained from the literature, the availability of data, and field surveys (Pradhan and Lee, 2010; Costanzo et al., 2012; Keesstra et al., 2016; Kornejady et al., 2017b). The selection of landslide conditioning factors depends on the characteristics of the study area, the landslide type, and the scale of the analysis (Tseng et al., 2015). However, there is no agreement on universal guidelines for selecting landslide conditioning factors (Pradhan and Lee, 2010). In the present study, the landslide conditioning factors were selected from among those most commonly used in the literature to evaluate landslide susceptibility (Pradhan and Lee, 2010; Pourghasemi et al., 2013; Pradhan, 2013; Tien Bui et al., 2016; Hong et al., 2015, 2016a; Pourghasemi and Kerle, 2016). According to previous studies, rainfall and earthquakes are external factors and temporal phenomena (i.e. sensitive to temporal changes), and they


Fig. 3. Landslide causal factors: (a) altitude, (b) drainage density, (c) slope aspect, (d) lithology units, (e) land use, and (f) slope degree (see Table 3 for legend of lithology map).

have been considered as triggering factors of landslides (Kanungo et al., 2006; Pradhan and Lee, 2010; Goetz et al., 2015). Precipitation data such as rainfall intensity and cumulative precipitation are usually applied to determine the thresholds of rainfall-induced landslides (Weng et al., 2011). However, past data on these external factors in relation to landslide occurrence are not available for the study area; therefore, these two factors were not included in the current study. Instead, the attributes of the ground (known as internal factors) were used to assess landslide susceptibility. These ground factors are in ordinal, nominal, or scale formats and comprise altitude, slope angle, slope aspect, slope-length, plan curvature, profile curvature, drainage density, distance from rivers, distance from faults, landuse, lithology, and distance from roads (Fig. 4(a–l)).

For landslide susceptibility modeling, the conditioning factors must be converted to grid datasets with the same resolution. Therefore, these twelve landslide conditioning factors were selected in the present


Fig. 4. Landslide susceptibility maps for each MLT: (a) ANN, (b) BRT, (c) CART, (d) GLM, (e) GAM, (f) MARS, (g) NB, (h) QDA, (i) RF, and (j) SVM.
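The susceptibility maps of the ten MLTs are compared by the area under the ROC curve (AUC). As a rough illustration of that metric, the sketch below computes AUC as the fraction of landslide/non-landslide validation pairs in which the landslide location receives the higher susceptibility score, counting ties as one half. It is a plain-Python sketch with made-up labels and scores, not the study's R code, and `auc_roc` is a hypothetical helper name.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 1 (landslide) or 0 (non-landslide)
    scores: predicted susceptibility for the same locations
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Illustrative values only: three landslide and three non-landslide locations.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auc_roc(labels, scores)  # 8 of the 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported range of 62.4 to 83.7% in context.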

study and were standardized to the same cell size (20 × 20 m) for further analyses. Contour lines and survey base points were extracted from topographic maps (1:25,000-scale), and a Digital Elevation Model (DEM) with a grid size of 20 × 20 m was created. Statistical properties of the numerical landslide conditioning factors are shown in Table 2.

2.2.2.1. Altitude. Altitude is one of the most important topographic factors affecting slope instability (Feizizadeh et al., 2014). It determines the spatial distribution of landslides. In addition, altitude plays an important role in precipitation properties and vegetation cover type. Hence, the altitude map of the study area was derived from the DEM layer. The altitude of the study area ranges from 30 to 3810 m above sea level (Fig. 4(a)).

2.2.2.2. Slope angle. Slope angle controls subsurface flow and affects the concentration of soil moisture, both of which are directly related to landslide occurrence (Magliulo et al., 2008). According to the literature, slope angle is therefore one of the most important factors in slope stability analysis and landslide susceptibility mapping. In the study area, most landslides occurred in unstable areas such as mountains and hilly ranges with steep slopes (Fig. 4(b)). For this reason, the slope angle map of the study area was extracted from the DEM.

2.2.2.3. Slope aspect. Slope aspect can influence hydrological processes and evapotranspiration and controls the concentration of soil moisture, vegetation, and root development; it therefore has an indirect bearing on landslide predisposition and susceptibility mapping (Neuhäuser et al., 2012). The DEM (20 × 20 m) was used to


Fig. 4. (continued)
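Two of the DEM-derived factors, the slope aspect class (Section 2.2.2.3) and the slope-length (LS) factor of Eq. (1) (Section 2.2.2.4), are simple per-cell calculations. The sketch below shows one way they could be written in Python; the study derived these layers in a GIS, so this is illustrative only. The names `aspect_class` and `ls_factor` are hypothetical, and two conventions are assumptions made here: class boundaries are treated as lower-inclusive, and aspects of 337.5° to 360° are assigned to north, a range the text does not list explicitly.

```python
import math

# Upper bounds (degrees) of the aspect classes given in Section 2.2.2.3.
ASPECT_CLASSES = [
    (22.5, "N"), (67.5, "NE"), (112.5, "E"), (157.5, "SE"),
    (202.5, "S"), (247.5, "SW"), (292.5, "W"), (337.5, "NW"),
]


def aspect_class(aspect_deg):
    """Map an aspect in degrees to one of the nine classes; negative means flat."""
    if aspect_deg < 0:
        return "F"  # flat cells are coded -1 in the source text
    a = aspect_deg % 360.0
    for upper, name in ASPECT_CLASSES:
        if a < upper:
            return name
    return "N"  # assumption: 337.5-360 degrees wraps around to north


def ls_factor(specific_catchment_area, slope_deg):
    """LS = (A_s / 22.13)^0.4 * (sin(beta) / 0.0896)^1.3, with beta in degrees."""
    beta = math.radians(slope_deg)
    return ((specific_catchment_area / 22.13) ** 0.4
            * (math.sin(beta) / 0.0896) ** 1.3)


# Example cell: aspect 200 degrees, A_s = 100 m^2, slope 17.65 degrees (the mean
# slope in Table 2); the catchment area value is made up for illustration.
example_class = aspect_class(200.0)
example_ls = ls_factor(100.0, 17.65)
```

Both constants in `ls_factor` act as normalizers: the factor equals 1 when A_s = 22.13 and sin β = 0.0896.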

calculate the slope aspect within the study area. The resulting aspect map was then reclassified into nine classes: north (N: 0–22.5°), north-east (NE: 22.5–67.5°), east (E: 67.5–112.5°), south-east (SE: 112.5–157.5°), south (S: 157.5–202.5°), south-west (SW: 202.5–247.5°), west (W: 247.5–292.5°), north-west (NW: 292.5–337.5°), and flat (F: −1) surfaces (Fig. 4(c)).

2.2.2.4. Slope-length. Besides slope steepness and slope aspect, another topographic factor is slope-length. Soil loss and hydrological processes are mainly affected by the combined impact of the slope-length and slope angle factors. The slope-length (LS) factor (Fig. 4(d)) was calculated by Eq. (1) (Moore and Burch, 1986):

$$ LS = \left(\frac{A_s}{22.13}\right)^{0.4} \left(\frac{\sin\beta}{0.0896}\right)^{1.3} \qquad (1) $$

where $A_s$ (m²) is the specific catchment area and $\beta$ is the slope angle in degrees.

Table 2
Statistical properties of landslide causal factors.

Factors | Min. | Max. | Mean | St. deviation
DEM | 30 | 3810 | 1218.72 | 759.35
Slope angle | 0 | 73.13 | 17.65 | 8.95
Drainage density | 0 | 19.10 | 2.62 | 2.94
Distance from faults | 0 | 8547.85 | 985.32 | 1276.16
Slope-length (LS) | 0 | 42.68 | 8.55 | 5.82
Plan curvature | −16.52 | 18.37 | −0.06 | 0.38
Profile curvature | −26.36 | 43.67 | −0.12 | 0.37
Distance from rivers | 0 | 7185.89 | 1456.78 | 1167.29
Distance from roads | 0 | 6732.5 | 1240.45 | 1167.31

2.2.2.5. Plan and profile curvatures. Slope shape and terrain morphology can affect landslide susceptibility in several ways (Haigh and Rawat, 2012). Curvature is one of the basic terrain parameters described by Evans (1979) and is commonly applied in geomorphometric analyses. Therefore, plan and profile curvatures were extracted from the DEM using ArcGIS 10.2 (Fig. 4(e–f)).

2.2.2.6. Drainage density. Drainage pattern has been recognized as an important factor for evaluating landslide proneness. Drainage density was used to investigate the relationship between landslide frequency and drainage patterns. The drainage density map of the study area was prepared using the line density analysis tool in ArcGIS 10.2 (Fig. 4(g)).

2.2.2.7. Distance from rivers. In landslide susceptibility studies, distance from rivers is considered an important conditioning factor (Pourghasemi et al., 2013). The intermittent flow regime of the hydrologic network, saturation processes, and groundwater recharge increase the pore water pressure, which causes landslide initiation in areas adjacent to rivers (Haigh and Rawat, 2012; Hadji et al., 2013). This mechanism is reinforced during periods of high precipitation or snowmelt. Hence, the Euclidean Distance tool in ArcGIS 10.2 was used to derive the distance from rivers (Fig. 4(h)).

2.2.2.8. Distance from faults. Faults are tectonic breaks that not only decrease rock strength but also usually cause intense fracturing and unstable slope conditions (Bucci et al., 2016). Therefore, faults can significantly increase landslide susceptibility (Lee et al.,


2002; Kanungo et al., 2006). The faults map was obtained from the Geological Survey Department of Iran (GSDI), and the distance from faults map was then created in ArcGIS 10.2 (Fig. 4(i)). Faults are more concentrated in the western part of the study area.

2.2.2.9. Landuse. According to the literature, landuse has a large effect on slope stabilization and landslide occurrence through different mechanisms taking place at the root system of the vegetation (Hadji et al., 2013; Miller, 2013). In addition, areas with little or no vegetation cover and degraded regions are more predisposed to landslides (Gomes et al., 2005). On the other hand, landuse development patterns and land-use planning affect the soil and its saturation conditions, which can help in understanding, predicting, and assessing landslide hazard. The landuse map of the study area was obtained from the Iranian Department of Water Resource Management (IDWRM) (2014). The main landuse types identified were forest, orchard-irrigation agriculture, urban (residential), orchard, forest-orchard, dry-farming, irrigation agriculture, and range land (Fig. 4(j)).

2.2.2.10. Lithology. Lithology is known to be an important factor in geo-hazard assessment (Ayalew et al., 2004; Rahmati et al., 2016) because it has a strong impact on rock hardness and can expedite weathering (Henriques et al., 2015). To account for the geological conditions, six main lithological groups were derived by aggregating the 16 classes of a geological map (1:100,000-scale) provided by the Geological Survey of Iran. Ghaemshahr Township contains various geological formations (Table 3). The spatial distribution of these lithological groups in the study area is shown in Fig. 4(k).

Table 3
Lithology of Ghaemshahr Township, Iran.

Class | Code | Lithology | Area (%)
Group 1 | Q, Qmf, Qal, Qs, Qtr, Qf, Q1, Q2, Qc | High/low level piedmont fan, valley terrace deposits | 7.27
Group 2 | J1, Js2, Jssl, Jssh, Jsv, Jss, TR3Js1, TR3Js, P€k | Dark shale and sandstone | 52.16
Group 3 | PLm,c, MPLm,c, M2,3m,s,l, OM1m,c, Pn | Marl, shale, Swamp and marsh, Marl conglomerate | 25
Group 4 | OM1m,g, EOm,t,v, Ev, bf | Andesite basaltic volcanic | 12
Group 5 | K1l,m, Kt, Jl2, Jl1 TRe2, TRe1, Pr, P€s | Dolomite: thick bedded | 2.62
Group 6 | Em, PeEz, Pem,l,s, K2m,l, K2l,m, Jd, TR3v,t, Pn, Cm | Limestone with dolomite, with marl | 0.95

2.2.2.11. Distance from roads. The building of mountain roads has a significant and direct influence on landslide distribution (Wang et al., 2016). Road construction in mountain areas is a form of infrastructure development that may adversely affect slope stability; distance from roads can therefore be useful for identifying areas prone to landslide occurrence. The Euclidean Distance tool was used to calculate the distance from the road network (Fig. 4(l)).

2.2.3. Application of machine learning techniques

Developers are finding that machine learning is good for many aspects of solving spatial modeling problems in landslide susceptibility assessment, but they also suggest that it should be applied as a supportive tool rather than as a replacement for human expertise (Marjanović et al., 2011). When a machine learning algorithm is combined with the processing capabilities of GIS and a field dataset (i.e. historical landslide inventory and geo-environmental variables), it promises to provide earth scientists and planners with new modeling and analytical capabilities (Goetz et al., 2015). Additionally, machine learning is not only able to adjust its internal structure to the existing landslide data but also allows knowledge to be extracted from large databases automatically. It also builds classification and regression trees for predicting categorical dependent factors (classification) and continuous dependent factors (regression) in order to produce accurate landslide models (Felicísimo et al., 2013).

In the current study, ten machine learning techniques that differ in their level of complexity are compared: artificial neural networks, boosted regression tree, classification and regression trees, generalized linear model, generalized additive model, multivariate adaptive regression splines, naïve Bayes, QDA, random forest, and support vector machines. While each of these MLTs could potentially be used with a variety of settings for model building, the most common configurations were chosen.

2.2.3.1. Artificial neural networks (ANN). Artificial neural networks (ANNs) include a wide range of learning algorithms that have been developed in statistics and artificial intelligence. Since ANNs are non-linear classification techniques and consist of an interconnected group of artificial neurons, they can learn complex relationships between input and output variables (Fausett, 1994). The number of neurons in the hidden layer was determined after an optimization stage, because it has a great influence on model performance (Kuo et al., 2004). The application of this technique in landslide susceptibility assessment has been described in depth in Poudyal et al. (2010) and Alkhasawneh et al. (2014). The “rminer” package (Cortez, 2015) was used to run the ANN model with a 12-6-1 network and a learning rate of 0.1.

2.2.3.2. Boosted regression tree (BRT). Boosted regression tree (BRT) is a combination of statistical and machine learning techniques whose name derives from its two component algorithms, boosting and regression trees. BRT fits complex non-linear relationships between dependent and independent variables by combining these two methods: boosting, an additive method that combines many single models to improve model performance, and regression trees, which relate a response to its predictors by recursive binary splits (Friedman, 2001; Elith et al., 2008). As described by Schapire (2003), the BRT technique has its origins in machine learning but can be regarded as an advanced form of regression. In contrast to standard regression (SR) techniques that generate a single predictive model, BRT fits multiple simple models and combines them for spatial prediction, thereby improving the predictive performance of landslide susceptibility maps (LSMs). As discussed in Elith et al. (2008), BRT does not need prior data transformation or removal of outliers, can fit complex non-linear relationships between inputs and outputs, and automatically handles interaction effects between variables. BRT is therefore well suited to modeling natural phenomena with non-linear relationships (Youssef et al., 2016). In the current study, BRT modeling was carried out in the R statistical programming language (R Development Core Team, 2015) using the “gbm” (Generalized Boosted Model) and “dismo” (Species Distribution Modeling) packages (Ridgeway, 2007; Hijmans et al., 2016). Furthermore, the learning rate, bag fraction, and maximum number of trees were tuned to reduce prediction uncertainty. Additionally, we assessed the relative importance of the predictor variables and their responses in the BRT model.

2.2.3.3. Classification and regression trees (CART). Classification and regression trees (CART) is a rule-based procedure that creates a binary tree using binary recursive partitioning (BRP), a nonparametric, nonlinear technique that splits the data into subsets on the basis of the available independent factors. In other words, BRP is a process that divides a node into yes/no answers (child nodes) on predictor values. The CART model can be used to generate both regression trees and classification trees. If the response variable (i.e. target value) is continuous, CART generates a regression tree; when the response variable is categorical, CART generates a classification


tree (Samadi et al., 2014). The method was run using the “rpart” (Recursive Partitioning and Regression Trees) statistical package in R 3.0.2.

2.2.3.4. Generalized linear model (GLM). The generalized linear model (GLM) with a logistic regression model is the most common statistical method for prediction of landslide susceptibility (Brenning, 2005). It was first used for landslide susceptibility modeling by Atkinson and Massari (1998). GLMs consist of an additive combination of single parametric terms, each representing a linear function of a single independent variable. For more details, refer to McCullagh and Nelder (1989). GLMs allow for a wide variety of statistical approaches to analyze relationships in the data (Vorpah et al., 2012). Here, GLM was performed using forward–backward stepwise variable selection based on the Akaike Information Criterion (AIC) in the “glm” package.

2.2.3.5. Generalized additive model (GAM). Like the GLM, the generalized additive model (GAM) algorithm, which uses smoothers to fit nonlinear functions, has recently been applied for assessing landslide susceptibility (Goetz et al., 2011). The GAM is a semi-parametric extension of the GLM that can combine linear and nonlinear relationships between landslide conditioning factors (i.e. predictors) and landslide occurrence (i.e. response variable) (Hastie and Tibshirani, 1990). Furthermore, since the GAM estimates the predictors' partial response curves with a non-parametric smoothing function instead of a parametric function, it allows the development of statistical relationships between landslide occurrences and landslide factors, providing descriptions of geo-environmental patterns. In this study, the “GRASP” (Generalized Regression Analysis and Spatial Prediction) package (Lehmann et al., 2002) was used in R 2.0.7 to run the GAM algorithm.

2.2.3.6. Multivariate adaptive regression splines (MARS). Multivariate adaptive regression splines (MARS) is a relatively new machine learning procedure for non-parametric regression analysis, useful for addressing complex and non-linear relationships between response and predictor variables (Briand et al., 2004). It has been applied in various fields such as gully erosion (Gómez-Gutiérrez et al., 2009a, 2009b; Gómez-Gutiérrez et al., 2015), landslide hazard assessment (Vorpah et al., 2012; Felicísimo et al., 2013), and environmental sciences (Leathwick et al., 2005). The MARS technique was implemented in R 3.0.2 using the “earth” package (Milborrow et al., 2011). Details of the model and its application in landslide susceptibility assessment are described in Conoscenti et al. (2015).

2.2.3.7. Naïve Bayes (NB). Naïve Bayes (NB) is a machine learning algorithm for statistical analysis based on Bayes' law, which works under a conditional independence assumption of variables (Soria et al., 2011). The main advantage of the NB classifier is that it is very easy to build, without needing any complicated iterative parameter estimation schemes (Wu et al., 2008); therefore, it can be used both to handle huge data sets (because of a large-scale study or high-resolution data) and to directly implement statistical analyses on the data. Accordingly, this technique has been successfully applied for assessing landslide susceptibility (Tien Bui et al., 2012; Pham et al., 2016; Tsangaratos and Ilia, 2016). In this study, the Naïve Bayes approach was employed in R 3.0.2 using the “rminer” package (Cortez, 2015) to model the landslide susceptibility.

2.2.3.8. Quadratic discriminant analysis (QDA). Quadratic discriminant analysis (QDA) is one of the soft computing learning algorithms which has recently been used in a variety of fields such as landslide analysis research (Rossi et al., 2010; Murillo-García and Alcántara-Ayala, 2015). In this model, the strategy is to fit a quadratic surface that best separates the groups of cases. In the current study, the QDA technique was also used to build a predictive model based on groups that consider statistical relationships between predictors and the presence of landslides. For landslide susceptibility mapping with the QDA method, the “rminer” package (Cortez, 2015) was used in R 3.0.2.

2.2.3.9. Random forest (RF). Random forests (RF), first introduced by Breiman (2001), belong to the family of ensemble methods. The RF algorithm is an important modification of bootstrap aggregating (or bagging), and has been proposed in the literature as a powerful tool for both classification and regression problems (Hastie et al., 2009; Trigila et al., 2015; Naghibi et al., 2016; Youssef et al., 2016). The RF algorithm is performed in two major steps. First, RF builds multiple bootstrap samples, which are known as training sets, and constructs a classification rule (a tree) for each. During the bootstrapping phase, some observations which are not used during tree construction will be “left out” of the training set. These “left out” observations create a test set (called the out-of-bag (OOB) samples), which is applied to assess misclassification error and to estimate expected predictive accuracy. The OOB error rate is considered an unbiased estimate of the generalization error rate. Second, when the tree is growing, at each node a given number of variables (denoted by mtry) are randomly chosen from among all p input variables, and the best split among them is taken (Genuer et al., 2010). This random feature selection at each node decreases the correlation between any pair of trees in the forest, thus decreasing the forest error rate. In short, RF combines two powerful ideas in machine learning: random feature selection and bagging (Wu et al., 2014). A detailed statistical explanation of the RF technique is given in Breiman (2001), Liaw and Wiener (2002), and Segal (2004). In this study, the ‘randomForest’ package (Briman and Cutler, 2015) was applied in R 3.0.2 to model landslide susceptibility.

2.2.3.10. Support vector machines (SVM). The support vector machine is a supervised classifier based on the principle of Structural Risk Minimization (SRM) that can be used for classification and regression (Vapnik, 2000). SVMs can be applied either to more complex non-linear situations or to linearly separable problems due to their potential for producing complex curved boundaries (Broséus et al., 2011). In contrast to classifiers like partial least squares discriminant analysis (PLS-DA) and linear discriminant analysis (LDA), which create boundaries using the structure of the dataset (through the size and shape of each class), a peculiar characteristic of the SVM technique is that the data points which lie on the support vectors (SV) determine the solution of the classification problem (Clarke et al., 2009; Hastie et al., 2009). The prediction accuracy of an SVM is affected by the selection of the kernel function, such as sigmoid, polynomial, linear, or radial basis function (RBF). The RBF kernel, which is defined based on the Euclidean distance, is the most commonly used kernel function for landslide susceptibility assessment (Marjanović et al., 2011; Tien Bui et al., 2012). Therefore, the RBF kernel (Eq. (2)) was employed in this study.

$$K(X_i, X_j) = \exp\left(-\gamma \lVert X_i - X_j \rVert^2\right) \qquad (2)$$

with $\gamma > 0$ the parameter which determines the width of the RBF and $X_i$, $X_j$ the vectors of the ith and jth training landslides (Li et al., 2009). A detailed statistical explanation of SVM can be found in Kavzoglu and Colkesen (2009) and Kavzoglu et al. (2014a, 2014b). The “Kernlab” package (Karatzoglou et al., 2016) was used in R 3.0.2 for landslide susceptibility mapping in the study area.

2.2.4. Prediction of landslide susceptibility
Once the ANN, BRT, CART, GLM, GAM, MARS, NB, QDA, RF, and SVM models were successfully trained in the calibration process, they were used to determine the landslide susceptibility index for every pixel in the entire study area. LSMs were reclassified into four susceptibility levels (low, moderate, high, and very high) using the natural breaks (Jenks) classification method (Pradhan and Lee, 2010;


Fig. 5. AUC and RMSE values for each MLT.

Pourghasemi and Kerle, 2016). In fact, the obtained LSMs provide a zonation of areas in terms of “spatial probability” (Das et al., 2011).

2.2.5. Assessing the performance of spatial models
Recent studies have found that even when different models are calibrated with the same data set, they may result in different output maps; hence, prediction accuracy should be scientifically evaluated and compared (Guzzetti et al., 2006; De Sy et al., 2013; Rahmati et al., 2017). The performance of the applied advanced machine learning models for landslide susceptibility modeling can be assessed using different methods. In the current study, receiver operating characteristic (ROC) curves and Root Mean Square Error (RMSE) analysis were applied to measure the predictive performance of the MLTs. The validation phase was implemented based on 30% of the landslide occurrences (i.e. the existing landslide validation dataset), which were not used in the process of model building (Mason and Graham, 2002; Mohammady et al., 2012).
The ROC curve is a graph based on the sensitivity (also known as true positive rate) and 1 − specificity (also known as false positive rate) at various cut-off thresholds, and is used to assess the prediction accuracy quantitatively (Begueria, 2006). These rates explain how well the model and landslide factors predict landslides. The area under the ROC curve (AUC) can therefore be considered a summary statistic of the overall performance (Chung and Fabbri, 2003; Lee et al., 2004). The AUC is commonly recognized as the most useful accuracy statistic for landslide susceptibility modeling (Mathew et al., 2009). In addition, the performance achieved by each MLT was evaluated using the RMSE statistic to determine the reliability and quality of the landslide susceptibility predictions when compared to observed values (Nefeslioglu et al., 2008a). In other words, RMSE is computed based on the difference between the ground truth and the model's output.

2.2.6. Estimating independent variables importance
The ability of models to make landslide susceptibility predictions depends on the contribution of the different landslide factors, and hence precise identification of their role is still under investigation. To better understand the relationship between landslide conditioning factors and landslide events, and to analyze the contribution of the variables, learning vector quantization (LVQ) was implemented. The LVQ, an efficient prototype-based classification algorithm, is a supervised neural network method first introduced by Kohonen (1995). This technique has been successfully applied in fields such as rock type classification of limestone (Patel and Chatterjee, 2016), landslide susceptibility zonation (Pavel et al., 2008; Mondino et al., 2009; Pavel et al., 2011), vegetation classification (Filippi and Jensen, 2006), mineral potential mapping (Tayebi and Tangestani, 2015), and environmental sciences (Zhang and Xie, 2012; Williams et al., 2014). Recently, Naghibi et al. (2016) and Rahmati et al. (2016) have utilized the LVQ technique to quantify variable importance (VI) for groundwater potential mapping. In this research, the relative contribution of all predictor variables to landslide occurrence was assessed using the LVQ algorithm in the R statistical package. In addition, response curves were derived using the BRT and GAM models through model predictions on the LSM with regard to each predictor variable, which can be useful to evaluate the VI analysis. Since the random selection of the training dataset may affect a model's result, a large number of trees was considered in the RF model to guarantee the stability of the model. Subsequently, the weight of each conditioning factor (i.e. based on the variable's contribution) and its effect on landslide occurrence in the study area were determined using the RF data-driven model.

3. Results and discussion

3.1. Landslide susceptibility models

Predictions obtained through modeling and simulation are now seen as an important objective of natural resources studies, as they help ensure that decision-making by watershed planners and engineers is properly informed. For this reason, the modeling process needs to be as accurate as possible. For the purpose of comparative visualization, all ten landslide susceptibility maps produced from the ANN, BRT, CART, GLM, GAM, MARS, NB, QDA, RF, and SVM models are shown in Fig. 5. The four susceptibility classes in this study were defined for each model as low, moderate, high, and very high susceptibility zones. The relative area (in percent) of these classes was then calculated for each model (Table 4). It was found that RF, BRT, ANN, and SVM generate LSMs that are spatially discontinuous, while the GAM, CART, GLM, MARS, and NB models produce smoother patterns. Felicísimo et al. (2013) reported that this property is recommended for punctual phenomena such as landslides. Indeed, the merit of each model is related to its features and to how it handles the selection and weighting of the final variables. For example, the SVM model does not get trapped in local minima, in contrast to the ANN model.
The most interpretable and visually appealing methods were the BRT and RF, which performed significantly better than the GLM. Although SVM often achieves good prediction accuracy, it required a longer


learning time during the calibration period. This finding matches previous surveys that have applied the SVM to landslide susceptibility assessment (Tien Bui et al., 2016).

Table 4
Landslide susceptibility classes' areas for all applied MLTs (relative area, %).

Model | Low | Moderate | High | Very high
ANN | 33.62 | 28.56 | 24.03 | 13.80
BRT | 30.71 | 30.14 | 23.48 | 15.67
CART | 34.50 | 24.46 | 10.76 | 30.28
GLM | 20.00 | 25.64 | 31.53 | 22.83
GAM | 20.07 | 27.92 | 29.85 | 22.16
MARS | 22.45 | 26.97 | 25.74 | 24.84
NB | 35.18 | 17.34 | 22.07 | 25.41
QDA | 39.97 | 17.03 | 20.88 | 22.13
RF | 21.95 | 30.96 | 30.37 | 16.72
SVM | 25.70 | 29.70 | 26.30 | 18.31

3.2. Accuracy assessment and comparison

The model performance was evaluated in terms of model discrimination by calculating the AUC and RMSE statistics. Interestingly, according to the AUC method (Fig. 6), the variation in model performance among the MLTs was relatively high. The RF (AUC = 83.7%) and BRT (AUC = 80.7%) had the highest model performance. They are followed by the SVM (AUC = 77.2%), ANN (AUC = 77.1%), NB (AUC = 74.9%), QDA (AUC = 74.1%), MARS (AUC = 72.8%), GAM (AUC = 70.3%), and CART (AUC = 70.1%) models. SVM, ANN, NB, QDA, MARS, GAM, and CART thus showed sufficient performance (AUC > 70%), while GLM (AUC = 62.4%) showed weak performance. With a pairwise comparison of model performances, it was observed that the AUC values of ANN, GAM, and NB were very similar to the AUC values of SVM, CART, and QDA, respectively.
In regard to model validation methods, using only the AUC statistic for model performance assessment may not be the best approach, because in some cases a high AUC does not guarantee a high accuracy of the spatial predictions (Aguirre-Gutiérrez et al., 2013). For that reason, the RMSE measure was also established as an additional criterion to evaluate model predictions and support decisions in model selection. The RMSE for the applied MLTs ranges from 0.101 to 0.459 (Fig. 6), indicating a substantial agreement between these models and reality. The results of this analysis revealed that BRT, followed by MARS, performed best. In general, the RF and the BRT algorithms yielded significantly better results than the other models in terms of the AUC statistic; however, the BRT and MARS models are more balanced in terms of RMSE values. Overall, it was observed that the performance of BRT was always significantly higher than the other models based on both the AUC and RMSE methods.
Although various machine learning models for landslide susceptibility mapping have been applied, the prediction accuracy of these techniques is still debated (Goetz et al., 2015; Tien Bui et al., 2016). It has been well established that the selection of the best model among different machine learning techniques has a significant role in landslide susceptibility assessment (Rossi et al., 2010; Felicísimo et al., 2013). Although some of the MLTs had similar model accuracy, they are unique in their individual approaches for modeling landslide susceptibility and defining the relevant relationships between the geo-environmental factors and landslide initiation (Goetz et al., 2015). Understanding these differences is essential to apply a suitable model for a specific study goal and/or for a given study area (Brenning, 2008). Yilmaz (2010) stated that the use of statistical models is time-consuming in input processing, outputs, and spatial analysis, while the MLTs have the advantage of automatically identifying interactions between dependent and independent variables. Thus, MLTs are relatively easy to apply, and their prediction accuracy typically exceeds that of more conventional methods (e.g. analytical hierarchy process, statistical procedures) when complex interactions are present (Tien Bui et al., 2012).
Our results complement previous findings by Felicísimo et al. (2013), who applied different machine learning models including multiple logistic regression (MLR), MARS, CART, and maximum entropy (MaxEnt); and Youssef et al. (2016), who applied RF, BRT, GLM and CART techniques for landslide susceptibility assessment, and then compared their performances.
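
As a hedged illustration of the two validation statistics used in this comparison, the following Python sketch computes the AUC through the rank-based (Mann-Whitney) identity and the RMSE between observed labels and predicted susceptibility values. It is not the R workflow used in this study; the function names and the six validation samples are hypothetical and serve only to make the definitions concrete.

```python
import math


def roc_auc(labels, scores):
    """AUC as the probability that a landslide sample outranks a
    non-landslide sample (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rmse(labels, scores):
    """Root mean square error between observed labels (0/1) and
    predicted susceptibility values in [0, 1]."""
    total = sum((l - s) ** 2 for l, s in zip(labels, scores))
    return math.sqrt(total / len(labels))


# Hypothetical validation samples: 1 = landslide, 0 = no landslide.
observed = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

auc = roc_auc(observed, predicted)
err = rmse(observed, predicted)
print(f"AUC = {auc:.3f}, RMSE = {err:.3f}")
```

The two statistics answer different questions: the AUC depends only on how the samples are ranked, whereas the RMSE depends on how far each predicted value lies from the observed label. This is one reason why a model ranking based on AUC need not coincide with a ranking based on RMSE.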

3.3. Variables contribution analysis

To assess the importance of the factors and their contribution, the LVQ algorithm was applied. The results from LVQ are shown in Fig. 7. They indicated that distance from road (VI = approximately 64%), altitude (VI = 60%), distance from river (VI = 59%), and drainage density (VI = 57.5%) are the most important contributing factors, followed by slope-length (VI = 54.5%), slope aspect (VI = 54.3%), landuse (VI = 54.1%), distance from fault (VI = 54%), and lithology (VI = 52%). However, in this study, plan curvature (VI = 49%) and profile curvature (VI = almost 48.5%) seemed to have relatively less importance for landslide susceptibility modeling. According to Lin et al. (2010), the relative importance of the conditioning factors to a landslide model is dependent on the study area characteristics.
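
The importance values above come from a prototype-based classifier. The text does not detail how the VI percentages were derived from the trained model, so the following Python sketch illustrates only the basic LVQ1 prototype-update rule (Kohonen, 1995) on synthetic data. The function names, the learning-rate schedule, and the two-factor toy samples are assumptions made for illustration, not the R implementation used in this study.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_lvq1(samples, labels, epochs=20, rate=0.1, seed=0):
    """Train one prototype per class with the LVQ1 rule."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # Initialise each prototype at the mean of its class.
    protos = []
    for c in classes:
        members = [s for s, l in zip(samples, labels) if l == c]
        protos.append([sum(col) / len(members) for col in zip(*members)])
    order = list(range(len(samples)))
    for epoch in range(epochs):
        alpha = rate * (1.0 - epoch / epochs)  # linearly decaying step size
        rng.shuffle(order)
        for i in order:
            x = samples[i]
            # Find the nearest prototype (the "winner").
            k = min(range(len(protos)), key=lambda j: sq_dist(x, protos[j]))
            # Attract the winner if its class is correct, repel it otherwise.
            sign = 1.0 if classes[k] == labels[i] else -1.0
            protos[k] = [p + sign * alpha * (xi - p)
                         for p, xi in zip(protos[k], x)]
    return classes, protos


def predict(classes, protos, x):
    """Assign the class of the nearest prototype."""
    k = min(range(len(protos)), key=lambda j: sq_dist(x, protos[j]))
    return classes[k]


# Synthetic, standardised two-factor samples (hypothetical values).
rng = random.Random(42)
landslide = [[rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(50)]
stable = [[rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)] for _ in range(50)]
X = landslide + stable
y = [1] * 50 + [0] * 50

classes, protos = train_lvq1(X, y)
accuracy = sum(predict(classes, protos, x) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy = {accuracy:.2f}")
```

In practice the predictors would need to be brought to a common scale before training, since the nearest-prototype rule relies on Euclidean distance.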
Fig. 6. Variable importance of factors according to LVQ method.

Fig. 8 illustrates the response curves, obtained from BRT, for 12 landslide conditioning factors used for landslide susceptibility assessment. The relationships between these factors and landslide occurrences are as follows. In the altitude and slope angle maps, most landslides occurred in the range of altitude between 500 and 2000 m, and LS degree between 4 and 10. In the response curve to changes in slope angle, a maximum contribution was recognized at slope angles of


Fig. 7. Partial dependence plots of the landslide conditioning factors in the BRT model.

Fig. 8. Cross validation of the GAM model; “ALONE” shows the potential of each factor to explain landslide distribution, “INSIDE” is the regression form of variables by degrees of freedom, and “DROP” indicates that if each variable is dropped from the final model, altitude cannot be compensated by a combination of other factors.


about 8–30°, while steeper slopes (> 30°) lead to a constant pattern of landslide proneness. This can be attributed to the fact that in these areas the soil is usually thinner and hence insufficient for landslide occurrence (Vorpah et al., 2012). For distance from road, landslide susceptibility values decreased drastically with increasing distance. Therefore, a lot of landslides occurred near the road network, indicating human influences on landslide distribution. Similar to the road network, most landslides occurred near rivers. From the drainage network viewpoint, most landslides occurred in the range of drainage density between 5 and 7 km/km². With regard to the distance from faults, most landslides occurred very close to faults owing to seismicity and fault activity.
Overall, the categorical layers including slope aspect and landuse have relatively strong effects on landslide occurrence. The slope aspect was the most influential factor among the categorical data sets, and the next dominant factors were landuse and lithology. Assessment of slope aspect revealed that south, west, and south-west faces have a higher susceptibility value than other faces. In the case of the landuse type, dry-farming, irrigation agriculture, and forest-orchard areas exhibited relatively higher potentiality values. However, the lithology factor had a relatively small contribution for modeling landslide occurrence, because high-occurrence landslides cluster in particular areas in group 1 (high/low level piedmont fan, valley terrace deposits) and group 3 (marl, shale, swamp and marsh, marl conglomerate). The relationship between lithology and landslide occurrence demonstrated that most landslides occurred in groups 1, 5, and 3, respectively. However, these lesser contributions of some factors did not mean that these variables were useless for landslide susceptibility assessment. As discussed in Park (2015), all independent variables did affect the final result of model prediction when simultaneously considered with other variables. Fig. 9 indicated that only six landslide conditioning factors, including slope angle, distance from road, slope-length, distance to fault, drainage density, and altitude, were selected in the final GAM model. In contrast, other factors were excluded during the model building and analysis. Partial dependence plots of the six predictor variables (i.e. landslide causal factors) in the GAM model for predicting landslide susceptibility are illustrated in Fig. 9. The results of transformation of predictor variables in the GAM model demonstrated that hillslopes with high drainage density, near faults, near roads, low slope-length, and high slope angle tend to have increased landslide susceptibility. However, at high altitude areas there are mountain summits that usually consist of weathered rocks, whose shear strength is much higher, which is in agreement with previous findings (Dai and Lee, 2002).
The BRT model yields a small contribution for distance from roads (10.9%), while this factor has a much higher contribution in the LVQ technique. The LVQ analysis and the partial dependence plots of the BRT and the GAM thus showed some differences between the contributions of the landslide conditioning factors. Tien Bui et al. (2016) demonstrated that the relative importance of the landslide conditioning factors is affected by the techniques and evaluation criteria used. Thus, factors that have a low contribution in a given model may be useful in another and have a significant influence on it; consequently, the importance of landslide conditioning factors may vary greatly. The rank order of variable importance was often irregular when comparing the LVQ, BRT, and GAM models. Yet, the highest ranked factors were generally consistent. Distance from road, altitude, drainage density, and distance from river were the most common factors ranking high in terms of relative importance in all models. However, a different ranking of variable importance between different MLTs should be expected (Goetz et al., 2015). In addition, one of the main shortcomings of the MLTs is their simplistic approach to assessing landslide susceptibility using a landslide inventory database (i.e. by merely including the point features), which discards the information on the shape and dimensions of the landslides.

4. Conclusions

For land degradation monitoring and assessment to be accurate and for sustainable land management to be effective in humid climate areas, it is necessary to identify landslide-prone areas using a variety of state-of-the-art machine learning models at the regional scale. We have presented a comprehensive comparison of ten advanced machine learning models for Ghaemshahr Township, Mazandaran Province, Iran, to identify areas of possible landslide susceptibility based on different landslide-related factors. The LSM maps are very necessary for decision-makers and planners, because they can be used to (i) scientifically identify areas that currently are susceptible to landslide occurrence, (ii) determine the importance of landslide conditioning factors

Fig. 9. Partial response curves for landslide conditioning factors of the GAM model.


through LVQ analysis which can improve land management strategies Blaschke, T., Montanarella, L. (Eds.), SAGA — Seconds Out (= Hamburger Beiträge
and further pragmatic actions, consequently, reduce land degradation, zur Physischen Geographie und Landschaftsökologie, 19), pp. 23–32.
Briand, L.C., Freimut, B., Vollei, F., 2004. Using multiple adaptive regression splines to
and (iii) ensure more sustainable landuse planning, with a particular support decision making in code inspections. J. Syst. Softw. 73, 205–217.
focus on land degradation. Briman, L., Cutler, A., 2015. Package ‘randomForest’. pp. 29 (Date/Publication 2015-
The validation process was carried out by comparing the produced 10-07).
Broséus, J., Vallat, M., Esseiva, P., 2011. Multi-class differentiation of cannabis seedlings
LSMs with the actual landslide occurrences using the ROC curve and in a forensic context. Chemom. Intell. Lab. Syst. 107, 343–350.
RMSE methods. The goodness-of-fit of the validation data for all MLTs Bucci, F., Santangelo, M., Cardinali, M., Fiorucci, F., Guzzetti, F., 2016. Landslide dis-
(except GLM) are good; however, it differs between the techniques, tribution and size in response to Quaternary fault activity: the Peloritani Range, NE
Sicily, Italy. Earth Surf. Proc. Land 41 (5), 711–720.
although the RF and the BRT indicate the highest degree of fit with AUC Chacon, J., Irigaray, C., Fernandez, T., El Hamdouni, R., 2006. Engineering geology maps:
values of the ROC curve being 83.7%for the RF technique and 80.7%for landslides and geographical information systems. Bull. Eng. Geol. Environ. 65,
the BRT technique. The prediction accuracy of the other MLTs (except 341–411.
Choobbasti, A.J., Farrokhzad, F., Barari, A., 2009. Prediction of slope stability using ar-
the GLM model) is almost the same (from 77.1 to 70.1%). In addition,
tificial neural network (case study: Noabad, Mazandaran, Iran). Arab. J. Geosci. 2 (4),
the results of our comparisons indicated that the BRT and MARS 311–319.
techniques had the best model performance, based on RMSE statistic. In Chung, C.F., Fabbri, A.G., 2003. Validation of spatial prediction models for landslide
particular, the BRT was estimated to have better generalization ability hazard mapping. Nat. Hazards 30 (3), 451–472.
Clarke, B., Fokoue, E., Zhang, H.H., 2009. Principles and Theory for Data Mining and
against other machine learning algorithms based on both validation Machine Learning. Springer, London.
procedures. Furthermore, based on LVQ analysis, it is found that the Conoscenti, C., Ciaccio, M., Caraballo-Arias, N.A., Gómez-Gutiérrez, Á., Rotigliano, E.,
most important components in landslide susceptibility modeling are Agnesi, V., 2015. Assessment of susceptibility to earth-flow landslide using logistic
regression and multivariate adaptive regression splines: a case of the Belice River
distance from road, altitude, distance from river, and drainage density; basin (western Sicily, Italy). Geomorphology 242, 49–64.
whereas plan and profile curvature had the lowest contribution. Corominas, J., van Westen, C., Frattini, P., 2014. Recommendations for the quantitative
The results from this study demonstrate the benefit of applying the analysis of landslide risk. Bull. Eng. Geol. Environ. 73, 209–263.
Cortez, P., 2015. Package ‘rminer’. pp. 59 (Date/Publication 2015-07-18).
optimal MLT with proper accuracy and lesser learning time in landslide Costanzo, D., Rotigliano, E., Irigaray, C., Jiménez-Perálvarez, J.D., Chacón, J., 2012.
susceptibility assessment. As a main advantage, MLTs utilize both Factors selection in landslide susceptibility modelling on large scale following the GIS
continuous and categorical data without the need to classify continuous matrix method: application to the River Beiro Basin (Spain). Nat. Hazards Earth Syst.
Sci. 12, 327–340.
factors (e.g. altitude, proximities, drainage density, etc.), which provide Dai, F.C., Lee, C.F., 2002. Landslide characteristics and slope instability modeling using
results that are more robust and reliable, compared to the traditional GIS, Lantau Island, Hong Kong. Geomorphology 42 (3), 213–228.
statistical models, and eases the misclassification issues. In this context, the landslide susceptibility map generated through MLTs could be considered a potentially valuable tool for land-use planning, the development of early warning systems, and infrastructure layout. It could also help scientific institutions to effectively assess management strategies to prevent and mitigate landslide risks.

Acknowledgments

The authors thank the Iranian Department of Water Resources Management (IDWRM), the Iranian Statistical Institute (ISI), and the Meteorological Organization (MetO) for providing the full investigation reports and basic GIS data. We also thank the Editor, Prof. Cammeraat, and three anonymous reviewers for their constructive comments and useful insights.
Felicísimo, A., Cuartero, A., Remondo, J., Quirós, E., 2013. Mapping landslide suscept-
ibility with logistic regression, multiple adaptive regression splines, classification and
References regression trees, and maximum entropy methods: a comparative study. Landslides 10,
175–189.
Ferentinou, M., Chalkias, C., 2013. Mapping mass movement susceptibility across Greece
Aguirre-Gutiérrez, J., Carvalheiro, L.G., Polce, C., van Loon, E.E., Raes, N., Reemer, M., with GIS, ANN and statistical methods. Landslide Sci. Practice 1 (1), 321–327.
Biesmeijer, J.C., 2013. Fit-for-purpose: species distribution model performance de- Filippi, A.M., Jensen, J.R., 2006. Fuzzy learning vector quantization for hyperspectral
pends on evaluation criteria—Dutch hoverflies as a case study. PLoS One 8, e63708. coastal vegetation classification. Remote Sens. Environ. 100, 512–530.
http://dx.doi.org/10.1371/journal.pone.0063708. Friedman, J.H., 2001. Greedy function approximation: a gradient boosting machine. Ann.
Alkhasawneh, M.S., Ngah, U.K., Tay, L.T., Isa, N.A.M., 2014. Determination of im- Statist. 29, 1189–1232.
portance for comprehensive topographic factors on landslide hazard mapping using Genuer, R., Poggi, J.M., Tuleau-Malot, C., 2010. Variable selection using random forests.
artificial neural network. Environ. Earth Sci. 72 (3), 787–799. Pattern Recogn. Lett. 31 (14), 2225–2236.
An, H., Viet, T.T., Lee, G., Kim, Y., Kim, M., Noh, S., Noh, J., 2016. Development of time- Goetz, J.N., Guthrie, R.H., Brenning, A., 2011. Integrating physical and empirical land-
variant landslide-prediction software considering three-dimensional subsurface un- slide susceptibility models using generalized additive models. Geomorphology 129,
saturated flow. Environ. Model. Softw. 85, 172–183. 376–386.
Arnone, E., Francipane, A., Scarbaci, A., Puglisi, C., Noto, L.V., 2016. Effect of raster Goetz, J.N., Brenning, A., Petschko, H., Leopold, P., 2015. Evaluating machine learning
resolution and polygon-conversion algorithm on landslide susceptibility mapping. and statistical prediction techniques for landslide susceptibility modeling. Comput.
Environ. Model. Softw. 84, 467–481. Geosci. 81, 1–11.
Atkinson, P.M., Massari, R., 1998. Generalized linear modeling of susceptibility to Gomes, A., Gaspar, J.L., Goulart, C., Queiroz, G., 2005. Evaluation of landslide suscept-
landsliding in the central Apennines, Italy. Comput. Geosci. 24, 373–385. ibility of Sete Cidades volcano (S. Miguel Island, Azores). Nat. Hazards Earth Syst.
Ayalew, L., Yamagishi, H., Ugaw, A.N., 2004. Landslide susceptibility mapping using GIS Sci. 5, 251–257.
based weighted linear combination, the case in Tsugawa area of Agano River, Niigata Gómez-Gutiérrez, Á., Schnabel, S., Felicísimo, Á.M., 2009a. Modelling the occurrence of
Prefecture, Japan. Landslides 1 (1), 73–81. gullies in rangelands of southwest Spain. Earth Surf. Process. Landf. 34, 1894–1902.
Begueria, S., 2006. Validation and evaluation of predictive models in hazard assessment Gómez-Gutiérrez, Á., Schnabel, S., Lavado Contador, F., 2009b. Using and comparing two
and risk management. Nat. Hazards 37, 315–329. nonparametric methods (CART and MARS) to model the potential distribution of
Betts, H., Basher, L., Dymond, J., Herzig, A., Marden, M., Phillips, C., 2017. Development gullies. Ecol. Model. 220, 3630–3637.
of a landslide component for a sediment budget model. Environ. Model. Softw. 92, Gómez-Gutiérrez, Á., Conoscenti, C., Angileri, S.E., Rotigliano, E., Schnabel, S., 2015.
28–39. Using topographical attributes to evaluate gully erosion proneness (susceptibility) in
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32. two mediterranean basins: advantages and limitations. Nat. Hazards 79 (1), 291–314.
Brenning, A., 2005. Spatial prediction models for landslide hazards: review, comparison Guzzetti, F., Reichenbach, P., Ardizzone, F., Cardinali, M., Galli, M., 2006. Estimating the
and evaluation. Nat. Hazards Earth Syst. Sci. 5, 853–862. quality of landslide susceptibility models. Geomorphology 81, 166–184.
Brenning, A., 2008. Statistical geocomputing combining R and SAGA: the example of Hadji, R., Boumazbeur, A.E., Limani, Y., Baghem, M., Chouabi, A.E.M., Demdoum, A.,
landslide susceptibility analysis with generalized additive models. In: Böhner, J., 2013. Geologic, topographic and climatic controls in landslide hazard assessment

190
H.R. Pourghasemi, O. Rahmati Catena 162 (2018) 177–192

using GIS modeling: a case study of Souk Ahras region, NE Algeria. Quat. Int. 302, Mathew, J., Jha, V.K., Rawat, G.S., 2009. Landslide susceptibility mapping and its vali-
224–237. dation in part of Garhwal Lesser Himalaya, India, using binary logistic regression and
Haigh, M., Rawat, J.S., 2012. Landslide disasters: seeking causes – a case study from receiver operating characteristic curve method. Landslides 6, 17–26.
Uttarakhand, India. In: Management of Mountain Watersheds, pp. 218–253. http:// McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, 2nd edn. Chapman and
dx.doi.org/10.1007/978-94-007-2476-1_18. Hall, London.
Hastie, T.J., Tibshirani, R., 1990. Generalized Additive Models. Chapman & Hall, London Mertens, K., Jacobs, L., Maes, J., Kabaseke, C., Maertens, M., Poesen, J., Kervyn, M.,
(352 pp). Vranken, L., 2016. The direct impact of landslides on household income in tropical
Hastie, T.J., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning, Data regions: a case study from the Rwenzori Mountains in Uganda. Sci. Total Environ.
Mining, Inference, and Prediction, Second Edition. Springer, New York. 550, 1032–1043.
Henriques, C., Zêzere, J.L., Marques, F., 2015. The role of the lithological setting on the Micheletti, N., 2011. Landslide susceptibility mapping using adaptive support vector
landslide pattern and distribution. Eng. Geol. 189, 17–31. machines and feature selection. In: A Master Thesis submitted to University of
Hijmans, R.J., Phillips, S., Leathwick, J., Elith, J., 2016. Package “dismo”. pp. 67 (Date/ Lausanne Faculty of Geosciences and Environment for the Degree of Master of
Publication 2016-06-16). Science in Environmental Geosciences, (99 pp).
Hong, H., Pradhan, B., Xu, C., Tien Bui, D., 2015. Spatial prediction of landslide hazard at Micheletti, N., Foresti, L., Robert, S., Leuenberger, M., Pedrazzini, A., Jaboyedoff, M.,
the Yihuang area (China) using two-class kernel logistic regression, alternating de- Kanevski, M., 2014. Machine learning feature selection methods for landslide sus-
cision tree and support vector machines. Catena 133, 266–281. ceptibility mapping. Math. Geosci. 46 (1), 33–57.
Hong, H., Naghibi, S.A., Pourghasemi, H.R., Pradhan, B., 2016a. GIS-based landslide Milborrow, S., Hastie, T., Tibshirani, R., 2011. Earth: Multivariate Adaptive Regression
spatial modeling in Ganzhou City, China. Arab. J. Geosci. 9 (2), 1–26. Spline Models. R Software Package.
Hong, H., Pourghasemi, H.R., Pourtaghi, Z.S., 2016b. Landslide susceptibility assessment Miller, A.J., 2013. Assessing landslide susceptibility by incorporating the surface cover
in Lianhua County (China): a comparison between a random forest data mining index as a measurement of vegetative cover. Land Degrad. Dev. 24 (3), 205–227.
technique and bivariate and multivariate statistical models. Geomorphology 259, Mohammady, M., Pourghasemi, H.R., Pradhan, B., 2012. Landslide susceptibility map-
105–118. ping at Golestan Province, Iran: a comparison between frequency ratio, Dempster-
Iranian Department of Water Resource Management (IDWRM), 2014. Report of Natural Shafer, and weights-ofevidence models. J. Asian Earth Sci. 61, 221–236.
Resources Management and Land-use Planning. Mondino, E.B., Giardino, M., Perotti, L., 2009. A neural network method for analysis of
Iranian Statistical Institute (ISI), 2016. Population Census Data from the Iranian hyperspectral imagery with application to the Cassas landslide (Susa Valley, NW-
Statistical Institute. Italy). Geomorphology 110, 20–27.
Kanungo, D.P., Arora, M.K., Sarkar, S., Gupta, R.P., 2006. A comparative study of con- Montanarella, L., 2003. The EU thematic strategy on soil protection. In land degradation
ventional, ANN black box, fuzzy and combined neural and fuzzy weighting proce- in Central and Eastern Europe. In: Jones, R.J.A., Montanarella, L. (Eds.), European
dures for landslide susceptibility zonation in Darjeeling Himalayas. Eng. Geol. 85 (3), Soil Bureau Research Report No.10, EUR 20688 EN, (2003). Office for Official
347–366. Publications of the European Communities, Luxembourg (324 pp).
Karatzoglou, A., Smola, A., Hornik, K., 2016. Package ‘kernlab’. pp. 108 (Date/ Moore, I.D., Burch, G.J., 1986. Physical basis of length–slope factor in the Universal Soil
Publication 2016-03-29). Loss Equation. Soil Sci. Soc. Am. J. 50, 1294–1298.
Kavzoglu, T., Colkesen, I., 2009. A kernel functions analysis for support vector machines Murillo-García, F., Alcántara-Ayala, I., 2015. Landslide susceptibility analysis and map-
for land cover classification. Int. J. Appl. Earth Obs. Geoinf. 11, 352–359. ping using statistical multivariate techniques: Pahuatlán, Puebla, Mexico. In: Recent
Kavzoglu, T., Sahin, E.K., Colkesen, I., 2014a. An assessment of multivariate and bivariate Advances in Modeling Landslides and Debris Flows Part of the series Springer Series
approaches in landslide susceptibility mapping: a case study of Duzkoy district. Nat. in Geomechanics and Geoengineering, pp. 179–194. http://dx.doi.org/10.1007/978-
Hazards 76 (1), 471–496. 3-319-11053-0_16.
Kavzoglu, T., Sahin, E.K., Colkesen, I., 2014b. Landslide susceptibility mapping using Naghibi, S.A., Pourghasemi, H.R., Dixon, B., 2016. GIS-based groundwater potential
GISbased multi-criteria decision analysis, support vector machines, and logistic re- mapping using boosted regression tree, classification and regression tree, and random
gression. Landslides 11, 425–439. forest machine learning models in Iran. Environ. Monit. Assess. http://dx.doi.org/10.
Keesstra, S.D., Mol, G., Zaal, A.M., Wallinga, J., Jansen, B., 2015. Soil Science in a 1007/s10661-015-5049-6.
Changing World. Wageningen Soil Conference, 2015-08-23/2015-08-27. Nasiri Aghdam, I., Morshed Varzandeh, M.H., Pradhan, B., 2016. Landslide susceptibility
Keesstra, S.D., Quinton, J.N., van der Putten, W.H., Bardgett, R.D., Fresco, L.O., 2016. The mapping using an ensemble statistical index (Wi) and adaptive neuro-fuzzy inference
significance of soils and soil science towards realization of the United Nations system (ANFIS) model at Alborz Mountains (Iran). Environ. Earth Sci. 75 (7), 1–20.
Sustainable Development Goals. Soil 2 (2), 111–128. Nefeslioglu, H.A., Duman, T.Y., Durmaz, S., 2008a. Landslide susceptibility mapping for a
Khoshravan, H., 1998. Zoning of Caspian Sea Southern Coasts Morphology. Caspian Sea part of tectonic Kelkit Valley (Eastern Black Sea region of Turkey). Geomorphology
National Research and Study Center (CSNRSC), Sari, Mazandaran, Iran. 94 (3), 401–418.
Khoshravan, H., 2007. Beach sediments, morphodynamics, and risk assessment, Caspian Nefeslioglu, H.A., Gokceoglu, C., Sonmez, H., 2008b. An assessment on the use of logistic
Sea coast. Iran. Quaternary Int. 167–168, 35–39. regression and artificial neural networks with different sampling strategies for the
Kohonen, T., 1995. Learning Vector Quantization; Self-Organizing Maps. Springer, Berlin, preparation of landslide susceptibility maps. Eng. Geol. 97 (3), 171–191.
pp. 175–189. Nefeslioglu, H.A., Sezer, E., Gokceoglu, C., Bozkir, A.S., Duman, T.Y., 2010. Assessment of
Kornejady, A., Ownegh, M., Bahremand, A., 2017a. Landslide susceptibility assessment landslide susceptibility by decision trees in the metropolitan area of Istanbul, Turkey.
using maximum entropy model with two different data sampling methods. Catena Math. Probl. Eng. http://dx.doi.org/10.1155/2010/901095.
152, 144–162. Neuhäuser, B., Damm, B., Terhorst, B., 2012. GIS-based assessment of landslide sus-
Kornejady, A., Ownegh, M., Rahmati, O., Bahremand, A., 2017b. Landslide susceptibility ceptibility on the base of the Weights-of-Evidence model. Landslides 9, 511–528.
assessment using three bivariate models considering the new topo-hydrological Oh, H.J., Pradhan, B., 2011. Application of a neuro-fuzzy model to landslide-suscept-
factor: HAND. Geocarto Int. http://dx.doi.org/10.1080/10106049.2017.1334832. ibility mapping for shallow landslides in a tropical hilly area. Comput. Geosci. 37,
Kuo, Y.M., Liu, C.W., Lin, K.H., 2004. Evaluation of the ability of an artificial neural 1264–1276.
network model to assess the variation of groundwater quality in an area of blackfoot Papathoma-Köhle, M., Zischg, A., Fuchs, S., Glade, T., Keiler, M., 2015. Loss estimation
disease in Taiwan. Water Res. 38 (1), 148–158. for landslides in mountain areas–an integrated toolbox for vulnerability assessment
Leathwick, J.R., Rowe, D., Richardson, J., Elith, J., Hastie, T., 2005. Using multivariate and damage documentation. Environ. Model. Softw. 63, 156–169.
adaptive regression splines to predict the distributions of New Zealand‘s freshwater Park, N.W., 2015. Using maximum entropy modeling for landslide susceptibility mapping
diadromous fish. Freshw. Biol. 50, 2034–2052. with multiple geoenvironmental data sets. Environ. Earth Sci. 73 (3), 937–949.
Lee, S., Chwae, U., Min, K., 2002. Landslide susceptibility mapping by correlation be- Patel, A.K., Chatterjee, S., 2016. Computer vision-based limestone rock-type classification
tween topography and geological structure: the Janghung area, Korea. using probabilistic neural network. Geosci. Front. 7, 53–60.
Geomorphology 46 (3–4), 149–162. Pavel, M., Fannin, R.J., Nelson, J.D., 2008. Replication of a terrain stability mapping
Lee, S., Ryu, J.H., Won, J.S., Park, H.J., 2004. Determination and application of the using an artificial neural network. Geomorphology 97 (3–4), 356–373.
weights for landslide susceptibility mapping using an artificial neural network. Eng. Pavel, M., Nelson, J.D., Fannin, R.J., 2011. An analysis of landslide susceptibility zonation
Geol. 71, 289–302. using a subjective geomorphic mapping and existing landslides. Comput. Geosci. 37
Lehmann, A., Overton, M., Leathwick, J.R., 2002. GRASP: generalized regression analysis (4), 554–566.
and spatial prediction. Ecol. Model. 157, 189–207. Pham, B.T., Pradhan, B., Bui, D.T., Prakash, I., Dholakia, M.B., 2016. A comparative study
Li, H.D., Liang, Y.Z., Xu, Q.S., 2009. Support vector machines and its applications in of different machine learning methods for landslide susceptibility assessment: a case
chemistry. Chemom. Intell. Lab. Syst. 95 (2), 188–198. study of Uttarakhand area (India). Environ. Model. Softw. 84, 240–250.
Liaw, A., Wiener, M., 2002. Classification and regression by random forest. R News 2 (3), Poudyal, C.P., Chang, C., Oh, H.J., Lee, S., 2010. Landslide susceptibility maps comparing
18–22. frequency ratio and artificial neural networks: a case study from the Nepal Himalaya.
Lin, Y.P., Chu, H.J., Wu, C.F., 2010. Spatial pattern analysis of landslide using landscape Environ. Earth Sci. 61, 1049–1064.
metrics and logistic regression: a case study in Central Taiwan. Hydrol. Earth Syst. Pourghasemi, H.R., Kerle, N., 2016. Random forests and evidential belief function-based
Sci. Discuss. 7 (3), 3423–3451. landslide susceptibility assessment in Western Mazandaran Province, Iran. Environ.
Magliulo, P., DiLisio, A., Russo, F., Zelano, A., 2008. Geomorphology and landslide sus- Earth Sci. 75 (3), 1–17.
ceptibility assessment using GIS and bivariate statistics: a case study in southern Italy. Pourghasemi, H.R., Goli Jirandeh, A., Pradhan, B., Xu, C., Gokceoglu, C., 2013. Landslide
Nat. Hazards 47, 411–435. susceptibility mapping using support vector machine and GIS at the Golestan
Marjanović, M., Kovačević, M., Bajat, B., Voženílek, V., 2011. Landslide susceptibility Province, Iran. J. Earth Syst. Sci. 122 (2), 349–369.
assessment using SVM machine learning algorithm. Eng. Geol. 123, 225–234. Pradhan, B., 2013. A comparative study on the predictive ability of the decision tree,
Mason, S.J., Graham, N.E., 2002. Areas beneath the relative operating characteristics support vector machine and neuro-fuzzy models in landslide susceptibility mapping
(ROC) and relative operating levels (ROL) curves: statistical significance and inter- using GIS. Comput. Geosci. 51, 350–365.
pretation. Q. J. R. Meteorol. Soc. 128, 2145–2166. Pradhan, B., Lee, S., 2010. Landslide susceptibility assessment and factor effect analysis:

191
H.R. Pourghasemi, O. Rahmati Catena 162 (2018) 177–192

backpropagation artificial neural networks and their comparison with frequency ratio Tien Bui, D., Tuan, T.A., Klempe, H., Pradhan, B., Revhaug, I., 2016. Spatial prediction
and bivariate logistic regression modelling. Environ. Model. Softw. 25 (6), 747–759. models for shallow landslide hazards: a comparative assessment of the efficacy of
Pradhan, B., Sezer, E.A., Gokceoglu, C., Buchroithner, M.F., 2010. Landslide susceptibility support vector machines, artificial neural networks, kernel logistic regression, and
mapping by neuro-fuzzy approach in a landslide-prone area (Cameron Highlands, logistic model tree. Landslides 13, 361–378.
Malaysia). IEEE Trans. Geosci. Remote Sens. 48, 4164–4177. Trigila, A., Iadanza, C., Esposito, C., Scarascia-Mugnozza, G., 2015. Comparison of logistic
R Development Core Team, 2015. R: A Language and Environment for Statistical regression and random forests techniques for shallow landslide susceptibility as-
Computing. Austria, Vienna. http://CRAN.R-project.org. sessment in Giampilieri (NE Sicily, Italy). Geomorphology 249, 119–136.
Rahmati, O., Haghizadeh, A., Pourghasemi, H.R., Noormohamadi, F., 2016. Gully erosion Tsangaratos, P., Ilia, I., 2016. Comparison of a logistic regression and Naïve Bayes clas-
susceptibility mapping: the role of GIS-based bivariate statistical models and their sifier in landslide susceptibility assessments: the influence of models complexity and
comparison. Nat. Hazards 82 (2), 1231–1258. training dataset size. Catena 145, 164–179.
Rahmati, O., Tahmasebipour, N., Haghizadeh, A., Pourghasemi, H.R., Feizizadeh, B., Tseng, C.M., Lin, C.W., Hsieh, W.D., 2015. Landslide susceptibility analysis by means of
2017. Evaluating the influence of geo-environmental factors on gully erosion in a event-based multi-temporal landslide inventories. Nat. Hazard Earth Sys. 3 (2),
semi-arid region of Iran: An integrated framework. Sci. Total Environ. 579, 913–927. 1137–1173.
Ridgeway, G., 2007. Generalized Boosted Models: A Guide to the gbm Package. pp. 12. Vahidnia, M.H., Alesheikh, A.A., Alimohammadi, A., Hosseinali, F., 2010. A GIS-based
Rodionov, S.N., 1994. Global and Regional Climatic Interaction: The Caspian Sea neuro-fuzzy procedure for integrating knowledge and data in landslide susceptibility
Experience. Kluwer Academic Publishing, Dordrecht. mapping. Comput. Geosci. 36 (9), 1101–1114.
Romer, C., Ferentinou, M., 2016. Shallow landslide susceptibility assessment in a semi- Van Westen, C.J., Terlien, M.T.J., 1996. An approach towards deterministic landslide
arid environment, — a quaternary catchment of KwaZulu-Natal, South Africa. Eng. hazard analysis in GIS. A case study from Manizales (Colombia). Earth Surf. Process.
Geol. 201 (9), 29–44. Landf. 21, 853–868.
Rossi, M., Guzzetti, F., Reichenbach, P., Mondini, A.C., Peruccacci, S., 2010. Optimal Vapnik, V.N., 2000. The Nature of Statistical Learning, Second edition. Springer, New
landslide susceptibility zonation based on multiple forecasts. Geomorphology 114, York.
129–142. Vorpah, P., Elsenbeer, H., Märker, M., Schröder, B., 2012. How can statistical models help
Saito, H., Nakayama, D., Matsuyama, H., 2009. Comparison of landslide susceptibility to determine driving factors of landslides? Ecol. Model. 239, 27–39.
based on a decision-tree model and actual landslide occurrence: the Akaishi Wang, Y., Song, C., Lin, Q., Li, J., 2016. Occurrence probability assessment of earthquake-
Mountains, Japan. Geomorphology 109 (3–4), 108–121. triggered landslides with Newmark displacement values and logistic regression: the
Samadi, M., Jabbari, E., Azamathulla, H.M., 2014. Assessment of M5 model tree and Wenchuan earthquake, China. Geomorphology 258, 108–119.
classification and regression trees for prediction of scour depth below free overfall Water Resources Company of Mazandaran (WRCM), 2015. Precipitation and temperature
spillways. Neural Comput. & Applic. 24, 357–366. reports. http://www.mazmet.ir/, Accessed date: 8 November 2015.
Schapire, R., 2003. The boosting approach to maching learning – an overview. In: Weng, M.C., Wu, M.H., Ning, S.K., Jou, Y.W., 2011. Evaluating triggering and causative
Denison, D.D., Hansen, M.H., Holmes, C., Mallick, B., Yu, B. (Eds.), MSRI Workshop factors of landslides in Lawnon River Basin, Taiwan. Eng. Geol. 123 (1), 72–82.
on Nonlinear Estimation and Classification, 2002. Springer, New York, USA, pp. Williams, R.N., de Souza, P.A.J., Jones, E.M., 2014. Analyzing coastal ocean model out-
1–21. puts using competitive learning pattern recognition techniques. Environ. Model.
Schilirò, L., Montrasio, L., Mugnozza, G.S., 2016. Prediction of shallow landslide occur- Softw. 57, 165–176.
rence: validation of a physically-based approach through a real case study. Sci. Total Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng,
Environ. 569–570, 134–144. A., Liu, B., 2008. Top 10 algorithms in data mining. Knowl. Inf. Syst. 14 (1), 1–37.
Segal, M.R., 2004. Machine Learning Benchmarks and Random Forest Regression. Center Wu, X., Ren, F., Niu, R., 2014. Landslide susceptibility assessment using object mapping
for Bioinformatics and Molecular Biostatistics UC, San Francisco. http://eprints. units, decision tree, and support vector machine models in the Three Gorges of China.
cdlib.org/uc/item/35x3v9t4. Environ. Earth Sci. 71 (11), 4725–4738.
Sezer, E.A., Pradhan, B., Gokceoglu, C., 2011. Manifestation of an adaptive neurofuzzy Yao, X., Tham, L.G., Dai, F.C., 2008. Landslide susceptibility mapping based on Support
model on landslide susceptibility mapping: Klang valley, Malaysia. Expert Syst. Appl. Vector Machine: a case study on natural slopes of Hong Kong, China. Geomorphology
38, 8208–8219. 101, 572–582.
Shafiei, A.B., Akbarinia, M., Jalali, G., Hosseini, M., 2010. Forest fire effects in beech Yesilnacar, E., Topal, T., 2005. Landslide susceptibility mapping: a comparison of logistic
dominated mountain forest of Iran. For. Ecol. Manag. 259 (11), 2191–2196. regression and neural networks methods in a medium scale study, Hendek region
Soria, D., Garibaldi, J.M., Ambrogi, F., Biganzoli, E.M., Ellis, I.O., 2011. A non-para- (Turkey). Eng. Geol. 79, 251–266.
metricversion of the naive Bayes classifier. Knowl.-Based Syst. 24 (6), 775–784. Yilmaz, I., 2010. The effect of the sampling strategies on the landslide susceptibility
Tayebi, M.H., Tangestani, M.H., 2015. Sub pixel mapping of alteration minerals using mapping by conditional probability (CP) and artificial neural network (ANN).
SOM neural network model and hyperion data. Earth Sci. Inf. 8 (2), 279–291. Environ. Earth Sci. 60, 505–519.
Thai Pham, B., Tien Bui, D., Pourghasemi, H.R., Indra, P., Dholakia, M.B., 2015. Landslide Youssef, A.M., Pourghasemi, H.R., Pourtaghi, Z.S., Al-Katheeri, M.M., 2016. Landslide
susceptibility assesssment in the Uttarakhand area (India) using GIS: a comparison susceptibility mapping using random forest, boosted regression tree, classification
study of prediction capability of naïve bayes, multilayer perceptron neural networks, and regression tree, and general linear models and comparison of their performance
and functional trees methods. Theor. Appl. Climatol. http://dx.doi.org/10.1007/ at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 13 (5), 839–856.
s00704-015-1702-9. Zare, M., Pourghasemi, H.R., Vafakhah, M., Pradhan, B., 2013. Landslide susceptibility
Thanh, L., De Smedt, F., 2014. Slope stability analysis using a physically based model: a mapping at Vaz Watershed (Iran) using an artificial neural network model: a com-
case study from a Luoi district in Thua Thien-Hue province, Vietnam. Landslides 11l, parison between multilayer perceptron (MLP) and radial basic function (RBF) algo-
897–907. rithms. Arab. J. Geosci. 6 (8), 2873–2888.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012. Landslide suscept- Zhang, C., Xie, Z., 2012. Combining object-based texture measures with a neural network
ibility mapping at Hoa Binh province (Vietnam) using an adaptive neuro-fuzzy in- for vegetation mapping in the Everglades from hyperspectral imagery. Remote Sens.
ference system and GIS. Comput. Geosci. 45, 199–211. Environ. 124, 310–320.

192

You might also like