
Natural Hazards (2025) 121:15331–15357

https://doi.org/10.1007/s11069-025-07395-w

ORIGINAL PAPER

Machine learning-driven wildfire susceptibility mapping in New South Wales, Australia using remote sensing and explainable artificial intelligence

Rufai Yusuf Zakari1 · Owais Ahmed Malik1 · Ong Wee-Hong1

Received: 28 August 2024 / Accepted: 18 May 2025 / Published online: 2 June 2025
© The Author(s), under exclusive licence to Springer Nature B.V. 2025

Abstract
Wildfires are among the most devastating environmental disasters threatening the Australian
community, causing significant negative impacts on ecosystems and socio-economic
activities. Understanding wildfire tendencies, patterns, and vulnerabilities is therefore
important for conserving ecosystems and developing effective prevention and management
strategies. In this study, we present a method for generating a dataset of fire events
using freely available remote sensing data via Google Earth Engine. Additionally, we
evaluate the performance of four machine learning (ML) models: Random Forest (RF),
Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Support Vec-
tor Machines (SVMs) in developing a wildfire susceptibility map for New South Wales
(NSW). These ML techniques were assessed based on 15 independent wildfire-related
factors, grouped into four main categories: climate, environment, topography, and socio-
economic factors. Six performance metrics were used to compare the predictive perfor-
mance of the ML algorithms: accuracy, cross-validation accuracy, precision, recall, Kappa,
and F1 score. Our results show that XGBoost outperforms all other models, achieving
an F1 score of 0.965, a Kappa of 0.937, an accuracy of 0.968, and a cross-validation ac-
curacy of 0.974. Furthermore, the SHapley Additive exPlanations (SHAP) technique was
employed to interpret the model’s learning process, revealing that precipitation, drought,
soil moisture, and NDVI were the most influential factors. The study not only highlights
the probability of fire occurrences across NSW, Australia, but also identifies the key driv-
ing factors of wildfires during the 2019–2020 summer season. Local authorities can utilize
the wildfire susceptibility map generated in this study for wildfire management and fire
suppression activities.

Keywords Wildfire · New South Wales · Susceptibility mapping · Forest fire · Machine
learning

Extended author information available on the last page of the article

1 Introduction

Forests are vital natural resources that maintain environmental balance (Sgroi 2020).
Recently, the demand for natural resources, particularly those found in forests, has surged
due to the need for food and materials. This increase is driven by population growth and
industrial development near forest-rich areas (Sohag et al. 2023). Both human activities
and natural events contribute to wildfires, affecting approximately 30% of the world’s for-
ested regions and leading to significant losses in biodiversity and ecosystems (Shahfahad
et al. 2022). The risk of wildfires has risen in many forested regions globally due to these
anthropogenic changes. Wildfires are a significant threat to plant communities and signifi-
cantly impact ecosystem dynamics (Rihan et al. 2023). These fires and activities by farm-
ers and foresters to increase agricultural land contribute to the global challenge of losing
16 million hectares of forest annually (Debebe et al. 2023). In some cases, wildfires can
profoundly affect nutrient cycling and ecosystem functioning. However, they often harm
biology, ecology, and the environment, leading to altered soil composition, erosion, and
species extinction (Succarie et al. 2022). Beyond these direct environmental costs, wildfires
also significantly impact the economy, particularly for millions worldwide who rely on for-
est products, ecotourism, and agriculture for their livelihoods (Laudari et al. 2021). Devel-
oping wildfire susceptibility maps is essential to mitigate the risk of wildfires effectively.
In recent years, the use of geographical information systems (GIS), remote sensing (RS),
and machine learning algorithms (MLAs) has increased significantly in hazard susceptibility
mapping (Rihan et al. 2023). These techniques have seen widespread adoption in address-
ing the complexities of spatial and temporal susceptibility modeling for natural hazards in
different parts of the world (Pourghasemi et al. 2023), including wildfires, droughts, and
floods (Ahmed et al. 2022; Alqadhi et al. 2022; Rihan et al. 2023). Several wildfire studies
have extensively used MLAs and other RS approaches for mapping and monitoring wildfire
susceptibility (Succarie et al. 2022). Australia has experienced several significant wildfire
events, including the Gippsland wildfires and Black Sunday (1926), Black Friday (1939),
the 1974–1975 Australian Wildfire Season, the Waterfall wildfire (1980), the Canberra For-
est fire (2003), and Black Saturday (2009) (Weber et al. 2019). However, the 2019–2020
wildfire season, primarily affecting the southeastern region of New South Wales (NSW), has
been the most devastating in terms of burned area and severity since European settlement
(Ma et al. 2020). Known as the Black Summer, this season resulted in 28 human fatalities
(John Roach 2020), the loss of over 1.25 billion animals (Tegna 2020), the destruction of
more than 3,000 homes (Jessie Yeung 2020), and economic damages exceeding $110 billion
by January 2020 (John Roach 2020). These wildfires are primarily caused by natural fac-
tors like lightning and severe drought, exacerbated by climate change (Haque et al. 2021).
The NSW region, with its vast and varied landscapes, is particularly vulnerable to wildfires.
Wildfires in NSW are more common in the summer, with areas such as the Blue Mountains,
Central Coast, and the South Coast frequently affected (Nguyen et al. 2021). These areas
face a disproportionate concentration of wildfires each year, posing significant threats to
ecosystem services and biodiversity (Filkov et al. 2020). According to the NSW Rural Fire
Service, regions like the Blue Mountains are highly susceptible to wildfires, with thousands
of hectares of forest destroyed annually during severe fire seasons. Meanwhile, essential
timber resources are lost, species of flora and fauna are endangered or driven to extinction,
and forest ecology is severely damaged, affecting the socio-economic environment. Therefore,
modeling wildfire susceptibility is central to wildfire management and emergency
response (Deb et al. 2020). Advances in computational power have driven further
improvements in MLAs, which are now integrated with RS and GIS to produce
more accurate and reliable wildfire susceptibility mapping (WSM). MLAs are considered
much superior to conventional statistical methods for handling high-dimensional and com-
plex nonlinear datasets; thus, they are increasingly employed in WSM research rather than
static models. Over the last decade, more studies have used single- or multi-modal ML
techniques to assess and improve WSM. A common outcome of these studies is a practical
enhancement in map precision. The algorithms employed in wildfire analysis for mapping
include artificial neural networks (ANN), support vector machines (SVM), random forest
(RF) (Moghim and Mehrabi 2024), generalized additive model (GAM), classification and
regression tree (CART), gradient boosting machine (GBM), dynamic Bayesian network
(DBN), logistic regression (LR), eXtreme Gradient Boosting (XGBoost), Naïve Bayes Tree
(NBT), and Adaptive Boosting (AdaBoost) (Iban and Sekertekin 2022). Recently, there has
been growing interest in cloud-based platforms, such as Google Earth Engine (GEE), for
their application in hazard prediction and assessing wildfire susceptibility (Shahfahad et al.
2022).
Integrating GIS, RS, and MLAs has significantly advanced the spatial representation
and prediction of wildfire risk (Beltrán-Marcos et al. 2023). Despite these technological
advancements, the vulnerability of the entire NSW region to wildfires has not been exten-
sively investigated. Most studies have concentrated on specific areas within NSW, such as
individual forests or towns, instead of providing a comprehensive assessment of the entire
region. This gap in the literature is critical because it limits the understanding of broader
wildfire risk patterns that span across diverse geographical and environmental conditions
in NSW. Data-driven modeling, which utilizes statistical and data mining techniques, has
proven effective in making predictions by analyzing regional environmental factors (Shahfahad
et al. 2022). Many existing studies on wildfire prediction rely on the Fire Information
for Resource Management System (FIRMS) datasets from MODIS, which provide active
fire hotspot data at a 1 km² resolution (Hosseini and Lim 2022; Thi Hang et al. 2024).
This resolution often results in less precise wildfire modeling and can lead to false detec-
tions of fire events. This limitation has been highlighted in various studies, which emphasize
the need for higher-resolution data to improve the precision of fire detection and modeling
of susceptibility mapping (Gragnaniello et al. 2024; Zhang et al. 2024). Another significant
barrier to effective wildfire management is the lack of trust, explainability, and transparency
associated with the ML models. These models are often considered “black boxes” because
they operate on vast amounts of data using complex algorithms that are challenging to
interpret (Talukdar et al. 2024; Zhang et al. 2016). The complexity and opaqueness of these
models make it difficult for users to understand how conclusions are reached, undermining
confidence in their results and limiting their practical application in real-world scenarios.
The literature indicates that while ML techniques are increasingly employed to assess and
spatially map wildfire susceptibility, there is a notable challenge in explaining and interpret-
ing these models. Researchers and decision-makers often face challenges understanding
the rationale behind model predictions, which can hinder effective decision-making and
implementation (Abdollahi and Pradhan 2023). The need for models that provide accurate
predictions and clearly explain their processes and outcomes is crucial for their adoption
and use in wildfire management. Explainable Artificial Intelligence (XAI) has been proposed
as a solution to address the black box problem by developing AI systems that are
more transparent and interpretable (Alqadhi et al. 2023). XAI focuses on creating models
that humans can easily understand and explain, enhancing trust and facilitating better deci-
sion-making. By making AI systems more transparent, XAI aims to bridge the gap between
complex ML algorithms and their practical application in real-world contexts, including
wildfire management.
To address these challenges in wildfire susceptibility modeling, this study utilizes the
existing predictive strengths of four MLAs: RF, AdaBoost, XGBoost, and SVM. Each
model was selected for its unique advantages: AdaBoost enhances the accuracy of weak
classifiers, XGBoost offers scalability and efficiency for large datasets, and RF is robust
against overfitting and provides clear interpretability. The study aims to develop an ML-
driven wildfire susceptibility map for NSW in Australia, integrating XAI techniques to
enhance the transparency and understandability of the models’ decision-making processes,
thereby addressing the common issue of interpretability in ML. This study builds wildfire
occurrence datasets with a 10 m resolution using Google Earth Engine (GEE) and its satel-
lite image collections. This approach builds on freely available datasets, such as FIRMS
and Sentinel-2 mission data, to enable more precise identification of fire locations and high-
risk, fire-prone zones. The performance of the four ML techniques was evaluated based on
wildfire susceptibility mapping, using fifteen independent factors grouped into topographi-
cal, meteorological, vegetation, and socio-economic groups. A multicollinearity analysis
was also conducted to explore relationships among these factors. Model performance is
assessed using six metrics: accuracy, precision, recall, F1 score, Kappa, and cross-validation
accuracy. Additionally, the SHapley Additive exPlanations (SHAP) technique is employed
to interpret the models’ decision-making processes. The research aims to provide valuable
insights for regional and municipal authorities and policymakers to help mitigate wildfire
risks and implement effective management strategies, ultimately preventing potential eco-
logical damage.

2 Materials and methods

2.1 Study area

The study area encompasses the state of New South Wales (NSW) in the eastern region
of Australia, spanning latitudes from 28° 15′S to 37° 30′ S and longitudes from 141° E to
153° 30′ E, as depicted in Fig. 1. NSW occupies an area of 801,150 square kilometers, with
elevations ranging from −7 to 2175 m above sea level. To its north lies Queensland, while
South Australia borders its western side. Victoria is situated to the south of NSW. The Coral
and Tasman Seas delimit the eastern boundaries of NSW. Vegetation cover within NSW
predominantly comprises grasslands, shrublands, savannas, and various forest types.

2.2 Methods

Fig. 1 Map of Australia (left) highlighting the state of New South Wales (NSW) in a red bounding box.
The right panel provides a detailed view of NSW, showing the spatial distribution of burnt areas in red
during 2019–2020, indicating the extent and distribution of wildfire-affected regions

The methodology of this study comprised several key steps. Initially, data were collected
using the GEE platform and subsequently prepared for analysis. This preparation involved
constructing a comprehensive training dataset, which included historical wildfire occurrences
as the dependent factor and key wildfire determinants, such as topographic, meteorological,
anthropogenic, and vegetation-related variables as the independent factors. Before
model development, a correlation analysis and multicollinearity assessment were conducted
using Spearman’s correlation coefficients, Variance Inflation Factor (VIF), and Tolerance
values to ensure the dataset’s suitability for robust modeling. Following this, the dataset was
partitioned into training and testing subsets. Cross-validation was employed during the ML
models’ training phase to enhance the reliability and generalizability of the model outcomes.
The models were subsequently trained on the training dataset, and their performance was
validated using the testing dataset. Model performance was evaluated through a comprehen-
sive set of metrics, including accuracy, F1 score, kappa statistic, recall, precision, and con-
fusion matrix analysis. Based on these evaluations, the best performing model was selected
to generate the wildfire susceptibility map. Finally, the interpretability of the selected model
was assessed using SHAP to elucidate the contribution of each independent factor to the
model’s predictions. Figure 2 presents the methodological flow chart used in this study.
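As an illustration of the evaluation step described above, the following minimal sketch derives accuracy, precision, recall, F1 score, and Cohen's kappa from the four confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study; the formulas are the standard definitions of these metrics.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives, false negatives.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_fire = ((tp + fp) / total) * ((tp + fn) / total)
    p_nofire = ((tn + fn) / total) * ((tn + fp) / total)
    p_expected = p_fire + p_nofire
    kappa = ((accuracy - p_expected) / (1 - p_expected)
             if p_expected != 1 else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Hypothetical counts for a 100-point test set (illustration only).
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Kappa is reported alongside accuracy because it discounts the agreement that would be expected by chance given the class proportions.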

2.3 Data collection and preparation

In this study, we adapted the data collection approach proposed by Sulova and Arsanjani
(2021), assembling a dataset that includes independent variables such as land cover and
temperature, along with a dependent variable indicating the occurrence of fire or no-fire
events. Data pre-processing was critical in preparing the dataset for wildfire susceptibility
modeling. Covering the period from 2019 to 2020, this approach facilitated the efficient
preparation and incorporation of a larger volume of training data, thereby enhancing the
performance of the machine-learning models.

2.3.1 Dependent factor

Fig. 2 The flowchart illustrates the methodology used in this study

A key component in constructing a wildfire susceptibility model is the development of a
precise and comprehensive inventory that differentiates between areas affected by fire (presence)
and those unaffected (absence). Our study constructs this inventory by integrating
historical fire frequency data, enabling the capture of temporal patterns in fire occurrences.
However, due to the lack of high-resolution datasets for recent fire occurrences from official
Australian sources, an automated workflow was devised to collect data on fire and non-
fire occurrence regions. This process is executed individually for each month of the fire
season to account for evolving vegetation patterns that may influence the results (Fig. 3).
In this study, the Sentinel-2 mission was leveraged to identify burnt regions by applying
the Normalized Burn Ratio (NBR). The most significant differences in spectral responses
between healthy vegetation and burnt regions typically occur in the spectrum’s near-infra-
red (NIR) and shortwave infrared (SWIR) areas. The NBR formulation incorporates these
wavelengths as follows:

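$$\mathrm{NBR} = \frac{\mathrm{NIR}_{B8} - \mathrm{SWIR}_{B12}}{\mathrm{NIR}_{B8} + \mathrm{SWIR}_{B12}} \tag{1}$$

A higher difference in NBR (dNBR), calculated as:

$$\mathrm{dNBR} = \mathrm{NBR}_{\mathrm{prefire}} - \mathrm{NBR}_{\mathrm{postfire}} \tag{2}$$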

indicates more severely affected areas, while negative values may suggest recovery after
a fire event. The dNBR, designed for mapping burn severity, is based on multispectral
imagery, and its values can be interpreted following the guidelines from the United States
Geological Survey (USGS). This methodology discriminates burnt areas from healthy veg-
etation, providing robust wildfire detection and analysis metrics. Integrating NBR in the
preprocessing phase ensures precise identification of fire-affected regions, which is fundamental
for the training and testing of the ML models deployed in this study. This comprehensive
approach enhances the accuracy of wildfire susceptibility mapping and contributes
significantly to developing effective wildfire management and mitigation strategies.

Fig. 3 Workflow for identifying fire occurrence locations

The process draws on two primary data sources, FIRMS and Sentinel-2,
to facilitate the acquisition of fire occurrence locations. FIRMS provides active
fire locations aggregated over one month, derived from daily observations across NSW using a
bounding box of 1 km². Subsequently, the areas identified as fire locations by FIRMS are
vectorized. Following this, Sentinel-2 imagery, renowned for its high spatial resolution, is
employed. This incorporates cloud and cirrus masks generated through atmospheric correc-
tion processes to ensure the production of cloud-free images, thereby minimizing potential
inaccuracies in surface analysis. The Normalized Difference Water Index (NDWI), derived
from Sentinel-2’s near-infrared (B8) and shortwave-infrared (B11) bands, is then applied to exclude
water areas from the analysis as follows:

$$\mathrm{NDWI} = \frac{\mathrm{NIR}_{B8} - \mathrm{SWIR}_{B11}}{\mathrm{NIR}_{B8} + \mathrm{SWIR}_{B11}} \tag{3}$$

Figure 4 illustrates the distinction between the original image and the cloud and water-free
image after processing.
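The water-masking step can be sketched as follows, assuming the bands are available as nested lists of reflectance values. The helper names and the `threshold` argument are hypothetical: the text does not state which NDWI cutoff was used to flag water, so the value is left for the caller to choose, and flagging pixels above the cutoff is itself an assumption.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def water_mask(nir_b8, swir_b11, threshold):
    """Flag pixels whose NDWI (NIR B8 vs. SWIR B11) exceeds `threshold`.

    nir_b8, swir_b11: 2-D lists of reflectance values with identical shape.
    threshold: hypothetical cutoff; not specified in the source text.
    Returns a 2-D list of booleans (True = treated as water and excluded).
    """
    return [
        [normalized_difference(n, s) > threshold for n, s in zip(row_n, row_s)]
        for row_n, row_s in zip(nir_b8, swir_b11)
    ]


# Toy 1 x 2 scene with an illustrative cutoff of 0.3.
mask = water_mask([[0.30, 0.05]], [[0.20, 0.01]], threshold=0.3)
```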
The subsequent step involves computing the differenced Normalized Burn Ratio (dNBR).
Pre-fire NBR values are calculated from six days before the beginning of the month up to
the start of that month, while post-fire NBR values are derived from the end of the month
up to six days afterward, as proposed in Bretreger et al. (2024) and Sulova and Arsanjani (2021).
This methodology helps highlight burnt areas and assess burn severity by subtracting the
post-fire NBR from the pre-fire NBR, consistent with Eq. (2). However,
it is important to note that changes in natural vegetation due to other activities, such as
deforestation or harvesting, might also be detected by this method, not just changes due to fire.
Although the assessment period is relatively short, lasting only one month, a threshold value
of 0.44 for the dNBR has been established by the European Forest Fire Information System
(EFFIS) (Llorens et al. 2021) to classify areas with moderate-high or high-severity burns.
This specific threshold was chosen because it effectively distinguishes areas that
have experienced significant vegetation loss or damage due to fire from those with less
severe or no burns. The threshold was then applied within the active fire vector region
acquired from the FIRMS dataset. The objective is to exclude minor changes in natural
vegetation and improve computational efficiency, given that the calculation is limited to the
FIRMS fire vector areas. The approach combines burnt-area and active-fire data to achieve a balanced
result, considering that burnt areas may underestimate outcomes while active fire data might
overestimate them.
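The burn-detection rule described above can be sketched per pixel as below. This is a simplified illustration, not the authors' Earth Engine workflow: the function names are hypothetical, and treating the 0.44 threshold as inclusive (`>=`) is an assumption, since the text does not say whether the comparison is strict.

```python
DNBR_THRESHOLD = 0.44  # moderate-high / high severity cutoff quoted in the text


def nbr(nir_b8, swir_b12):
    """Normalized Burn Ratio for one pixel, following Eq. (1)."""
    denom = nir_b8 + swir_b12
    return (nir_b8 - swir_b12) / denom if denom != 0 else 0.0


def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """Differenced NBR, pre-fire minus post-fire, following Eq. (2)."""
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)


def is_burnt(dnbr_value, inside_firms_area, threshold=DNBR_THRESHOLD):
    """A pixel counts as burnt only if it lies inside a FIRMS active-fire
    area AND its dNBR reaches the threshold (inclusive by assumption)."""
    return inside_firms_area and dnbr_value >= threshold


# Toy pixel: healthy vegetation before the fire, charred surface afterwards.
severity = dnbr(pre_nir=0.5, pre_swir=0.1, post_nir=0.2, post_swir=0.3)
burnt = is_burnt(severity, inside_firms_area=True)
```

Restricting the test to FIRMS areas mirrors the rationale in the text: vegetation change outside detected fire areas (for example, harvesting) is not counted as fire.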
The process begins by vectorizing the selected burnt regions within the fire area bounding
boxes. The area of each selected burnt region is then calculated, and only areas larger
than 0.25 km² (equivalent to a 500 m × 500 m raster) are considered. This criterion
ensures that the random points are situated within significant burnt areas, as represented
by pixels covering this specified area threshold. For no-fire point selection, a random point
function generates points outside the FIRMS vector areas. The random point function generates
300 points representing fires and 300 points representing no fires for each month,
giving 600 points per month. These data are then merged into a final file. Each entry in the final
file includes an integer value for “Fire”: 1 indicates the occurrence of a fire, while 0 denotes
the absence of a fire.

Fig. 4 Cloud-masked composite (left) and cloud- and water-free masked composite (right)
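The monthly sampling scheme can be sketched as follows, assuming candidate point coordinates have already been extracted for burnt areas and for the region outside the FIRMS vectors. The function names, the coordinate tuples, and the `seed` argument are hypothetical; only the 0.25 km² area threshold, the 300-point class size, and the 0/1 "Fire" label come from the text.

```python
import random

MIN_BURNT_AREA_KM2 = 0.25  # area threshold stated in the text
POINTS_PER_CLASS = 300     # fire and no-fire points per month, as stated


def keep_significant_areas(areas_km2, min_area=MIN_BURNT_AREA_KM2):
    """Keep only burnt polygons strictly larger than the area threshold."""
    return [a for a in areas_km2 if a > min_area]


def build_monthly_samples(fire_candidates, nofire_candidates,
                          n_per_class=POINTS_PER_CLASS, seed=0):
    """Draw an equal number of fire and no-fire points for one month.

    fire_candidates / nofire_candidates: lists of (x, y) tuples.
    Returns a list of dicts with an integer 'Fire' label (1 = fire, 0 = no fire).
    """
    rng = random.Random(seed)
    fire = rng.sample(fire_candidates, n_per_class)
    nofire = rng.sample(nofire_candidates, n_per_class)
    rows = [{"x": x, "y": y, "Fire": 1} for x, y in fire]
    rows += [{"x": x, "y": y, "Fire": 0} for x, y in nofire]
    return rows


# Hypothetical candidate coordinates for a single month.
fire_pts = [(i, i) for i in range(400)]
nofire_pts = [(-i, -i) for i in range(1, 401)]
month_rows = build_monthly_samples(fire_pts, nofire_pts)
```

Drawing the same number of points per class keeps the resulting dataset balanced, which matters for metrics such as accuracy that are sensitive to class imbalance.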

2.3.2 Independent factors

In this study, we selected 15 independent factors through a combination of field-based
observations used in various studies and satellite imagery available through the Google
Earth Engine platform. These independent factors for predicting wildfires are categorized
into four groups, following Table 1: topography, environment, weather, and human impact
(socio-economic) factors. Table 1 provides a summary of each dataset utilized in this study.
Topographic: The three key factors represented in the topographic category are elevation,
slope, and aspect, all shown in Fig. 5. Data on elevation were derived from a digital
elevation model (DEM) with a 30-meter spatial resolution, developed using NASA’s
Shuttle Radar Topography Mission (SRTM). We derived two more factors from this DEM:
slope (intended to describe the angle of the steepness of the terrain) and aspect (to indicate
the compass direction a slope faces). These topographic factors are very important for
understanding how landscape features control wildfire susceptibility and provide necessary
insights into the role of the terrain in fire behavior and spread.

Table 1 Data description
Category       Data layer                       Source of data           Data type   Spatial resolution
Topography     Elevation                        SRTM Digital Elevation   Raster      30 m
Topography     Slope                            SRTM                     Raster      30 m
Topography     Aspect                           SRTM                     Raster      30 m
Environment    Soil Depth                       CSIRO SLGA               Raster      ~90 m
Environment    Soil Moisture                    TerraClimate             Raster      ~4 km
Environment    Land Cover                       Copernicus CGLS-LC100    Raster      100 m
Environment    NDVI                             MODIS NDVI               Raster      250 m
Environment    Drought Severity Index           TerraClimate             Raster      ~4 km
Weather        Precipitation                    TerraClimate             Raster      ~4 km
Weather        Maximum Temperature              TerraClimate             Raster      ~4 km
Weather        Wind Speed                       TerraClimate             Raster      ~4 km
Human Impact   Human Population Distributions   World Population         Raster      ~85 m
Human Impact   Global Human Modification        CSP gHM5                 Raster      1 km
Human Impact   Electric Line                    OSM                      Vector      500 m
Human Impact   Road Network                     OSM                      Vector      500 m

Fig. 5 Topographical factors: elevation, aspect, and slope

Fig. 6 Climate factors: precipitation, maximum temperature, and wind speed

Weather: The weather variables category, illustrated in Fig. 6, consists of key factors
significantly influencing wildfire susceptibility: precipitation accumulation, maximum temperature,
and wind speed. These crucial climate data were obtained from the TerraClimate
dataset. The data processing was consistent across all three factors, involving data collec-
tion from September through December 2019. We applied the mean statistic function to the
collected data to derive representative values for this period. This approach allowed us to
capture the general climatic conditions during the months leading up to and including the
early part of the fire season, providing crucial insights into the atmospheric factors influenc-
ing wildfire susceptibility in the study area.
Environmental: The environmental category, as shown in Fig. 7, incorporates several
crucial factors: land cover, soil depth, soil moisture, drought severity index (DSI), and the
Normalized Difference Vegetation Index (NDVI). Land cover data, with a 100-meter spatial
resolution for 2015–2019, was obtained from the Copernicus Global Land Service. The
Soil and Landscape Grid of Australia dataset provided soil depth information. Both the
soil moisture and DSI were derived from the TerraClimate dataset, which employs climatically
aided interpolation to combine high-spatial-resolution climatological normals from
WorldClim with time-varying data from Climatic Research Unit (CRU) Ts4.0 and Japanese
55-year Reanalysis (JRA55) (Abatzoglou et al. 2018). For this study, mean values of soil
moisture and DSI were calculated from September to December 2019 due to data availability
constraints; ideally, these would cover the entire fire season. The NDVI, sourced from the MOD13Q1
product at 250-meter resolution, was generated by averaging data throughout the fire season.
These environmental factors collectively offered a comprehensive view of the ecological
conditions influencing wildfire susceptibility in the study area.

Fig. 7 Environmental factors: land cover, soil depth, soil moisture, drought severity index, and NDVI

Fig. 8 Socio-economic factors: GHM, population, electric lines, and distance from roads

Human impact: As depicted in Fig. 8, the human impact category encompasses various
datasets to quantify human modification of terrestrial landscapes. One key dataset is
the Global Human Modification (GHM), which comprehensively measures human impact
on land at a spatial resolution of 1 km. GHM values (0–1) represent the extent of human
modification associated with various stressors such as human settlements, transportation
infrastructure, mining activities, and energy production facilities. Additionally, population
data from the WorldPop dataset offered estimates of the population density within approximately
85-meter grid cells. This information provided insights into the distribution and
density of human settlements across the landscape. Furthermore, vector data representing
electric lines and road networks, sourced from OpenStreetMap (OSM), were integrated into
the GEE platform. These vector datasets were rasterized to a resolution of 500 m, enabling
analysis of their spatial relationship with other environmental factors. For road network
data, GEE’s cumulative cost function was utilized to compute distances from roads, facilitating
the assessment of accessibility and infrastructure impacts on the landscape.
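To convey the idea behind a distance-from-roads layer, the sketch below runs a multi-source breadth-first search over a rasterized road grid. It is a deliberately simplified stand-in, not Earth Engine's cumulative cost function: it assumes a uniform cost per cell, 4-connected moves only, and a hypothetical function name; the 500 m cell size matches the rasterization resolution given in the text.

```python
from collections import deque


def distance_to_roads(road_grid, cell_size_m=500.0):
    """Grid distance (in metres) from every cell to the nearest road cell.

    road_grid: 2-D list where truthy values mark road cells.
    Moves are restricted to the four direct neighbours, so distances are
    Manhattan-style approximations. Cells stay None if the grid has no roads.
    """
    rows, cols = len(road_grid), len(road_grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every road cell at distance zero.
    for r in range(rows):
        for c in range(cols):
            if road_grid[r][c]:
                dist[r][c] = 0.0
                queue.append((r, c))

    # Expand outward one ring of cells at a time.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + cell_size_m
                queue.append((nr, nc))
    return dist


# Toy 2 x 3 raster with a single road cell in the top-left corner.
distances = distance_to_roads([[1, 0, 0], [0, 0, 0]])
```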

2.4 Training

After generating the training points for fire and non-fire occurrences and preprocessing the
conditioning factors, the subsequent stage involved constructing the training dataset by
augmenting it with predictor values; the 15 independent factors were combined to form a
composite image of 15 bands. Because the selection of ML models currently available within the
GEE platform is restricted, we exported our datasets for use with other libraries that offer a
broader range of models. The dataset containing fire and no-fire locations was split into
training and test sets for model validation using a method that involves adding a new column
to the dataset with assigned values for the split. The data points were divided in a 70:30 ratio,
with 70% used for training and 30% for testing. Accuracy assessment was performed on the
test set using a confusion matrix to evaluate the model’s performance. To ensure reproducibility,
the dataset was shuffled before splitting, and a seed was set, enabling consistent train-test
splits across multiple runs. Additionally, to enhance the performance of the machine-learning
models and avoid overfitting, we opted for a cross-validation technique, specifically a stratified
K-fold cross-validation splitter, which maintains the proportion of target classes in each subset
(fold). This approach involved dividing the dataset into 30 folds, with some subsets used for
training the model and others for validation, allowing for better generalization of the models by
leveraging multiple subsets during the training process (Table 2).
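The splitting logic described above can be sketched in plain Python as follows. This is an illustrative approximation, not the library routine the authors used: the function names and the seed value 42 are hypothetical, while the 70:30 ratio, the shuffle-with-seed step, and the 30 stratified folds follow the text.

```python
import random


def shuffle_and_split(n_samples, train_fraction=0.7, seed=42):
    """Shuffle sample indices with a fixed seed, then cut them 70:30."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def stratified_folds(labels, n_folds=30, seed=42):
    """Assign sample indices to folds so each fold keeps the class proportions.

    Indices of each class are shuffled separately and dealt out to the folds
    in round-robin order.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(n_folds)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for position, index in enumerate(class_indices):
            folds[position % n_folds].append(index)
    return folds


# One month of data as described in the text: 300 fire and 300 no-fire points.
labels = [1] * 300 + [0] * 300
train_idx, test_idx = shuffle_and_split(len(labels))
folds = stratified_folds(labels)
```

Fixing the seed makes both the split and the fold assignment repeatable across runs, which is the reproducibility property the text emphasizes.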

2.5 ML algorithms

Random Forest (RF) is an ensemble learning method that generates numerous
decision trees during the training phase. For classification tasks, it outputs the most frequent
class, and for regression tasks, it calculates the average prediction from the individual
trees (Zhou et al. 2023). Key parameters influencing its performance include n_estimators,
which controls the number of trees in the forest, with higher values often leading to better
model performance; max_depth, which limits the depth of each tree to prevent overfitting;
min_samples_split and min_samples_leaf, which set the
minimum number of samples needed to split an internal node and the minimum number of
samples required in a leaf node, respectively, affecting the tree’s growth and
complexity; and max_features, which specifies the number of features to consider when looking
for the best split, impacting the diversity and accuracy of the forest (Zhou et al. 2023).

Table 2 Hyper-parameters
Algorithm   Hyperparameters                                                           Value
RF          n_estimators, max_features, max_depth, criterion, learning_rate           50, 2, None, Gini, 0.1
SVM         kernel type, C, gamma                                                     GRBF, 10, 0.1
XGBoost     n_estimators, max_features (colsample_bytree), max_depth, learning_rate   50, 1, 8, 0.1
AdaBoost    n_estimators, max_features, max_depth                                     50, 2, None

13
Natural Hazards (2025) 121:15331–15357 15343

Support Vector Machines (SVM), a cornerstone of supervised learning, excel in high-dimensional
spaces. They use a set of mathematical functions known as kernels to transform the input data
space into a higher-dimensional space where the data points become easier to separate
linearly. Essential parameters include C, the regularization parameter that sets the
trade-off between a smooth decision boundary and classifying training points correctly;
kernel, which can be linear, polynomial, RBF, or sigmoid and fundamentally alters the shape of
the decision function; and gamma in non-linear kernels, which determines the influence radius
of a single training example, with higher values potentially leading to overfitting.
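
The effect of gamma can be illustrated with the standard RBF kernel, exp(-gamma * ||x - y||^2). The sketch below is illustrative only: the two feature vectors are hypothetical, and gamma = 0.1 is taken from Table 2.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * ||x - y||^2) for two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical, already-scaled feature vectors.
a = [0.2, 0.5]
b = [0.6, 0.1]

similarity_table2 = rbf_kernel(a, b, gamma=0.1)  # gamma as listed in Table 2
similarity_large = rbf_kernel(a, b, gamma=10.0)  # much narrower influence radius
```

With the larger gamma the similarity between the same two points collapses toward zero, so each training example influences only its immediate neighbourhood, which is the mechanism behind the overfitting risk noted above.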
AdaBoost (Adaptive Boosting) operates by combining multiple weak learners, typi-
cally decision trees, into a strong classifier through a sequential process where subsequent
weak learners correct the misclassifications of their predecessors. Its performance is tuned
through parameters like n_estimators, dictating the number of weak learners to use;
learning_rate, adjusting the contribution of each learner, with lower rates requiring more
estimators but potentially improving generalization; and base_estimator, defining the type of weak
learner model, with the default being a depth-one decision tree (Plaia et al. 2022).
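
The sequential reweighting can be sketched for a single boosting round. This is the classic two-class formulation and is shown only as an illustration; library implementations use closely related variants, and the weights and outcomes below are hypothetical.

```python
import math


def adaboost_round(weights, correct, learning_rate=1.0):
    """One discrete AdaBoost round.

    `correct[i]` is True when the weak learner classified sample i correctly.
    Returns the learner weight (alpha) and the renormalised sample weights.
    Assumes the weighted error lies strictly between 0 and 1.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    alpha = learning_rate * 0.5 * math.log((1.0 - error) / error)
    # Misclassified samples are up-weighted, correct ones down-weighted.
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four hypothetical samples with equal starting weights; the last one is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
alpha, new_weights = adaboost_round(weights, correct)
```

After this round the misclassified sample carries half of the total weight, so the next weak learner concentrates on it.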
XGBoost (Extreme Gradient Boosting), an optimized distributed gradient boosting
library, enhances the basic boosting technique’s speed and performance (Wang et al. 2023).
It is renowned for its efficiency in handling sparse data, implementing tree pruning, and
regularization to prevent overfitting. Critical parameters include n_estimators and
learning_rate, controlling the number of boosting rounds and the step size shrinkage, respectively;
max_depth, which limits tree depth for complexity control; subsample and colsample_bytree,
dictating the fraction of samples and features used per tree, enhancing robustness;
and lambda and alpha, providing L2 and L1 regularization on weights, respectively, further
aiding in model simplification and performance (Wang et al. 2023).
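
How lambda, alpha, and learning_rate act on a single leaf can be illustrated with the closed-form leaf value commonly given for regularized gradient boosting. This is a sketch of the general technique, not code from the study; the function name and the gradient and Hessian sums are hypothetical.

```python
def leaf_weight(grad_sum, hess_sum, reg_lambda=1.0, reg_alpha=0.0, learning_rate=0.1):
    """Shrunken leaf value under L2 (lambda) and L1 (alpha) penalties."""
    # The L1 penalty soft-thresholds the gradient sum toward zero.
    if grad_sum > reg_alpha:
        shrunk = grad_sum - reg_alpha
    elif grad_sum < -reg_alpha:
        shrunk = grad_sum + reg_alpha
    else:
        shrunk = 0.0
    # The L2 penalty enlarges the denominator; learning_rate scales the step.
    return -learning_rate * shrunk / (hess_sum + reg_lambda)


# Hypothetical gradient and Hessian sums for the samples in one leaf.
plain = leaf_weight(-4.0, 3.0)                      # L2 only
with_l1 = leaf_weight(-4.0, 3.0, reg_alpha=1.0)     # L1 shrinks the update
pruned = leaf_weight(-0.5, 3.0, reg_alpha=1.0)      # small signal is zeroed out
```

Both penalties pull leaf values toward zero, which is how they simplify the model.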

2.6 Evaluation metrics

In this study, the classifiers’ performance was evaluated using several key metrics derived
from the confusion matrix, including True Positives (TP), False Positives (FP), True
Negatives (TN), and False Negatives (FN). Table 3 provides a detailed list of the metrics
employed to assess the effectiveness of the ML models developed in this study. These met-
rics collectively offer a robust framework for evaluating the overall performance and reli-
ability of the classification models.

Table 3 Metrics used in the study

Metrics | Formula | Evaluation
Accuracy | $\frac{TP + TN}{TP + FN + FP + TN}$ | Measures the overall correctness of the classifier, indicating total efficacy.
Precision | $\frac{TP}{TP + FP}$ | Ensure consistency between the class labels in the input data and the positive labels identified by the classifier.
Recall | $\frac{TP}{TP + FN}$ | Efficacy of a classifier to find positive labels.
F1-Score | $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ | The association between positive labels in the input data and those allocated by a classifier.
AUC | $\frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right)$ | The ability of a classifier to minimize incorrect classifications.
Accuracy (CV) | - | Averages accuracy across multiple folds in cross-validation, indicating the model's generalization capability.
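
The formulas in Table 3 translate directly into code. The sketch below is a minimal illustration; the function name is hypothetical, and the counts are invented to keep the arithmetic easy to follow rather than taken from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 3 metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # AUC as written in Table 3: mean of the true positive and true negative rates.
        "auc": 0.5 * (recall + specificity),
    }


# Hypothetical counts used only for illustration.
scores = classification_metrics(tp=90, tn=80, fp=10, fn=20)
```

For these counts, accuracy is 0.85 and precision is 0.90. The formulas above treat fire as the positive class, so values computed this way may differ from class-averaged figures.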

3 Results

3.1 Correlation analysis of independent factors

Table 4 presents the Variance Inflation Factor (VIF) and Tolerance values calculated in
this study. The highest VIF observed was 12.36201 for temperature, with a corresponding
Tolerance of 0.097976, indicating multicollinearity. Despite this, multicollinearity was not a
primary concern for most of the independent factors, as all other Tolerance values remained
above 0.1, and only the temperature’s VIF exceeded the threshold of 10. Temperature was
retained in the analysis due to its crucial role in wildfire susceptibility mapping. As a key
environmental factor influencing wildfire ignition and propagation, excluding temperature
could lead to an incomplete assessment of wildfire risk, diminishing the model’s practical
utility for wildfire management. Furthermore, the heatmap of Spearman’s correlation coef-
ficients (Fig. 9) reveals that temperature is not highly correlated with other factors. This
suggests that the elevated VIF may not significantly compromise the model’s stability. The
analysis ensures that all relevant environmental factors are considered by retaining tem-
perature, resulting in a more robust and comprehensive susceptibility map. The heatmap
in Fig. 9 also highlights the relationship between the number of fires (dependent variable)
and various wildfire factors (independent factors). Precipitation exhibits a relatively high
correlation with the target variable at 0.72, followed closely by NDVI and Soil Moisture,
both at 0.69. Additionally, the correlation between Precipitation and Soil Moisture is robust,
reaching a value of 0.91. Although this is the highest observed correlation, it remains below
the exclusion threshold of 0.95, ensuring no factors were excluded based on the correlation
analysis. As a result, all factors were retained for further analysis, thereby maintaining the
model’s comprehensiveness and ability to reflect the factors influencing wildfire susceptibil-
ity accurately.
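
The correlation screening described above can be sketched as follows. The implementation computes Spearman's coefficient as the Pearson correlation of ranks; the 0.95 threshold comes from the text, while the function names and sample values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


EXCLUSION_THRESHOLD = 0.95  # threshold stated in the text

# Hypothetical samples of two factors.
precipitation = [10.0, 20.0, 30.0, 40.0, 50.0]
soil_moisture = [0.10, 0.25, 0.20, 0.40, 0.55]
rho = spearman(precipitation, soil_moisture)
drop_one = abs(rho) > EXCLUSION_THRESHOLD
```

In this toy example the coefficient is 0.9, below the threshold, so neither factor would be excluded.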

Table 4 Multicollinearity analysis for wildfire independent factors

Feature | VIF | Tolerance
Aspect | 1.052173 | 0.950414
Distance From Road | 1.286389 | 0.77737
Drought Severity Index | 2.274531 | 0.439651
Soil Moisture | 3.990696 | 0.250583
Precipitation | 9.789944 | 0.142146
Electric Network | 1.013269 | 0.986905
Elevation | 4.719203 | 0.2119
GHM | 1.705182 | 0.586448
Land Cover | 1.755837 | 0.569529
NDVI | 5.0136 | 0.199457
Population | 1.104477 | 0.905406
Slope | 2.210693 | 0.452347
Soil Depth | 1.773432 | 0.563879
Wind Speed | 3.724423 | 0.268498
Temperature | 12.36201 | 0.097976
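
The screening rule applied to Table 4 can be sketched as below. The sketch assumes the standard definition of tolerance as the reciprocal of VIF, which is an assumption about how the table was computed. The limits of 10 and 0.1 come from the text, and the function names are hypothetical.

```python
def tolerance_from_vif(vif):
    """Tolerance under the standard definition: the reciprocal of VIF."""
    return 1.0 / vif


def is_collinear(vif, vif_limit=10.0, tolerance_limit=0.1):
    """Flag a factor whose VIF exceeds the limit or whose tolerance falls below it."""
    return vif > vif_limit or tolerance_from_vif(vif) < tolerance_limit


# VIF values for three factors as listed in Table 4.
vif_values = {"Aspect": 1.052173, "Elevation": 4.719203, "Temperature": 12.36201}
flagged = [name for name, vif in vif_values.items() if is_collinear(vif)]
```

Only Temperature is flagged for these three factors, consistent with the discussion in Sect. 3.1.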

Fig. 9 Heat map of Spearman’s correlation coefficients between pairs of factors

Table 5 Evaluation of the prediction performance of the four ML models

Models | Precision | Recall | F1 | Kappa | Accuracy | Accuracy (CV) | TP | TN | FP | FN
Random Forest | 0.945 | 0.975 | 0.960 | 0.926 | 0.945 | 0.958 | 1386 | 1158 | 67 | 30
Ada Boost | 0.942 | 0.949 | 0.945 | 0.900 | 0.942 | 0.949 | 1383 | 1127 | 70 | 61
XGBoost | 0.968 | 0.963 | 0.965 | 0.937 | 0.968 | 0.974 | 1415 | 1144 | 38 | 44
SVM | 0.852 | 0.850 | 0.851 | 0.730 | 0.852 | 0.866 | 1278 | 1010 | 175 | 178

Columns Precision to Accuracy (CV) are evaluation metrics; TP, TN, FP, and FN are confusion matrix counts.

3.2 Performance of the ML models

When the performance of the four ML models is compared across multiple metrics, a clear
hierarchy emerges, with XGBoost demonstrating superior predictive capabilities. Table 5
presents the performance metrics of all four models. XGBoost performed better than the other
models overall, achieving the highest Precision (0.968), F1 Score (0.965), Kappa Statistic
(0.937), and Accuracy (0.968), together with a Recall of 0.963. Furthermore, its
Cross-Validated Accuracy of 0.974 underscores its robustness and strong generalization
capability to unseen data. Random Forest (RF) followed closely behind XGBoost and achieved the
highest Recall (0.975), highlighting its effectiveness in identifying positive cases.
AdaBoost also performed well,
demonstrating proficiency in identifying positive cases with Precision at 0.942 and Recall at
0.949. In contrast, SVM lagged behind the ensemble models, showing comparatively lower
Precision at 0.852 and Recall at 0.850. Examining the confusion matrices provides further
insight into the models’ performance. XGBoost demonstrated the highest effectiveness in
correctly classifying instances, with the fewest false positives (38) and the fewest
misclassifications overall (82), although RF produced fewer false negatives (30 versus 44).
RF also performed strongly, with a marginally lower precision but higher recall than
XGBoost. Ada Boost exhibited a balance between precision and recall, while SVM showed
significantly higher false positives and false negatives, reflecting lower precision and recall.
Ensemble methods, particularly XGBoost and RF, outperform the stand-alone SVM across the
classification metrics. The choice between these models should consider specific application
requirements, including precision, recall, misclassification costs, and model interpretability.
Figure 10 depicts the bar chart of the models’ performance.
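
Table 5 reports the Kappa statistic, which is not defined in Table 3. The sketch below assumes the standard definition of Cohen's kappa for a binary confusion matrix and applies it to the XGBoost counts from Table 5; the function name is hypothetical.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the predicted and actual class totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)


# XGBoost counts from Table 5.
kappa_xgb = cohen_kappa(tp=1415, tn=1144, fp=38, fn=44)
```

Rounded to three decimals, this gives 0.937, matching the Kappa value listed for XGBoost.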

3.3 Susceptibility map

A wildfire susceptibility map for the study area was generated using the XGBoost model
and a training dataset to predict wildfires in NSW, Australia, during the 2019–2020 season.
Figure 11 depicts fire risk classes for the selected area categorized into five levels. The maps
indicate a concentrated high risk of fire occurrence in the coastal area of the study region.
Additionally, fire-prone zones can be observed scattered throughout the northern coastal
regions. The distribution of each wildfire susceptibility class across the study area reveals
that zones categorized as very low to low susceptibility encompass approximately 77% of
the study area. In comparison, about 23% of the land is classified as high to very high in
terms of wildfire susceptibility. The moderate susceptibility zone constitutes roughly 9% of
the study area.
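
Converting per-pixel probabilities into five susceptibility classes and area shares can be sketched as follows. The text does not state how the class boundaries were chosen, so the equal-width intervals used here are purely an assumption, and the pixel values are hypothetical.

```python
from collections import Counter

CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]


def susceptibility_class(probability):
    """Map a probability in [0, 1] to one of five equal-width classes (assumed scheme)."""
    index = min(int(probability * 5), 4)  # a probability of 1.0 falls in the top class
    return CLASS_NAMES[index]


def area_share(probabilities):
    """Percentage of pixels that fall in each class."""
    counts = Counter(susceptibility_class(p) for p in probabilities)
    return {name: 100.0 * counts[name] / len(probabilities) for name in CLASS_NAMES}


# Hypothetical per-pixel probabilities from a fitted classifier.
pixels = [0.05, 0.12, 0.33, 0.47, 0.58, 0.71, 0.86, 0.93, 1.0, 0.18]
shares = area_share(pixels)
```

Because every pixel is assigned to exactly one class, the resulting shares sum to 100%. Other schemes, such as quantile or natural-breaks classification, would change the boundaries but not this property.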

Fig. 10 Bar chart of the evaluation of the prediction performance of the four ML models

Fig. 11 Wildfire susceptibility maps produced using the XGBoost model

Fig. 12 SHAP values of the importance of each feature (left), the impact of each feature (right)

3.4 Model explainability

We chose the XGBoost model to perform XAI analysis and interpret what the model learned.
Its well-balanced performance across all evaluation metrics justifies this choice. To
aggregate the entire dataset into a single visual representation and calculate the mean absolute
SHAP values for each feature, we employed a standard bar graph in Fig. 12 (left). On the
X-axis, these SHAP values represent the degree of change in log odds. In this case, every
feature is continuous, and a vertical hierarchy arranges its mean effect on the classification
outcome. These rankings place the feature with the least impact at the bottom and the one
with the most significant impact at the top, clearly delineating each feature’s contribution
to the predictive model. The bar chart indicates that precipitation, drought conditions, soil
moisture, and the NDVI are the most significant predictors.
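
The ranking behind such a bar chart can be sketched as the mean absolute SHAP value per feature. The SHAP matrix below is hypothetical and serves only to show the computation; it does not reproduce the study's values.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across samples."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


features = ["precipitation", "drought", "wind_speed"]
# Hypothetical SHAP values: one row per sample, one column per feature.
shap_rows = [
    [1.2, -0.8, 0.1],
    [-1.5, 0.9, -0.2],
    [0.9, -0.4, 0.0],
]
ranking = mean_abs_shap(shap_rows, features)
```

Taking absolute values before averaging means that features with large positive and negative contributions are both ranked as influential, regardless of direction.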
A summary plot was also developed to integrate the significance of features with their
respective impacts. This plot displays a dot for every feature and data point, where each
dot corresponds to the Shapley value for that feature in the given sample. As presented in
Fig. 12 (right), the Y-axis lists the features, while the X-axis quantifies the Shapley values.
The color gradient represents the value of the feature, transitioning from low to high. Points
plotted further to the right on the X-axis indicate a positive contribution, with red tones sig-
nifying higher feature values. The features are ranked vertically by their mean impact on the
model’s predictions, as established by (Ribeiro et al. 2016). The spread of the dots along the
Y-axis, achieved through jittering, illustrates the distribution of the Shapley values for each
feature. From this visualization, it is evident that humidity is the most impactful feature on
the predictions, with higher humidity levels generally leading to more negative effects on
them. In contrast, the presence of the target factor and increased wind speed are associated
with a positive influence on the predictions.
To enhance our comprehension of how different factors interplay in predicting wildfires,
we used SHAP values to visualize their interactions in Fig. 13. This visualization showcases
the interaction of Precipitation with Drought in Fig. 13 (left) and of Soil Moisture with NDVI
in Fig. 13 (right). The intensity of the SHAP values for each variable is represented by color,
while the X and Y axes display the respective magnitudes of these factors. We observed the
effect of Precipitation on Drought across a range from -0.20 to -0.04 and the impact of Soil
Moisture on NDVI across a range from 0.3 to 0.9. The colors red and blue indicate high
and low values for Drought and NDVI, respectively. Specifically, Fig. 13 (left) indicates that
SHAP values for precipitation drop below 0 when precipitation is under 0.004; this
combination of low Precipitation and Drought leads to significantly lower SHAP values, thereby
increasing the predicted risk of wildfires.

Fig. 13 The proposed method’s SHAP dependence plots on (left) precipitation and drought and (right) soil
moisture and NDVI for wildfire prediction

Figure 15 introduces a waterfall plot aimed at breaking down individual predictions. Here,
negative values indicate an inverse relationship with the propensity for wildfires, and
positive values denote a direct relationship as determined by the model. For instance, a decrease
in rainfall is linked to an increased fire risk. The influence of factors on wildfire risk is
underscored by their magnitude, with precipitation, drought, and NDVI having significant
weights of +4.53, +3.38, and +1.87, respectively, indicating a strong link to wildfire
occurrences. In essence, variations in these elements are closely tied to the likelihood of wildfire
events. Factors like electric network, land cover, wind speed, and aspect show the least
influence. In contrast, soil moisture, temperature, GHM, and distance from the road exhibit
a moderate association with wildfire risk. In Fig. 14, a force plot delineates how various
features contribute to shifting the model’s prediction from a base value toward the predicted
outcome. Red indicates features associated with an increase in the prediction value, while
blue signifies features contributing to a decrease. For example, the model’s forecast is 1.00
against a baseline of 0.4617 in predicting wildfire occurrence. Features like high soil depth,
wind speed, and slope, along with temperature and soil moisture, are factors that elevate the
prediction value. Conversely, high precipitation levels tend to lower the forecast.
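
The additive reading of waterfall and force plots can be sketched as follows: per-feature contributions are added to the base value, and the total is mapped to a probability. The sketch assumes that the contributions are expressed in log odds and that the base value of 0.4617 is a probability. Combining it with the three weights quoted above in a single prediction is also an assumption made only for illustration.

```python
import math


def predicted_probability(base_probability, contributions):
    """Add SHAP contributions to the base value in log-odds space, then map back."""
    base_log_odds = math.log(base_probability / (1.0 - base_probability))
    log_odds = base_log_odds + sum(contributions)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Base value and the three largest weights quoted in the text (illustrative pairing).
probability = predicted_probability(0.4617, [4.53, 3.38, 1.87])
```

With no contributions the function returns the base value itself; with these three positive contributions the probability rounds to 1.00.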

4 Discussion

Fig. 14 The local interpretation of the model’s prediction is visualized using a SHAP force plot, where
red feature attributions increase the prediction above the “base value,” and blue attributions decrease it

Wildfires seriously endanger both the environment and socio-economic activities
(Patel et al. 2021). In Australia, they seriously harm croplands and natural woodlands,
especially in NSW (Hosseini and Lim 2022). This area has seen numerous wildfires in
recent decades, with significant negative impacts on the environment, the economy, and the
human population (Gill et al. 2015). As such, modeling this region’s wildfire
susceptibility has become imperative. This study employed four ML models to predict
wildfire susceptibility in the NSW region. Machine learning algorithms (MLAs) have grown in
popularity as one of the most effective techniques for wildfire susceptibility mapping
(Achu et al. 2021; Le et al. 2021). In the present study, performance in predicting wildfire
susceptibility was evaluated and compared between the stand-alone SVM classifier and the
tree-based ensemble classifiers (RF, XGBoost, Ada Boost) for the study area. Based on the
quantitative analysis of the classification performance, XGBoost showed the best values
in most performance measures compared with the other classifiers, while RF achieved the
highest recall (Table 5). Other researchers have also reported that XGBoost and RF models
perform very well in fire susceptibility studies (Cao et al. 2017; Pouyan et al. 2021).
Additionally, this study indicates that ensemble learners surpass the stand-alone classifier
across all performance metrics, which aligns with previous findings (Sachdeva et al. 2018;
Hong et al. 2018). Therefore, it may not be relevant to assess the performance of these two
ensemble algorithms in future research. The RF and Ada Boost classifiers also demonstrated
strong predictive capabilities, showing the promise of their tree-based ensemble structure.
The predictive accuracy of the XGBoost model in the present study was deemed satisfactory, as
the confusion matrix recorded only 38 false positives and 44 false negatives. Consequently,
this model was used to create a susceptibility map that illustrates the spatial probability of
wildfire occurrence. Specifically, the map indicates the likelihood of each pixel burning
based on assumptions derived from independent factors unique to each area. A key benefit of
this model is its flexibility in integrating various causal factors seamlessly. The XGBoost
model demonstrated the best performance and facilitated variable importance analysis.
Our research enhances the current understanding of WSM by not only validating the
effectiveness of ML models, as shown in previous studies, but also providing an in-depth
comparative evaluation using various performance metrics. This approach provides a more
nuanced understanding of how these models perform and their reliability in practical scenar-
ios. By employing a thorough methodology, we ensure that our results are robust and appli-
cable to real-world wildfire prevention strategies, facilitating more effective and informed
decision-making processes. The predictions from all four models indicated that approxi-
mately 9.86% of the total area is categorized as ‘very high’ wildfire-prone and 6.05% as
‘high’ wildfire-prone. These predicted zones are predominantly located in coastal forested
regions in NSW, grasslands, and agricultural areas. These results align with the findings
of previous studies (Deb et al. 2020; Hosseini and Lim 2022). Similar observations were
made concerning wildfire susceptibility in protected forests, such as National Parks and
state forests, which fall within the ‘high’ and ‘very high’ susceptibility zones. An area’s
vulnerability to wildfires is shaped by the complex interplay of topographic, climatic, and
anthropogenic factors (Abujayyab et al. 2022; Shahfahad et al. 2022). This research
underscores the importance
of understanding these interactions to improve predictive accuracy and develop more effec-
tive wildfire management strategies.
Understanding the key influential factors is crucial for developing accurate predictive
models and effective mitigation strategies. This study used XAI techniques to provide an
in-depth view of the wildfire susceptibility prediction model and to highlight the importance
of interpretability. In this study, we observed that high precipitation contributes to
wildfire risk, which contradicts established wildfire dynamics, where low precipitation is
typically associated with increased fire danger. Similar findings were reported by
(Abdollahi and Pradhan 2023), who identified a positive relationship between high rainfall and
wildfire susceptibility in Victoria, Australia. Furthermore, their study also highlighted high
humidity as a contributing factor to wildfire risk, a result that contrasts with conventional
wildfire behavior. Drought emerged as the second most influential factor in our analysis,
exhibiting both positive and negative effects. Our findings did not show a direct correlation
between high drought conditions and increased wildfire risk, which aligns with the observations
of (Shmuel and Heifetz 2022). Nevertheless, the significance of drought in wildfire occurrence
has been extensively studied. For instance, the study (Filkov et al. 2020) demonstrated that
prolonged drought conditions lead to drier vegetation, which acts as a catalyst for ignition
and propagation. Soil moisture and NDVI were identified as the following most important
factors. The influence of soil moisture on wildfire danger has been extensively discussed by
(Haydar et al. 2024), who highlighted that low soil moisture levels indicate dry fuel
conditions, increasing fire risk. Similarly, NDVI, a measure of vegetation greenness, plays a
critical role in wildfire dynamics. Studies such as (Iban and Sekertekin 2022; Moayedi
and Khasmakhi 2023) found that lower NDVI values correlate with higher wildfire risk due
to reduced vegetation cover; we did not find this relationship in our analysis.
Instead, our results indicate that high NDVI values are associated with increased wildfire
danger, likely due to enhanced vegetation growth. These findings align with (Tran et al.
2024), who reported similar trends. Moderately influential factors in our study included
GHM, temperature, population density, soil depth, and wind speed. The role of human activ-
ities in wildfire ignition is supported by (Iban and Sekertekin 2022), who found that human
presence and landscape modifications significantly increase fire incidence. Temperature, a
critical climate variable, has long been associated with wildfire behavior, as evidenced by
(Shmuel and Heifetz 2023), who found that higher temperatures correlate with increased
fire risk. However, our analysis suggests that lower temperatures may also contribute to
wildfire risk, aligning with (Abdollahi and Pradhan 2023; Iban and Sekertekin 2022), who
reported similar findings. However, (Tuyen et al. 2021) reported that temperature alone does
not directly affect the occurrence of wildfires at a regional scale. Population density, soil
depth, and wind speed were also found to be influential, emphasizing these factors’ role in
fire dynamics (Shahfahad et al. 2022). Less influential factors identified in our study were
elevation, land cover, distance from road, slope, aspect, and the electric network. While
these factors were less significant in our models, they are still relevant in specific con-
texts. For instance, elevation and slope have been highlighted by studies such as those by
(Thi Hang et al. 2024), which found that topography can moderately influence fire spread
patterns. Our findings, which indicate that land cover types and their flammability
characteristics have minimal impact, align with the results in (Abdollahi and Pradhan 2023),
whereas proximity to roads has been linked to increased fire ignitions (Tran et al. 2024).
Our analysis revealed significant findings regarding the interaction of different factors.
The SHAP dependence plots showed that Precipitation and Soil Moisture have the most influence
on Drought and NDVI, respectively, thus also giving insights into their combined effect on the
risk of wildfires. For instance, low precipitation in conjunction with high levels of drought
significantly raises the risk of wildfire. Additionally, the waterfall plot, as shown in
Fig. 15, further demonstrated the relative importance of various factors. Precipitation,
drought, and NDVI had the highest weights, underscoring their substantial role in predicting
wildfire risk. Conversely, factors such as wind speed, aspect, and electric networks showed
less influence, reflecting the need to prioritize resources based on the most impactful
factors. The force plot analysis, depicted in Fig. 14, shows how different features contribute
to shifting the model’s predictions. High soil depth, wind speed, and temperature increased
the prediction value, whereas high precipitation levels depressed it. This nuanced
understanding aligns with the work of (Haydar et al. 2024) and others who explored the
interaction between environmental factors and wildfire risk. By incorporating XAI techniques,
our study furthers both predictive accuracy and model interpretability, which are critical
elements in wildfire risk management. The approach addresses a gap in existing research by
giving a more detailed and actionable understanding of the factors that may drive wildfire
susceptibility, to better guide decision-making and management strategies.

Fig. 15 Waterfall plot that visualizes explanations for individual predictions

Management strategies based on the study’s findings can be effectively formulated to
mitigate wildfire risks. The most influential factors are precipitation and drought, followed
by soil moisture, NDVI, and temperature. For these factors, several proactive management
strategies can be considered, including fire breaks, vegetation planning, controlled burning,
and adaptation to climate change. Strengthening fire suppression and early warning systems
could also help reduce the increasing intensity of wildfires (Iban and Sekertekin 2022). SHAP
summary plots provided information regarding influential parameters for our model.
Precipitation, drought, soil moisture, and NDVI played a significant role in our model, whereas
slope, aspect, and electric network played a much lesser role. These wildfire factors could
form the basis of targeted management regimes through road construction, land-use planning,
and public awareness campaigns in areas deemed highly susceptible to wildfires. The results
also point toward the need for continued long-term climate monitoring and for adaptive
management that adjusts fire management actions dynamically in response to seasonal
and annual precipitation patterns. The strong interaction among soil moisture, NDVI, and fire
risk suggests that vegetation management via controlled burns and mechanical thinning
will continue to play a key role in keeping fire risk to a minimum in the area. However, the
effectiveness of these actions could change with climate change, so adaptive management
strategies will be required. Because human factors are moderately influential in the model,
there is an additional need for public education on fire safety and more stringent regulation
of wildland-urban interface zones. In addition, land-use planning should take into
consideration wildfire risk assessments that cover a wide range of environmental and
anthropogenic factors (Filkov et al. 2020). The analysis also highlighted specific
interactions between factors, such as the increased susceptibility under moderate temperature
and low wind speed. These findings can shape forest preservation tactics, directing efforts
towards susceptible zones during specific climatic conditions. Accounting for these complex
relationships between factors helps forest management authorities decide where resources and
efforts should be focused to prevent and control wildfires. In summary, the use of MLAs coupled
with XAI provides insight into wildfire risk management in NSW. Understanding the factors
that influence wildfire susceptibility and their interactions allows the development of
appropriate proactive strategies for safeguarding the natural environment, economic pursuits,
and community well-being in the area.
The current study acknowledges several limitations that warrant consideration. Firstly,
although XGBoost, RF, and AdaBoost are widely implemented models and can produce reliable
results, exploring more advanced deep learning models might enhance predictive accuracy.
Additionally, the study used 10-m resolution Sentinel-2 MSI fire point data, and future
research could benefit from finer 5-m resolution UAV fire point datasets for improved spatial
precision. The lack of proper field surveys during fire incidents may also have limited the
depth of understanding of fire dynamics. Moreover, implementing additional validation
measures beyond those used in this study, including the Threat Score (TS) and Matthews
Correlation Coefficient, might allow for a more complete assessment of the ML models’
performance. Finally, this study did not include influential factors such as the Temperature
Vegetation Dryness Index, Enhanced Vegetation Index (EVI), and Soil Moisture Index,
suggesting potential areas for further investigation in future research.

5 Conclusion

The present study aimed to develop a wildfire susceptibility map for the state of NSW using
RS techniques and four ML models (XGBoost, RF, AdaBoost, and SVM) and to compare their
performance. Fifteen influential geo-environmental factors were employed to model wildfire
susceptibility. Evaluation of the ML models using six performance criteria revealed F1 values
exceeding 0.8, indicating their strong predictive ability in identifying wildfire-prone areas.
In particular, the XGBoost model showed the highest F1 score, Kappa, accuracy, and
cross-validation accuracy, at 0.965, 0.937, 0.968, and 0.974, respectively. The “very high” and
“high” wildfire susceptibility zones are found mainly in the coastal forested regions of NSW,
grasslands, and agricultural lands. In contrast, the “very low” and “low” wildfire
susceptibility zones are mainly concentrated
in dry scrubland districts in the northern and western parts of the region. Additionally, the
SHAP framework was utilized to interpret ML model results, shedding light on factors
contributing to wildfire likelihood, with precipitation, drought, NDVI, and soil moisture
emerging as significant predictors. The electric network was the least important variable. In
summary, this research is expected to aid in developing a susceptibility map for the study
area, considering pertinent predictor factors. The findings will be helpful to the state
forest agency and other departments working on wildfire and forest management in reducing the
risk of fires. Further, this study may help develop effective planning and policies to prevent
losses arising from wildfires in the state. Furthermore, there is scope for extending such
efforts to conduct forest fire susceptibility studies nationally, thereby mitigating the
threat posed by forest fires to ecosystems and protected areas. This proactive approach holds
promise for safeguarding forest biodiversity, particularly in areas globally recognized as
biodiversity hotspots.
Author Contribution Rufai Yusuf Zakari: Conceptualization, Methodology, Writing– original draft. Owais
Ahmed Malik: Conceptualization, Project administration, Supervision, Writing– review & editing. Ong Wee
Hong: Supervision, Validation, Writing– review & editing.

Funding This work was supported by Universiti Brunei Darussalam under research grant number
UBD/RSCH/1.18/FICBF(a)/2022/006.

Declarations

Consent to participate The author(s) declared approval of participation.

Consent for publication The author(s) declared approval of publication.

Conflict of interest All authors declare that they have no conflict of interest.

References
Abatzoglou JT, Dobrowski SZ, Parks SA, Hegewisch KC (2018) TerraClimate, a high-resolution global
dataset of monthly climate and climatic water balance from 1958–2015. Sci Data 5. ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​/​1​0​.​
1​0​3​8​/​s​d​a​ta​ ​.​2​0​1​7​.​1​9​1​​​​
Abdollahi A, Pradhan B (2023) Explainable artificial intelligence (XAI) for interpreting the contributing fac-
tors feed into the wildfire susceptibility prediction model. Sci Total Environ 879. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​
1​6​/​j​.​​s​c​i​t​o​​t​e​n​v​.​2​​0​2​3​.​​1​6​3​0​0​4
Abujayyab SKM, Kassem MM, Khan AA, Wazirali R, Coşkun M, Taşoǧlu E, Öztürk A, Toprak F (2022)
Wildfire susceptibility mapping using five boosting machine learning algorithms: the case study of the
mediterranean region of Turkey. Genet Res 2022. https://doi.org/10.1155/2022/3959150
Achu AL, Thomas J, Aju CD, Gopinath G, Kumar S, Reghunath R (2021) Machine-learning modelling of
fire susceptibility in a forest-agriculture mosaic landscape of Southern India. Ecol Inf 64. ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​
g​/​1​0​.​1​0​1​6​/​j​.​e​co​ ​i​n​f​.​2​0​2​1​.​1​0​1​3​4​8​​​​
Ahmed IA, Talukdar S, Shahfahad, Parvez A, Rihan M, Baig MRI, Rahman A (2022) Flood susceptibility
modeling in the urban watershed of Guwahati using improved metaheuristic-based ensemble machine
learning algorithms. Geocarto Int 37(26). ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​8​0​/​1​0​​1​0​6​0​4​​9​.​2​0​2​2​​.​2​0​6​​6​2​0​0
Alqadhi S, Mallick J, Talukdar S, Bindajam AA, Saha TK, Ahmed M, Khan RA (2022) Combining logistic
regression-based hybrid optimized machine learning algorithms with sensitivity analysis to achieve
robust landslide susceptibility mapping. Geocarto Int 37(25). ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​8​0​/​1​0​​1​0​6​0​4​​9​.​2​0​2​1​​
.​2​0​2​​2​0​0​9

13
Natural Hazards (2025) 121:15331–15357 15355

Alqadhi S, Mallick J, Talukdar S, Alkahtani M (2023) An artificial intelligence-based assessment of soil ero-
sion probability indices and contributing factors in the Abha-Khamis watershed, Saudi Arabia. Front
Ecol Evol 11. https://doi.org/10.3389/fevo.2023.1189184
Beltrán-Marcos D, Calvo L, Fernández-Guisuraga JM, Fernández-García V, Suárez-Seoane S (2023) Wild-
land-urban interface typologies prone to high severity fires in Spain. Sci Total Environ 894. ​h​t​t​p​s​:​​/​/​d​o​i​​.​
o​r​g​/​1​​0​.​1​0​​1​6​/​j​.​​s​c​i​t​o​​t​e​n​v​.​2​​0​2​3​.​​1​6​5​0​0​0
Bretreger D, Hancock GR, Lowry J, Senanayake IP, Yeo IY (2024) The impacts of burn severity and frequency
on Erosion in Western Arnhem land, Australia. Sensors 24(7). https://doi.org/10.3390/s24072282
Cao Y, Wang M, Liu K (2017) Wildfire susceptibility assessment in Southern China: A comparison of mul-
tiple methods. Int J Disaster Risk Sci 8(2). https://doi.org/10.1007/s13753-017-0129-6
Deb P, Moradkhani H, Abbaszadeh P, Kiem AS, Engström J, Keellings D, Sharma A (2020) Causes of the
widespread 2019–2020 Australian bushfire season. Earth’s Future 8(11). ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​/​1​0​.​1​0​2​9​/​2​0​2​0​
E​F​0​0​1​6​7​1​​​​
Debebe B, Senbeta F, Teferi E, Diriba D, Teketay D (2023) Analysis of forest cover change and its drivers in
biodiversity hotspot areas of the semien mountains National park, Northwest Ethiopia. Sustain (Swit-
zerland) 15(4). https://doi.org/10.3390/su15043001
Filkov AI, Ngo T, Matthews S, Telfer S, Penman TD (2020) Impact of Australia’s catastrophic 2019/20 bush-
fire season on communities and environment. Retrospective analysis and current trends. J Saf Sci Resil
1(1). https://doi.org/10.1016/j.jnlssr.2020.06.009
Gill N, Dun O, Brennan-Horley C, Eriksen C (2015) Landscape preferences, amenity, and bushfire risk in
new South Wales, Australia. Environ Manage 56(3). https://doi.org/10.1007/s00267-015-0525-x
Gragnaniello D, Greco A, Sansone C, Vento B (2024) Fire and smoke detection from videos: A literature review
under a novel taxonomy. Expert Syst Appl 255:124783. https://doi.org/10.1016/J.ESWA.2024.124783
Haque MK, Azad MAK, Hossain MY, Ahmed T, Uddin M, Hossain MM (2021) Wildfire in Australia during
2019–2020, its impact on health, biodiversity and environment with some proposals for risk manage-
ment: A review. J Environ Prot 12(06). https://doi.org/10.4236/jep.2021.126024
Haydar M, Hossain Rafi A, Sadia H, Tanvir Hossain M (2024) Data driven forest fire susceptibility mapping
in Bangladesh. Ecol Ind 112264. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​1​6​/​J​.​​E​C​O​L​I​​N​D​.​2​0​2​​4​.​1​1​​2​2​6​4
Hong H, Tsangaratos P, Ilia I, Liu J, Zhu AX, Xu C (2018) Applying genetic algorithms to set the optimal
combination of forest fire related variables and model forest fire susceptibility based on data mining
models. The case of Dayu County, China. Sci Total Environ 630. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​1​6​/​j​.​​s​c​i​t​o​​t​e​n​v​.​2​​
0​1​8​.​​0​2​.​2​7​8
Hosseini M, Lim S (2022) Gene expression programming and data mining methods for bushfire susceptibil-
ity mapping in new South Wales, Australia. Nat Hazards 113(2). ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​/​1​0​.​1​0​0​7​/​s​1​1​0​6​9​-​0​2​2​-​0​
5​3​5​0​-​7​​​​
Iban MC, Sekertekin A (2022) Machine learning based wildfire susceptibility mapping using remotely sensed
fire data and GIS: A case study of Adana and Mersin provinces, Turkey. Ecol Inform 69. ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​
/​1​0​.​1​0​1​6​/​j.​​ec​ ​o​i​n​f​.​2​0​2​2​.​1​0​1​6​4​7​​​​
Jessie Yeung (2020 January 13) Australia wildfires: Here’s what you need to know about the deadly blazes.
CNN. ​h​t​t​p​s​:​​/​/​e​d​i​​t​i​o​n​.​c​​n​n​.​c​​o​m​/​2​0​​2​0​/​0​1​​/​0​1​/​a​u​​s​t​r​a​​l​i​a​/​a​​u​s​t​r​a​​l​i​a​-​f​i​​r​e​s​-​​e​x​p​l​a​​i​n​e​r​-​​i​n​t​l​-​h​​n​k​-​s​​c​l​i​/​i​n​d​e​x​.​h​t​m​l
John Roach (2020 January 8) Australia wildfire economic damages and losses to reach $110 billion. Accu-
Weather. ​h​t​t​p​s​:​​/​/​w​w​w​​.​a​c​c​u​w​​e​a​t​h​​e​r​.​c​o​​m​/​e​n​/​​b​u​s​i​n​e​​s​s​/​a​​u​s​t​r​a​​l​i​a​-​w​​i​l​d​f​i​r​​e​-​e​c​​o​n​o​m​i​​c​-​d​a​m​​a​g​e​s​-​a​​n​d​-​l​​o​s​s​e​s​​-​t​
o​-​r​​e​a​c​h​-​1​​1​0​-​b​​i​l​l​i​o​n​/​6​5​7​2​3​5
Laudari HK, Pariyar S, Maraseni T (2021) COVID-19 lockdown and the forestry sector: Insight from Gan-
daki province of Nepal. In For Policy Econ 131.https://doi.org/10.1016/j.forpol.2021.102556
Le H, Van, Hoang DA, Tran CT, Nguyen PQ, Tran VHT, Hoang ND, Amiri M, Ngo TPT, Nhu HV, Hoang
T, Van, Tien Bui D (2021) A new approach of deep neural computing for spatial prediction of wildfire
danger at tropical climate areas. Ecol Inform 63. https://doi.org/10.1016/j.ecoinf.2021.101300
Llorens R, Sobrino JA, Fernández C, Fernández-Alonso JM, Vega JA (2021) A methodology to estimate
forest fires burned areas and burn severity degrees using Sentinel-2 data. Int J Appl Earth Obs Geo-
inf 95. https://doi.org/10.1016/j.jag.2020.102243. Application to the October 2017 fires in the Iberian
Peninsula
Ma J, Cheng JCP, Jiang F, Gan VJL, Wang M, Zhai C (2020) Real-time detection of wildfire risk caused by
powerline vegetation faults using advanced machine learning techniques. Adv Eng Inform 44. ​h​t​t​p​s​:​/​/​d​
o​i​.​o​r​g​/​1​0​.​1​0​1​6​/​j​.​a​e​i​.​2​0​2​0​.​1​0​1​0​7​0​​​​
Moayedi H, Khasmakhi MASA (2023) Wildfire susceptibility mapping using two empowered machine learn-
ing algorithms. Stoch Env Res Risk Assess 37(1). https://doi.org/10.1007/s00477-022-02273-4
Moghim S, Mehrabi M (2024) Wildfire assessment using machine learning algorithms in different regions.
Fire Ecol 20(1). https://doi.org/10.1186/s42408-024-00335-2

13
15356 Natural Hazards (2025) 121:15331–15357

Nguyen HD, Azzi M, White S, Salter D, Trieu T, Morgan G, Rahman M, Watt S, Riley M, Chang LTC,
Barthelemy X, Fuchs D, Lieschke K, Nguyen H (2021) The summer 2019–2020 wildfires in East Coast
Australia and their impacts on air quality and health in new South Wales, Australia. Int J Environ Res
Public Health 18(7). https://doi.org/10.3390/ijerph18073538
Patel S, Dey A, Singh SK, Singh R, Singh HP (2021) Socio-economic impacts of climate change. In Climate
Impacts on Sustainable Natural Resource Management. https://doi.org/10.1002/9781119793403.ch12
Plaia A, Buscemi S, Fürnkranz J, Mencía EL (2022) Comparing boosting and bagging for decision trees of
rankings. J Classif 39(1). https://doi.org/10.1007/s00357-021-09397-2
Pourghasemi HR, Pouyan S, Bordbar M, Golkar F, Clague JJ (2023) Flood, landslides, forest fire, and earth-
quake susceptibility maps using machine learning techniques and their combination. Nat Hazards
116(3). https://doi.org/10.1007/s11069-023-05836-y
Pouyan S, Pourghasemi HR, Bordbar M, Rahmanian S, Clague JJ (2021) A multi-hazard map-based flood-
ing, gully erosion, forest fires, and earthquakes in Iran. Sci Rep 11(1). ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​/​1​0​.​1​0​3​8​/​s​4​1​5​9​8​-​0​
2​1​-​9​4​2​6​6​-​6​​​​
Ribeiro MT, Singh S, Guestrin C (2016) why should i trust you? explaining the predictions of any classifier.
NAACL-HLT 2016–2016 Conference of the North American Chapter of the Association for Computa-
tional Linguistics: Human Language Technologies, Proceedings of the Demonstrations Session. ​h​t​t​p​s​:​/​
/​d​o​i​.​o​r​g​/​1​0​.​1​8​6​5​3​/​v​1​/​n​1​6​-​3​0​2​0​​​​
Rihan M, Ali Bindajam A, Talukdar S, Shahfahad W, Naikoo M, Mallick J, Rahman A (2023) Forest fire sus-
ceptibility mapping with sensitivity and uncertainty analysis using machine learning and deep learning
algorithms. Adv Space Res 72(2). https://doi.org/10.1016/j.asr.2023.03.026
Sachdeva S, Bhatia T, Verma AK (2018) GIS-based evolutionary optimized gradient boosted decision trees
for forest fire susceptibility mapping. Nat Hazards 92(3). https://doi.org/10.1007/s11069-018-3256-5
Sgroi F (2020) Forest resources and sustainable tourism, a combination for the resilience of the landscape and
development of mountain areas. Sci Total Environ 736. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​1​6​/​j​.​​s​c​i​t​o​​t​e​n​v​.​2​​0​2​0​.​​1​3​9​5​3​9
Shahfahad, Talukdar S, Das T, Naikoo MW, Rihan M, Rahman A (2022) Forest fire susceptibility mapping by
integrating remote sensing and machine learning algorithms. In Advances in Remote Sensing for Forest
Monitoring. https://doi.org/10.1002/9781119788157.ch9
Shmuel A, Heifetz E (2022) Global wildfire susceptibility mapping based on machine learning models. For-
ests 13(7). https://doi.org/10.3390/f13071050
Shmuel A, Heifetz E (2023) Developing novel machine-learning-based fire weather indices. Mach Learning:
Sci Technol 4(1). https://doi.org/10.1088/2632-2153/acc008
Sohag K, Gainetdinova A, Mariev O (2023) Economic growth, institutional quality and deforestation: evi-
dence from Russia. For Policy Econ 150. https://doi.org/10.1016/j.forpol.2023.102949
Succarie A, Xu Z, Wang W (2022) The variation and trends of nitrogen cycling and nitrogen isotope com-
position in tree rings: the potential for fingerprinting climate extremes and bushfires. J Soils Sediments
22(9). https://doi.org/10.1007/s11368-022-03260-6
Sulova A, Arsanjani JJ (2021) Exploratory analysis of driving force of wildfires in Australia: an application of
machine learning within Google Earth engine. Remote Sens 13(1). https://doi.org/10.3390/rs13010010
Talukdar S, Shahfahad, Bera S, Naikoo MW, Ramana GV, Mallik S, Kumar PA, Rahman A (2024) Optimisa-
tion and interpretation of machine and deep learning models for improved water quality management in
Lake Loktak. J Environ Manage 351. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​1​6​/​j​.​​j​e​n​v​m​​a​n​.​2​0​2​​3​.​1​1​​9​8​6​6
TEGNA. (2020 January 10) Estimated 1.25 billion animals killed in Australian bushfires| 10tv.com. 10tv. ​h​t​
t​p​s​:​​/​/​w​w​w​​.​1​0​t​v​.​​c​o​m​/​​a​r​t​i​c​​l​e​/​n​e​​w​s​/​e​s​t​​i​m​a​t​​e​d​-​1​2​​5​-​b​i​l​​l​i​o​n​-​a​​n​i​m​a​​l​s​-​k​i​​l​l​e​d​-​​a​u​s​t​r​a​​l​i​a​n​​-​b​u​s​h​​f​i​r​e​s​​-​2​0​2​0​-​​j​a​n​/​​5​
3​0​-​a​​4​9​3​1​9​​7​0​-​0​2​4​​9​-​4​9​​2​1​-​9​d​0​8​-​d​1​b​e​f​d​9​6​2​e​a​b
Thi Hang H, Mallick J, Alqadhi S, Bindajam AA, Abdo HG (2024) Exploring forest fire susceptibility and
management strategies in Western Himalaya: integrating ensemble machine learning and explainable
AI for accurate prediction and comprehensive analysis. Environ Technol Innov 35:103655. ​h​t​t​p​s​:​/​/​d​o​i​.​
o​r​g​/​1​0​.​1​0​1​6​/​J​.​E​T​I​.​2​0​2​4​.​1​0​3​6​5​5​​​​
Tran TTK, Janizadeh S, Bateni SM, Jun C, Kim D, Trauernicht C, Rezaie F, Giambelluca TW, Panahi M
(2024) Improving the prediction of wildfire susceptibility on Hawaiʻi Island, Hawaiʻi, using explain-
able hybrid machine learning models. J Environ Manage 351. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​1​6​/​J​.​​J​E​N​V​M​​A​N​.​2​
0​2​​3​.​1​1​​9​7​2​4
Tuyen TT, Jaafari A, Yen HPH, Nguyen-Thoi T, Phong T, Van, Nguyen HD, Van Le H, Phuong TTM, Nguyen
SH, Prakash I, Pham BT (2021) Mapping forest fire susceptibility using spatially explicit ensemble
models based on the locally weighted learning algorithm. Ecol Inform 63. ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​/​1​0​.​1​0​1​6​/​j​.​e​c​
o​i​n​f​.​2​0​2​1​.​1​0​1​2​9​2​​​​
Wang T, Bian Y, Zhang Y, Hou X (2023) Classification of earthquakes, explosions and mining-induced earth-
quakes based on XGBoost algorithm. Comput Geosci 170. https://doi.org/10.1016/j.cageo.2022.105242

13
Natural Hazards (2025) 121:15331–15357 15357

Weber D, Moskwa E, Robinson GM, Bardsley DK, Arnold J, Davenport MA (2019) Are we ready for bush-
fire? Perceptions of residents, landowners and fire authorities on Lower Eyre Peninsula, South Austra-
lia. Geoforum, 107. ​h​t​t​p​s​:​​/​/​d​o​i​​.​o​r​g​/​1​​0​.​1​0​​1​6​/​j​.​​g​e​o​f​o​​r​u​m​.​2​0​​1​9​.​1​​0​.​0​0​6
Zhang Y, Lim S, Sharples JJ (2016) Modelling Spatial patterns of wildfire occurrence in South-Eastern Aus-
tralia. Geomatics Nat Hazards Risk 7(6). ​h​t​t​p​s​:​​​/​​/​d​o​​i​.​o​r​​g​/​​1​0​.​​1​0​​8​0​/​​1​9​4​7​5​​​7​0​5​.​2​​​0​1​6​.​​1​1​5​5​5​0​1
Zhang Q, Tian Y, Chen J, Zhang X, Qi Z (2024) To ensure the safety of storage: enhancing accuracy of fire
detection in warehouses with deep learning models. Process Saf Environ Prot. ​h​t​t​p​s​:​/​/​d​o​i​.​o​r​g​/​1​0​.​1​0​1​6​
/​J​.​P​S​E​P​.​2​0​2​4​.​0​7​.​0​8​6​​​​
Zhou J, Yu W, Wei W, Yang M, Du Y (2023) Provenance and tectonic evolution of bauxite deposits in the
Tethys: perspective from random forest and logistic regression analyses. Geochem Geophys Geosyst
24(6). https://doi.org/10.1029/2022GC010745

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a
publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manu-
script version of this article is solely governed by the terms of such publishing agreement and applicable law.

Authors and Affiliations

Rufai Yusuf Zakari1 · Owais Ahmed Malik1 · Ong Wee-Hong1

Owais Ahmed Malik
Rufai Yusuf Zakari
Ong Wee-Hong

1 School of Digital Science, Universiti Brunei Darussalam, Jalan Tungku Link,
Gadong BE1410, Brunei Darussalam
