Improving flood forecasting using time-distributed CNN-LSTM model

The document presents a novel Time-distributed CNN-LSTM (TD-CNN-LSTM) model for improving flood forecasting in small and medium basins by effectively capturing spatiotemporal hydrological features. Experimental results demonstrate that the TD-CNN-LSTM outperforms traditional models, achieving significant reductions in prediction errors and enhancing forecasting accuracy. This research addresses limitations of existing models by integrating deep learning techniques to better predict flood occurrence and peak intensity.

Earth Science Informatics

https://doi.org/10.1007/s12145-024-01354-y

RESEARCH

Improving flood forecasting using time-distributed CNN-LSTM model: a time-distributed spatiotemporal method
Haider Malik1,2,3 · Jun Feng1,2 · Pingping Shao1,2 · Zaid Ameen Abduljabbar4

Received: 20 March 2024 / Accepted: 30 May 2024

Abstract
The rapid and devastating nature of flood events in small and medium basins presents considerable challenges to flood forecasting. Developing a robust and accurate flood forecasting method is crucial to mitigating flood effects. To this end, we propose the Time-distributed CNN-LSTM model (TD-CNN-LSTM), a hybrid deep learning framework based on a Time Distribution layer (TD), Convolutional Neural Networks (CNNs), and Long-Short Term Memory Networks (LSTMs). The TD-CNN-LSTM model efficiently captures spatiotemporal hydrological features within spatial dimensions using the Time-distributed local Features extractor, while the LSTM focuses on extracting spatiotemporal features in temporal dimensions. Our model effectively captures complex spatial and temporal relationship patterns within hydrological data from flood time series. Experimental results on the Tunxi and Changhua basins show the superior predictive capabilities of our model, particularly in forecasting flood occurrence time and peak. Compared with the baseline models LSTM, CNN, ConvLSTM, STA-LSTM, and CNN-LSTM at the moment T + 9, TD-CNN-LSTM achieved decreases of 6.7%, 10.14%, 8.5%, 6.3%, and 6.6% in Root-Mean-Square Error (RMSE), decreases of 6.5%, 9.8%, 10.9%, 6.8%, and 10.6% in Mean Absolute Error (MAE), and decreases of 7.4%, 31.6%, 23.5%, 18.8%, and 27.8% in Mean Absolute Percentage Error (MAPE), respectively. In addition, the coefficient of determination (R²) increased by 3.6% and the Nash-Sutcliffe Efficiency (NSE) increased by 4.8%.

Keywords Flood forecasting · Deep learning · Time-distributed CNN-LSTM · Spatiotemporal feature learning

Communicated by Hassan Babaie.

* Jun Feng
[email protected]
Haider Malik
[email protected]
Pingping Shao
[email protected]
Zaid Ameen Abduljabbar
[email protected]

1 Key Laboratory of Water Big Data Technology of Ministry of Water Resources, Hohai University, Nanjing 211100, China
2 College of Computer Sciences and Software Engineering, Hohai University, Nanjing 211100, China
3 Department of Information Technology, Management Technical College, Southern Technical University, Basrah 61003, Iraq
4 College of Education for Pure Sciences, University of Basrah, Basrah 61004, Iraq

Introduction

In China, nearly 9000 rivers are classified as rivers in small and medium basins (Zhu et al. 2020). Due to climate factors such as heavy rainfall, frequent flood disasters in these basins cause casualties and property losses (Raj et al. 2021; Walczykiewicz and Skonieczna 2020). Therefore, it is crucial to mitigate the impact of floods by accurately predicting upcoming floods and quickly evacuating people and property in flood-prone areas before floods occur (Boutaghane et al. 2022; Flack et al. 2019). Although researchers worldwide have made significant progress in reducing the impact of floods, current flood prediction methods have the following limitations, especially in small and medium basins: (1) Current deep-learning models suffer from a lack of physical interpretability of the basin itself (Shao et al. 2024). (2) There are many different variables to consider in the flooding process, each with different spatial and temporal patterns (Cho et al. 2022). (3) For strong parameter fitting, supervised learning approaches require a significant amount of flood time-series data (Hamdoun et al. 2021).

Therefore, accurately predicting flood occurrence time and peak in small and medium basins is extremely challenging.

Methods for flood prediction mainly focus on two kinds of models: traditional hydrological models (Pierini et al. 2014; Zhou et al. 2022) and data-driven models (Ding et al. 2020; Lim et al. 2022). Traditional hydrological models simulate flood processes by analyzing various hydrological characteristics during the flooding process and using physical equations. Previously, they were widely used for flood prediction. However, because the unique features of each river's watershed necessitate a separate model for each river, these models lack adaptability and versatility (Cho et al. 2022). The advantage of data-driven models over traditional hydrological models has been discussed in numerous studies (Gharbia et al. 2022; Kan et al. 2016; Parisouj et al. 2022; Yang and Chui 2019). Data-driven models mainly establish mapping relationships by observing data and predicting specific hydrological variables (Ding et al. 2020). Unlike hydrological models, data-driven models can directly map the correlation relationships between input and output variables without describing the basin's physical geomorphology features (Lim et al. 2022). Complex methodologies of data-driven flood models have been developed with advances in computing power and algorithms. These include a range of machine learning techniques, such as support vector regression (SVR) (Bafitlhile and Li 2019; De Gregorio et al. 2018; Kisi 2015; Li et al. 2016) and other traditional machine learning techniques (Rahman et al. 2023; Zucco et al. 2015). Additionally, they include deep learning techniques, such as Recurrent Neural Networks (RNNs) (Chang et al. 2014; Chen et al. 2013), Long-Short Term Memory networks (LSTMs) (Cheng et al. 2015; de la Fuente et al. 2019; Ding et al. 2020; Gude et al. 2020; Hu et al. 2019; Liu et al. 2022; Moishin et al. 2021; Ruma et al. 2023; Won et al. 2022), and Convolutional Neural Networks (CNNs) (Shu et al. 2021; Wang et al. 2021). Compared with machine learning techniques, deep learning techniques often demonstrate superior performance and the capability to interpret complex patterns within hydrological time series data (Farahmand et al. 2023; Salman et al. 2015; Sankaranarayanan et al. 2020), which makes them well suited to the difficulties associated with flood prediction. For example, Mirzaei et al. (2021) introduced a novel Stacked Long Short-Term Memory (SLSTM) model for daily-runoff simulation. This study provides insights into applying machine learning technology in hydrological modeling for improved streamflow simulation accuracy. However, this model requires input samples containing thousands of timestamps. Therefore, a single timestamp may not offer sufficient discriminative information for accurately learning spatiotemporal hydrological features. Moreover, this model may struggle to effectively process the three-dimensional spatiotemporal information ingrained in hydrological data. Furthermore, Ghobadi and Kang (2022) showed that the TD–CNN–transformer combination can accurately predict long-term streamflow in the Karkheh River basin, which is northeast of the Persian Gulf. However, transformer-based methods suffer from quadratic computational complexity $O(n^2)$ (Li et al. 2024), which limits the ability of their attention mechanism to handle large-scale spatiotemporal information in hydrological data. In addition, Dehghani et al. (2023) examined the usefulness of three deep learning approaches (CNN, ConvLSTM, and LSTM) for the prediction of short-term streamflow in Malaysia's Muda and Kelantan river basins. Hourly streamflow and rainfall data were employed to assess the accuracy of the methods across 1, 3, and 6 h forecast horizons. However, these models struggle to capture spatiotemporal features as the prediction horizon widens, which reduces their accuracy in predicting future events. The main reason for this is that the confluence time (the delay between rainfall and downstream flow) is far away from the last hours of the model's prediction horizon.

Therefore, to overcome these limitations, we propose a flood forecasting hybrid deep learning framework based on a Time Distribution layer (TD), Convolutional Neural Networks (CNNs), and Long-Short Term Memory Networks (LSTMs). Streamflow and rainfall data from two Chinese basins, i.e., Changhua and Tunxi, are utilized to train and assess the performance of the proposed model. We compare and analyze each model's accuracy through statistical indicators widely used in hydrology and visual analysis of the comparison between forecasted and ground truth datasets. Compared with the baseline models, our proposed TD-CNN-LSTM performed better in most cases. Additionally, we conduct an initial analysis of the basin's spatiotemporal pattern weights to interpret our proposed model's physical validity. The main contributions of this paper are as follows:

1) Compared with Mirzaei et al. (2021), this study considers extracting local spatiotemporal features within short temporal dependencies of input timesteps (spatial dimensions) to facilitate the learning of global spatiotemporal features over long temporal dependencies (temporal dimensions) between successive input timestamps.
2) Compared to previous research (Ghobadi and Kang 2022; Ding et al. 2020), this study effectively sidesteps the computational bottleneck of quadratic complexity $O(n^2)$ associated with utilizing the attention mechanism, thereby enabling efficient handling of the large-scale spatiotemporal information in hydrological data.
3) Compared with Dehghani et al. (2023), this study precisely forecasts flood timing and peak intensity within a horizon of up to 9 hours using a deep CNN with LSTM, in which the layers of the CNN architecture are wrapped with a time-distributed technique.
4) Compared to some studies (Cho et al. 2022; Sankaranarayanan et al. 2020), this study considered the hydrological characteristics of the streamflow forecast area (near
the Tunxi or Changhua rivers) and the influence of the hydrological characteristics of the surrounding area (distant areas) on the streamflow forecast area. In addition, a verification scheme for interpreting the weights of the local and global factors' correlation patterns between the time steps of hydrological time series data is proposed.

The rest of this paper is organized as follows: Sect. 2 reviews related work. Section 3 describes the algorithms used in this study, including the CNN and LSTM, and the proposed TD-CNN-LSTM model. Section 4 summarizes the datasets, pre-processing, baseline and parameter settings, and evaluation metrics. Sections 5 and 6 present results and discussions, respectively. Finally, Sect. 7 presents the conclusions and future work.

Related work

In this section, we review relevant research that inspired the design of the TD-CNN-LSTM model. This mainly includes flood forecasting models and time-distributed technique-based methods.

Flood forecasting models

Generally, hydrological models, which consider physical laws in a simplified manner, can be classified into conceptual and physically-based models according to their description of the physical processes of a flood, e.g., confluence time (Ibrahim and Dan'azumi 2020). For example, Sm and Kp (2023) investigated the potential of employing a one-dimensional (1-D) hydrodynamic model, specifically the HEC-RAS model, to simulate the extent, depth, and velocity of flood inundation and to predict flood dangers. Grimaldi et al. (2019) predicted floodplain inundation using coupled hydrological-hydraulic models in the Murray Darling Basin (Australia). However, these models suffer from the following limitations: (1) The parameter estimation process of these models is relatively complex, relying on manually defined methods, which increases the uncertainty of the model's prediction (Dang and Anh 2023). (2) The input data of these models are challenging to construct, and, owing to nonlinearity, the prediction accuracy is low compared to data-driven models (Zhang et al. 2018). To circumvent these limitations, data-driven models have been developed for flood forecasting and time series analysis. These models are powered by machine learning (ML) and deep learning (DL) techniques. In terms of traditional ML, Dwiasnati and Devianto (2022) developed a flood prediction method by employing a support vector machine (SVM) model, integrating particle swarm optimization (PSO) to improve the accuracy of the SVM algorithm. Defontaine et al. (2023) constructed various machine learning algorithms, such as gradient boosting regressor and linear regression. They used hourly information from upstream stations to predict floods in Toulouse with a 6-hour lead time. However, these approaches still have limitations, such as the inability to achieve the highest levels of accuracy, which makes it necessary to look into alternative techniques. As an alternative, recent studies have successfully utilized DL techniques. For example, Feng et al. (2022) introduced a novel network termed the Graph Convolution Spatial-Temporal Attention LSTM (AGCLSTM) to solve time series prediction issues such as flood prediction. However, this model is unable to provide convergence assurance and struggles to understand sample distributions. Chen et al. (2022) presented the ConvLSTM approach for extracting spatiotemporal hydrological features and used the hydrological data from the Xi County stations in Henan Province in China. However, they studied small counties with relatively uniform rainfall and runoff patterns.

Therefore, it is essential to persist in investigating and developing deep learning methods explicitly designed for flood complexity in small and medium basins.

Time-distributed technique-based methods

Time-distributed layers have attracted significant attention in many fields due to their advantages in parallel processing. When integrated with deep neural networks, time-distributed layers can process multiple time steps simultaneously in parallel, which increases their ability to extract complex spatiotemporal features from input data. For example, Slimi et al. (2022) proposed a novel hybrid model (Time Distributed CNN + Transformer) for a speech emotion recognition system. Qiao et al. (2018) introduced a hybrid Time-distributed ConvLSTM model for machine health monitoring. This model achieved the best performance in both regression and classification prediction tasks. Montaha et al. (2022) proposed a hybrid Time-distributed CNN-LSTM model for brain tumor diagnostics, which performed well on experimental data.

Inspired by the above works, we propose a TD-CNN-LSTM model that extracts the local and global correlation patterns between time steps of flood time series data to accurately forecast flood time and peaks in small and medium basins. The next section presents the proposed model in more detail.

Materials and methods

Convolutional Neural Network (CNN)

CNNs were first proposed as a solution to computer vision problems. They were developed in the late 1980s by LeCun et al. (1989) for recognizing handwritten digits. When
processing sequential data like time series, the performance of 1D-CNN is noteworthy, as its application to multivariate time series data facilitates automatic extraction and learning of features (Garg and Krishnamurthi 2023). A standard 1D-CNN model comprises distinct layers: input, convolutional, and output layers. The convolutional layer operates on the concepts of weight sharing and sliding windows, serving as a pivotal element for feature extraction from the input data. With this layer, the kernel filter extracts features from input data. Then, the max pooling layer condenses the feature map size, thereby reducing inter-layer connections and enabling separate processing per feature map (Özdemir 2023). The mathematical expression of the convolution is given in formula (1) (Ji et al. 2012).

$$\gamma_{(ij)}^{xy} = \mathrm{ReLU}\left( \sum_{m} \sum_{h=0}^{H_i-1} \sum_{d=0}^{D_i-1} w_{ijm}^{hd}\, \gamma_{(i-1)m}^{(x+h)(y+d)} + b_{(ij)} \right) \tag{1}$$

where $\gamma_{(ij)}^{xy}$ represents the output value at position (x, y) in the jth feature map of the ith layer, ReLU is the activation function, $b_{(ij)}$ is the feature map bias, $w_{ijm}^{hd}$ is the kernel value at position (h, d) connected to the mth feature map of the previous layer, and $D_i$, $H_i$ are the width and height of the kernel, respectively.

Long short term memory (LSTM)

In their landmark work, Schmidhuber and Hochreiter (1997) introduced the concept of LSTM, a type of Recurrent Neural Network (RNN) designed to address the challenges of vanishing gradients and long-term temporal dependence. This innovation incorporated memory cells and gating mechanisms into the LSTM architecture, which enhanced the LSTM's ability to retain information for extended periods. Within an LSTM network, each cell has two distinct states: the hidden state $h_{(t)}$ and the cell state $C_t$. The LSTM network is organized around the forget ($f_t$), input ($i_t$), and output ($O_t$) gates. In addition, the sigmoid function (S) is used to begin the calculation by considering the input from the current time step $x_{(t)}$ and the prior hidden state value $h_{(t-1)}$. Fig. 1 shows the LSTM structure with the gate mechanism (red dotted rectangle). Details of the LSTM operations are described in formulas (2)–(7). W and b denote weight matrices and bias vectors, respectively, and $\odot$ indicates the element-wise product. The forget gate $f_t$ assesses the information that must be retained or forgotten within the cell state $C_t$. The input gate $i_t$ selects the new information that should be integrated into the cell state. Finally, the output gate $O_t$ determines which information to let out, based on the filtered input $O_t$ and the processed cell state $C_t$.

$$f_t = S\left(W_f \cdot \left[h_{(t-1)}, x_{(t)}\right] + b_f\right) \tag{2}$$

$$i_t = S\left(W_i \cdot \left[h_{(t-1)}, x_{(t)}\right] + b_i\right) \tag{3}$$

$$O_t = S\left(W_o \cdot \left[h_{(t-1)}, x_{(t)}\right] + b_o\right) \tag{4}$$

$$\hat{C}_t = S\left(W_c \cdot \left[h_{(t-1)}, x_{(t)}\right] + b_c\right) \tag{5}$$

$$C_t = \left(C_{(t-1)} \odot f_t\right) + \left(i_t \odot \hat{C}_t\right) \tag{6}$$

Fig. 1 LSTM structure shows the gates mechanism (red dotted rectangle) and internal processes applied to address the challenges of vanishing gradients and long-term temporal dependence. Legend: S, sigmoid activation function; T, tanh activation function; X, pointwise multiplication; +, pointwise addition. Diagram labels: input, previous hidden state, previous cell state, forget gate, input gate, cell update (T), output gate, new cell state, new hidden state.
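The gate updates of formulas (2)–(6), together with the hidden-state update of formula (7), can be traced numerically with a short plain-Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function and parameter names (`lstm_step`, `params`, `candidate_act`) are hypothetical, each weight matrix is assumed to act on the concatenation $[h_{(t-1)}, x_{(t)}]$, and the candidate state defaults to tanh, following the "Cell Update T" label in Fig. 1 and the common LSTM formulation, whereas formula (5) is printed with S. Passing `candidate_act=sigmoid` follows the printed formula literally.

```python
import math


def sigmoid(z):
    """Logistic function S used by the gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params, candidate_act=math.tanh):
    """One LSTM time step.

    params maps "f", "i", "o", "c" to (W, b) pairs; every W has shape
    hidden x (hidden + input) and acts on the concatenation [h_prev, x_t].
    candidate_act is an assumption (tanh by default, see the text above).
    """
    z = h_prev + x_t  # list concatenation [h_(t-1), x_(t)]
    f_t = [sigmoid(v) for v in affine(*params["f"][:1], z, params["f"][1])]      # Eq. (2)
    i_t = [sigmoid(v) for v in affine(*params["i"][:1], z, params["i"][1])]      # Eq. (3)
    o_t = [sigmoid(v) for v in affine(*params["o"][:1], z, params["o"][1])]      # Eq. (4)
    c_hat = [candidate_act(v) for v in affine(*params["c"][:1], z, params["c"][1])]  # Eq. (5)
    c_t = [c * f + i * ch for c, f, i, ch in zip(c_prev, f_t, i_t, c_hat)]       # Eq. (6)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                           # Eq. (7)
    return h_t, c_t


# Toy check with hidden size 1, input size 1, and all-zero weights:
# every gate equals S(0) = 0.5 and the candidate equals tanh(0) = 0.
zero = ([[0.0, 0.0]], [0.0])
params = {gate: zero for gate in "fioc"}
h_t, c_t = lstm_step([0.3], [0.0], [2.0], params)
print(h_t, c_t)
```

With zero weights the forget gate halves the previous cell state and nothing new is written, which makes the roles of $f_t$ and $i_t$ in formula (6) easy to see before moving to trained weights.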
$$h_t = O_t \odot T\left(C_t\right) \tag{7}$$

The proposed hybrid model

In this section, a time-distributed CNN-LSTM (TD-CNN-LSTM) is presented for short-term flood prediction. This model merges the CNN and LSTM architectures, with a time-distributed layer wrapping the CNN layers to enable parallelized processing of hydrological time series data. Figure 2 depicts the schematic representation of this model. This framework focuses on learning features, considering spatial and temporal information across time-distributed layers in spatial dimensions. Even though the basic CNN-LSTM can directly analyze flood time series data to capture the relationship patterns of spatiotemporal features, it is challenging to accurately learn long temporal dependency features from input flood time series data since they always contain thousands of timestamps. Therefore, initially extracting time-distributed local features of the input data in the spatial dimension before extracting holistic features in the temporal dimension can make it easier to learn global features between timestamps of hydrological time series data. This process involves feeding the hydrological time series data into the time-distributed layer, which applies the CNN layers to every temporal slice of the input to extract local features in the spatial dimension. Then, the LSTM model extracts global features of the basin's characteristics in the temporal dimension, as shown in Fig. 3. By combining these different features, the proposed model can better understand the hydrological characteristics, which helps us predict floods accurately.
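The data flow described above, local features per temporal slice followed by a model over the sequence of slices, can be sketched in plain Python. This is a simplified illustration and not the authors' code: the helper names (`split_subsequences`, `time_distributed`, `conv1d_relu`), the univariate made-up sample, and the first-difference kernel are all assumptions chosen for readability. The split of a 12-step input into two 6-step subsequences mirrors the setting reported in the experiments.

```python
def conv1d_relu(window, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) over one subsequence, then ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(w * x for w, x in zip(kernel, window[s:s + k])) + bias)
        for s in range(len(window) - k + 1)
    ]


def split_subsequences(sample, n):
    """Split one input sample into n equal, consecutive subsequences."""
    if len(sample) % n:
        raise ValueError("sample length must be divisible by n")
    size = len(sample) // n
    return [sample[i * size:(i + 1) * size] for i in range(n)]


def time_distributed(layer, subsequences):
    """Apply the same layer (shared weights) to every subsequence independently."""
    return [layer(sub) for sub in subsequences]


# 12 hourly streamflow values (made-up numbers), split into 2 subsequences of 6.
sample = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0, 29.0, 37.0, 46.0, 56.0, 67.0]
subs = split_subsequences(sample, n=2)

kernel = [-1.0, 1.0]  # hypothetical kernel: hour-to-hour rise
local = time_distributed(lambda s: conv1d_relu(s, kernel), subs)

# Max pooling condenses each subsequence's feature map to one value; the
# resulting sequence (one entry per subsequence) is what a recurrent stage
# such as the LSTM would consume step by step.
pooled = [max(features) for features in local]
print(local)
print(pooled)
```

Because `time_distributed` reuses one `layer` for every subsequence, the local extractor has the same weights at each position, and the subsequences could be processed in parallel; the sequential dependence is left entirely to the stage that follows.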
Fig. 2 The framework of the proposed TD-CNN-LSTM hybrid model. Diagram labels: input data (rainfall, streamflow); (1) time distributed layer with local feature extraction over Sequence 1, Sequence 2, ..., Sequence k; (2) global features extraction; output layer (1×n).

Fig. 3 The architecture of spatial and temporal dimension in hydrological time series data. Diagram labels: flow station, rainfall station, N time steps, features in spatial dimension, features in temporal dimension, time.

Time-distributed local feature extraction

Time-distributed local feature extraction is an essential component of our proposed TD-CNN-LSTM model, inspired by Qiao et al. (2018). This approach involves wrapping CNN layers with a time-distributed technique, called TD-CNN, which enables simultaneous parallel processing of the input time steps of hydrological data. By implementing this technique, CNN layers can operate simultaneously, processing each subsequence of time steps independently and concurrently, as shown in Fig. 4. The input data is divided into N local subsequences represented as $X = \left[S_{T_1}, S_{T_2}, \ldots, S_{T_N}\right]$, with each subsequence denoted as $S_{T_i} = \left\{x_{T_i}^{1}, x_{T_i}^{2}, \ldots, x_{T_i}^{k}\right\}$, where $x_{T_i}^{k}$ is the kth timestep in the subsequence. Since the TD-CNN model extracts local features uniformly across its
structure, we focus on analyzing individual local feature extractors within each input subsequence. Thereby, $S_{T_i}$ is divided into k slices, and each slice $f_p$ ($p = 1, 2, \ldots, k$) serves as input for the CNN layers. In the TD-CNN architecture, two 1D convolution operations are applied to each slice $f_p$ simultaneously using a time-distributed layer. These two convolution layers use ReLU activation functions to enhance the information extracted. This method extracts relevant local spatiotemporal correlation features from flood time series data in spatial dimensions. The features are then passed through a max pooling layer to condense them across sequences. After that, these structures are flattened into a one-dimensional vector, serving as input time steps of the LSTM layers (step 2). Furthermore, integrating the time-distributed layer into the CNN model introduces an additional dimension to the input shape. This allows the CNN model to process multiple subsequences as a single input. In our experiment, the input dimension is represented as (s, n, t, f), where s denotes the number of samples, n is the number of subsequences, t signifies the total time steps in each subsequence, and f is the total number of features.

Fig. 4 Structure of Time Distributed-CNN model. Diagram labels: one CNN per time step (t = 1, t = 2, ..., t = k) producing local features.

Global features extraction

The global feature extractor follows the local feature extraction method outlined in step (1). The N time steps containing local feature sequences serve as input for global feature extraction in step (2). Global feature extraction uses two vertically stacked LSTM layers to extract global features in the temporal dimension of hydrological data. Each layer specializes in extracting features at different time steps. These layers use ReLU activation functions to enhance the information extracted from hydrological input data, as shown in Fig. 2. The 3-dimensional input shape used by the LSTM model is represented by the notation (s, t, f), where s represents the number of samples, t denotes the number of timestamps, and f indicates the number of features. Each input sample must precisely match the shape [number of timestamps, number of features]. The output of the first LSTM layer is then fed into the input of the second layer above it, resulting in the extraction of final global spatial and temporal features. To accomplish this, activating the return sequences parameter on the first LSTM layer is essential, because it allows the full output sequence of the first layer to be transferred to the next layer. Finally, the final prediction p is generated by the output layer without any activation function applied.

Interpretation

The principle of the TD-CNN-LSTM model and the model architecture were presented above. In this section, a verification scheme is proposed to interpret the weights of the local and global features of the model.

Figure 5 shows the characterization of specific basins and data structures. Figure 5(a) illustrates spatial grids in an abstract form, while Fig. 5(b) represents the input of hydrological factors. Each factor is defined as a 3D vector encompassing hydrological features, position features, and time features. There are significant location differences in terms of the informativeness of input hydrological factors. For example, areas near the Tunxi or Changhua rivers should hold greater importance than distant areas, given that rainfall near rivers tends to gather, quickly increasing runoff values. Usually, there is a delay (confluence time) between rainfall and downstream flow. Utilizing the average confluence time for the entire basin streamlines calculations in small or medium-sized basins.

Fig. 5 The figure's left side (a) shows an abstracted form of basins with spatial grids. The input hydrological factors are shown on the right (b). Diagram labels: x, y, flow station, rainfall station, features, time step, flow, position.

Theoretically, the weights associated with the local and global features between time steps are expected to change corresponding to input data changes of rainfall
and streamflow. For specific basins, the variations in the weights of local and global correlation patterns should align with those of the confluence process. Therefore, the confluence process of rainfall and streamflow rate relies on weights of local and global correlation patterns within input data as follows:

$$q\left(x_d, y_d, t+1\right) \propto q\left(x_d, y_d, t+a\right)\Big|_{a=0}^{n-1} \;\&\; p\left(x, y, (t-\Delta t)-b\right)\Big|_{b=0}^{n-\Delta t} \tag{8}$$

The time step 1 (T + 1) denotes the downstream flow $q\left(x_d, y_d, t+1\right)$, which is associated with the prediction value $p\left(x, y, (t-\Delta t)-b\right)\big|_{b=0}^{n-\Delta t}$ and the ground truth $q\left(x_d, y_d, t+a\right)\big|_{a=0}^{n-1}$, as shown in Eq. (8). The focus of short- and long-term correlation patterns should be interpreted over time, while $\Delta t$ is associated with the confluence process time $t_c$. The TD-CNN model extracts spatial correlation patterns at the specific downstream flow $q\left(x_d, y_d, t+1\right)$, while the LSTM model extracts temporal correlation patterns in past moments, as seen in Fig. 6.

Fig. 6 Description of $\Delta t$ and $t_c$. The X-coordinate in Figure (a) represents time, while the Y-coordinate represents streamflow forecasting. The X-coordinate in Figure (b) represents time, while the Y-coordinate represents streamflow forecasting. Diagram labels: streamflow, rainfall, global temporal focus, Δ.

Experiments

Datasets

The TD-CNN-LSTM model for accurate flood prediction was tested in the Tunxi and Changhua basins. The input data for this model comprises streamflow and rainfall values for each basin over 12 h (including the current hour), while the output is the hourly streamflow value at the watershed outlet for each of the next 9 h.

The Tunxi watershed is located in Anhui province in the eastern region of China. Tunxi contains one streamflow station and eleven rainfall stations, as shown in Fig. 7a. The dataset's period is from June 1981 to July 2003. The Changhua watershed is located in the northwest mountain region of Zhejiang province of China. Changhua contains seven rainfall stations and one streamflow station, as shown in Fig. 7b. The dataset's period is from April 1998 to July 2010. Additional details regarding the datasets can be found in Table 1.

Fig. 7 The Tunxi Basin is shown on the left side (a), while the Changhua Basin is shown on the right side (b)

Data pre-processing

Enhancing the data's quality is crucial to increasing the model's accuracy and robustness. Our experiment
drew datasets from two distinct sources with various parameter units and scales. These dissimilarities might impact the learning performance of the TD-CNN-LSTM model. Therefore, the Min-Max normalization method was adopted. The following equation converts an actual value into a normalized value:

$$X'_{norm} = \frac{X - \min(x)}{\max(x) - \min(x)} \tag{9}$$

where the normalized value is represented by $X'_{norm}$, whereas the actual value is represented by X. The minimum and maximum values of x are denoted by min(x) and max(x), respectively.

Table 1 Dataset's Characteristics

Basin      Total    Training  Testing  Mean    Variance  Standard deviation  Median  Min   Max   Features              Time
Tunxi      43,435   39,092    4343     224.53  207,511   455.534             84.25   6.28  6490  Streamflow, Rainfall  1 h
Changhua   9371     8434      937      146.65  41006.57  202.50              80.32   0.57  2100  Streamflow, Rainfall  1 h

(Total, Training, and Testing give the dataset splitting strategy.)

Baselines and parameter settings

In our experiment, we compared and analyzed five models that have previously been used in the area of flood prediction:

LSTM (Won et al. 2022): This study introduced an LSTM model that stacks LSTM layers in sequence to capture intricate patterns, thereby improving prediction accuracy and enabling it to address complex long-term relationships in urban floods.

CNN (Wang et al. 2021): Through the recognition of spatial patterns, convolutional layers are employed to capture crucial features from the hydraulic time series data, facilitating efficient feature learning and enhancing the precision of the flood prediction model.

ConvLSTM (Moishin et al. 2021): A hybrid deep learning algorithm (ConvLSTM) that integrates the advantages of convolutional and LSTM networks to interpret complex spatiotemporal patterns in sequential data and forecast future flood events.

STA-LSTM (Ding et al. 2020): By adding an attention mechanism within the LSTM architecture, the STA-LSTM model enhances its understanding of complex spatial and temporal associations in flood predictions.

CNN-LSTM (Pareek et al. 2023): A hybrid model based on merging CNN and LSTM to enhance the precision of flood forecasting. The CNN model extracts flood features effectively, while the LSTM model captures longer-term historical flood features accurately.

Parameter settings of TD-CNN-LSTM: Compared with CNN-LSTM, the proposed model performs better at capturing spatiotemporal information of hydrological data. The input multivariate flood data was split into input/output samples using the sliding window method, with 12 time steps as the input and 9 time steps as the output. Then, each input sample was divided into two subsamples of 6 time steps each by a time-distributed layer. For all the datasets, 90% were used for training and 10% for testing. The hyperparameter settings can be fine-tuned in various ways for the TD-CNN-LSTM model. In our experiment, the models were trained with various hyperparameter combinations, and these combinations were manually adjusted until the most successful set of hyperparameters was found (Liu et al. 2022). The learning rate was set to 0.001. The Adam optimizer was used and the batch size was set to 90. For step 1, the filters of the two CNN convolution layers were set to 64, the kernel size and max pooling size were set to (1,1), and a ReLU activation function was used. For step 2, two LSTM layers were stacked, each with 100 units and a ReLU activation function. In addition, the early stopping technique was used during neural network training. This technique reduces the need for computational resources and long training time by ending the training process as soon as it is apparent that the model is no longer improving on the validation dataset (Vilares Ferro et al. 2023).

Evaluation metrics

In this study, the accuracy and robustness of the TD-CNN-LSTM model in terms of flood peak and streamflow were evaluated using several evaluation indicators, including Root Mean Square Error (RMSE), Determination Coefficient ($R^2$), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Nash-Sutcliffe Efficiency (NSE), as follows:

$$\mathrm{RMSE} = \sqrt{\sum_{i=1}^{n} \frac{\left(p_i - a_i\right)^2}{n}} \tag{10}$$
$$R^2 = \frac{\left[\sum_{i=1}^{n} (p_i - \bar{p})(a_i - \bar{a})\right]^2}{\sum_{i=1}^{n} (p_i - \bar{p})^2 \sum_{i=1}^{n} (a_i - \bar{a})^2} \tag{11}$$

$$MAE = \frac{\sum_{i=1}^{n} |p_i - a_i|}{n} \tag{12}$$

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \frac{|p_i - a_i|}{a_i} \times 100\% \tag{13}$$

$$NSE = 1 - \left[\frac{\sum_{i=1}^{n} (p_i - a_i)^2}{\sum_{i=1}^{n} (a_i - \bar{a})^2}\right] \tag{14}$$

where $p_i$ denotes the streamflow value forecast by the evaluated model (LSTM, CNN, ConvLSTM, STA-LSTM, CNN-LSTM, or TD-CNN-LSTM), while $a_i$ is the actual streamflow. $\bar{a}$ and $\bar{p}$ denote the averages of the actual and forecast values, respectively, and $n$ denotes the total number of test values. R² measures the strength of the relationship between forecasted and actual streamflow; a positive value close to 1 reflects a strong correlation. The NSE compares the residual variance with the variance of the actual streamflow. A value of 1 indicates perfect model performance, and values closer to 1 indicate better performance. The RMSE, MAE, and MAPE metrics range from 0 to ∞, and their optimal value is 0.

Results

Model performance comparison

Table 2 Average performance of the LSTM, CNN, ConvLSTM, STA-LSTM, CNN-LSTM, and TD-CNN-LSTM models on the evaluation metrics, assessing forecasted hourly streamflow compared to observed hourly streamflow at the Tunxi and Changhua basins

| Models | Tunxi RMSE | Tunxi R² | Tunxi MAE | Tunxi NSE | Tunxi MAPE | Changhua RMSE | Changhua R² | Changhua MAE | Changhua NSE | Changhua MAPE |
|---|---|---|---|---|---|---|---|---|---|---|
| LSTM | 80.44 | 0.94 | 27.79 | 0.94 | 0.19 | 48.56 | 0.84 | 23.09 | 0.82 | 0.27 |
| CNN | 77.26 | 0.94 | 21.12 | 0.94 | 0.10 | 49.44 | 0.82 | 22.38 | 0.84 | 0.25 |
| ConvLSTM | 76.22 | 0.94 | 22.71 | 0.94 | 0.12 | 48.16 | 0.83 | 23.38 | 0.85 | 0.23 |
| STA-LSTM | 78.02 | 0.94 | 24.53 | 0.94 | 0.14 | 46.64 | 0.84 | 23.24 | 0.84 | 0.25 |
| CNN-LSTM | 75.40 | 0.95 | 24.52 | 0.95 | 0.16 | 46.57 | 0.85 | 23.80 | 0.85 | 0.27 |
| TD-CNN-LSTM | 68.29 | | 20.29 | | 0.09 | 45.21 | 0.85 | 20.14 | 0.84 | 0.23 |

This study compared the accuracy and capability of the LSTM, CNN, ConvLSTM, STA-LSTM, CNN-LSTM, and TD-CNN-LSTM models for hourly streamflow prediction across the Tunxi and Changhua basins. Table 2 presents the average values of the statistical metrics for the 9-hour streamflow prediction task, comparing the streamflow predicted by these models against the streamflow observed at the Tunxi and Changhua stations. On all datasets, our TD-CNN-LSTM model performs better than the baselines in most cases. Specifically, at the Tunxi station, the TD-CNN-LSTM model shows significant improvements in RMSE and MAE compared to LSTM, CNN, ConvLSTM, STA-LSTM, and CNN-LSTM. The decreases in RMSE range from 9 to 15.1%, while the decreases in MAE range from 4 to 26.9%. Similarly, at the Changhua station, the TD-CNN-LSTM model achieves lower RMSE and MAE values than the baseline models. The decreases in RMSE range from 2.9 to 8.6%, while the decreases in MAE range from 10 to 15.4%. These findings highlight the TD-CNN-LSTM model’s superior

predictive capabilities in forecasting watershed streamflow. This model’s effectiveness can be attributed to its use of a time-distributed layer, which allows the CNN component to process time steps in parallel and reduces input data noise. As a result, the TD-CNN model converts input data into shorter sequences, enhancing the LSTM’s ability to learn long temporal dependencies.

Figure 8 displays the RMSE results of the models’ predictions, with lighter shades indicating lower RMSE values and higher accuracy in predicting streamflow. Across the Tunxi data, the TD-CNN-LSTM model consistently shows superior accuracy from 2 to 9 h. Similarly, the TD-CNN-LSTM achieves the highest accuracy across most moments in the Changhua data. Additionally, the CNN-LSTM model performed better than CNN and LSTM, highlighting that hybrid models are more precise and robust than the benchmark models. All models performed well at T + 1, but their errors steadily increased to varying degrees as time passed. Figure 9 displays the R² of the streamflow predicted from 1 to 9 h by the six models across the Tunxi and Changhua basins. For Tunxi, compared with all models at T + 9, the R² value of TD-CNN-LSTM increased by up to 3.6%; at the same time, the TD-CNN-LSTM model showed performance equivalent to the other models on the Changhua data. This is because the Tunxi data allows the TD-CNN-LSTM to perform better due to the larger size of the dataset. Therefore, the proposed model can perform well in short-term forecast tasks, especially with larger datasets.

Fig. 8 RMSE heat map of prediction from different models across Tunxi (a) and Changhua (b) basins

Fig. 9 R² heat map of prediction from different models across Tunxi (a) and Changhua (b) basins

Based on the results presented in Figs. 10 and 11, the TD-CNN-LSTM model, which employs time-distributed layers, exhibits impressive performance across the Tunxi and Changhua basins. The MAE and MAPE metrics measure the error between the forecasted value and the ground truth value. Smaller values of MAPE and MAE indicate a smaller error between the predicted and actual values, which suggests that the proposed model is more accurate and robust. Figure 12 visualizes the NSE values of the different models across the Tunxi and Changhua basins for the various prediction horizons (1 to 9 h). NSE values closer to 1 indicate higher accuracy in predicting streamflow. In the Tunxi basin, as the lighter shades of blue indicate, the TD-CNN-LSTM model consistently achieves high NSE values across all prediction horizons. Compared with all models at T + 9, the NSE value of TD-CNN-LSTM increased by up to 4.8%, which shows that it outperforms the other models in accurately predicting streamflow. In the Changhua basin, the TD-CNN-LSTM model does not consistently outperform other models in
terms of average performance, as indicated by the darker shades of blue. Based on this, the TD-CNN-LSTM model may not perform as well in this basin as in the other. However, the TD-CNN-LSTM model still shows relatively high NSE values for specific prediction horizons, such as 1 and 2 h. Therefore, in the Changhua basin, the TD-CNN-LSTM model can perform well in some instances, but it may not consistently surpass the other models in terms of average performance across all prediction horizons. The reason is that the TD-CNN-LSTM model is more effective with the Tunxi data, which has a larger data size.

Fig. 10 MAE heat map of prediction from different models across Tunxi (a) and Changhua (b) basins

Fig. 11 MAPE heat map of prediction from different models across Tunxi (a) and Changhua (b) basins

Fig. 12 NSE heat map of prediction from different models across Tunxi (a) and Changhua (b) basins

Robustness analysis

The model based on time-distributed technology effectively utilizes prior information in both the spatial and temporal dimensions. Theoretically, it has a more robust feature extraction ability for flood time series data than other flood prediction models. In this section, we compare the performance of different deep learning models, including the TD-CNN-LSTM flood prediction model, under different time conditions. Specifically, we evaluate the performance and robustness of these models under different predicted
horizons. To this end, we analyze a specific flood event from the test data (Tunxi) to examine the model’s robustness in matching the forecasted runoff to the ground-truth runoff during the peak of flood events. Figure 13 shows the comparison of the forecasted and ground truth values at moment T + 3. Notably, the TD-CNN-LSTM model provides a better fit between the forecasted and ground truth values. Compared with LSTM, CNN, ConvLSTM, STA-LSTM, and CNN-LSTM, the RMSE value of TD-CNN-LSTM decreased by 28.8%, 23.2%, 13%, 14.4%, and 25.8%, respectively. Furthermore, while the proposed model’s MAE and MAPE were equivalent to those of the other models, the TD-CNN-LSTM model still gives more accurate and robust forecasts of the flood peak’s occurrence time. This enhanced performance may be because the model can precisely extract complex spatial and temporal correlation patterns between input hydrological time steps.

Fig. 13 Comparison at T + 3, considering flood peak, comparing the observed and forecasted streamflow rates calculated using the TD-CNN-LSTM and baseline models with the Tunxi dataset

Figure 14 shows the comparison of the forecasted and ground truth values at moment T + 6. At moment T + 6 and the following moments, the forecasted values for the flood peak show some fluctuation, demonstrating the TD-CNN-LSTM model’s attempt to simulate the behavior of the flood peak at this point. Taking the numerical value performance of RMSE, MAE, and MAPE, compared to LSTM, CNN, ConvLSTM, STA-LSTM, and CNN-LSTM, the RMSE value of TD-CNN-LSTM decreased by 15.6%, 11.8%, 13.8%, 15%, and 9.9%, respectively; the MAE value decreased by 34.9%, 10.2%, 14.9%, 21.9%, and 27%, respectively; and the MAPE value decreased by 61.9%, 11.1%, 27.3%, 33.3%, and 55.6%, respectively. This reflects the model’s strong ability to extract hydrological features and accurately map them to confluence time.

Figure 15 shows the comparison of the forecasted and ground truth values at moment T + 9. Considering the numerical performance metrics of RMSE, MAE, and MAPE relative to LSTM, CNN, ConvLSTM, STA-LSTM, and CNN-LSTM, the RMSE value of TD-CNN-LSTM decreased by 6.7%, 10.14%, 8.5%, 6.3%, and 6.6%, respectively. Similarly, the MAE value of TD-CNN-LSTM decreased by 6.5%, 9.8%, 10.9%, 6.8%, and 10.6%, respectively, while the MAPE value of TD-CNN-LSTM decreased by 7.4%, 31.6%, 23.5%, 18.8%, and 27.8%, respectively. At moment T + 9, our model forecasts flood peak timings more precisely than the other models. Our analysis of flood peaks at this time indicates that our proposed model has effectively forecasted the horizon’s extent
in contrast to other models, maintaining superiority in accurately predicting when flood peaks occur.

Fig. 14 Comparison at T + 6, considering flood peak, comparing the observed and forecasted flow rates calculated using the TD-CNN-LSTM and baseline models with the Tunxi dataset

The scatter density distribution plot in Fig. 16 compares the runoff forecasted by the TD-CNN-LSTM model with the ground truth under 1–9 h forecast horizons across the Tunxi data. The model displayed satisfactory performance with an R² value close to 1, indicating good accuracy. In Fig. 16a-h, the forecasted and ground truth runoff are well aligned, while Fig. 16i shows a lower R² value and a more erratic distribution of forecasted runoff, indicating a decline in the model’s ability to make accurate predictions over extended horizons. However, compared with LSTM, CNN, ConvLSTM, STA-LSTM, and CNN-LSTM, the R² value of TD-CNN-LSTM increased by 3.6%. This can be attributed to the proposed model’s use of a time-distributed layer, which enhances the model’s robustness in hydrological prediction tasks. Figure 17 shows histograms of residuals for the different flood forecasting models on the Tunxi data at T + 9. The residuals are the differences between observed and predicted values, and the histograms visualize each model’s residual distribution. A model with a narrow distribution centered around zero indicates accurate predictions, as most of the residuals are close to zero. Conversely, a model with a wider distribution, or with more residuals farther from zero, indicates less accurate predictions and larger errors. The TD-CNN-LSTM model stands out because its distribution is relatively narrow and centered around zero, indicating accurate predictions with minimal errors. This suggests that the TD-CNN-LSTM model forecasts flood events at T + 9 more accurately than the other models. The key performance metrics (NSE, MAE, RMSE, and MAPE) are shown alongside each histogram. The TD-CNN-LSTM model has the highest NSE value among all models. Additionally, it has low MAE, RMSE, and MAPE values compared to the other models, indicating better overall performance in terms of error metrics.

Error assessment

Although the models have acceptable accuracy, our testing has revealed errors that exhibit a clear trend of temporal variation. This variation can be attributed to several factors, including the limitations of the forecasting architecture and the incompleteness of input
data. In this section, we consider and discuss the models’ performance on various datasets at different times, drawing the following conclusions:

1) A distinct temporal error pattern can be noticed in each model. This indicates the inadequacy of intrinsic input features over time, resulting in a gradual loss of accuracy.
2) It is challenging to analyze the processes that cause complex floods. Our experiments only considered important streamflow and rainfall variables as inputs, leading to incomplete data. It is possible that the errors occurred due to incomplete input information.
3) Metrics such as MAPE may provide an average assessment of model performance but may not capture all aspects of predictive accuracy. Therefore, a more thorough evaluation should incorporate multiple indicators.
4) The size of the training dataset significantly impacts the performance of models. The Tunxi training dataset is the largest one used, so models trained on it can make better predictions. Therefore, ensuring an extensive training dataset is crucial for achieving accurate predictions.

Fig. 15 Comparison at T + 9, considering flood peak, comparing the observed and forecasted flow rates calculated using the TD-CNN-LSTM and baseline models with the Tunxi dataset

Model interpretation

In earlier subsections, we considered the performance of various models and comprehensively analyzed the proposed model’s ability to forecast flood peak occurrences based on the experiment details. We conclude that the TD-CNN-LSTM model offers the best solution among those tested for achieving accurate flood forecasting. In this section, we analyze the interpretability of our proposed model based on time-distributed technology. In addition, the weights of the spatiotemporal correlation components of our proposed model are visualized, with the main focus on their time-varying trend. We adopted a streamflow prediction model based on the TD-CNN-LSTM model. Local and global spatiotemporal correlation features were extracted from the spatial and temporal dimensions of the hydrological input data. This study used a time-distributed CNN model to focus on the specific area that played a significant role in streamflow prediction. Additionally, the LSTM model was used to capture long-term temporal dependencies from historical hydrological data.
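Point 3 of the error assessment recommends reporting several indicators together. As a minimal sketch of how the indicators of Eqs. (10)–(14) can be computed side by side, the following plain-Python functions take a list of forecasts `p` and a list of observations `a`. The function names and the four-value toy series are hypothetical and are not taken from the study; `r2` assumes the squared Pearson correlation form of the determination coefficient, and `mape` assumes strictly positive observed streamflow.

```python
import math


def rmse(p, a):
    """Root mean square error, Eq. (10)."""
    return math.sqrt(sum((pi - ai) ** 2 for pi, ai in zip(p, a)) / len(a))


def mae(p, a):
    """Mean absolute error, Eq. (12)."""
    return sum(abs(pi - ai) for pi, ai in zip(p, a)) / len(a)


def mape(p, a):
    """Mean absolute percentage error in percent, Eq. (13).

    Assumes every observed value a_i is strictly positive.
    """
    return 100.0 * sum(abs(pi - ai) / ai for pi, ai in zip(p, a)) / len(a)


def nse(p, a):
    """Nash-Sutcliffe efficiency, Eq. (14)."""
    a_bar = sum(a) / len(a)
    residual = sum((pi - ai) ** 2 for pi, ai in zip(p, a))
    variance = sum((ai - a_bar) ** 2 for ai in a)
    return 1.0 - residual / variance


def r2(p, a):
    """Determination coefficient as squared Pearson correlation (assumed form)."""
    p_bar = sum(p) / len(p)
    a_bar = sum(a) / len(a)
    cov = sum((pi - p_bar) * (ai - a_bar) for pi, ai in zip(p, a))
    var_p = sum((pi - p_bar) ** 2 for pi in p)
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    return cov ** 2 / (var_p * var_a)


# Hypothetical toy data: four observed and forecasted streamflow values.
observed = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]

report = {
    "RMSE": rmse(forecast, observed),
    "MAE": mae(forecast, observed),
    "MAPE": mape(forecast, observed),
    "NSE": nse(forecast, observed),
    "R2": r2(forecast, observed),
}
```

In this toy case every forecast is off by 10 units, so RMSE and MAE coincide, while MAPE weights the errors on small flows more heavily, which illustrates why a single indicator can give a partial picture.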
Fig. 16 Scatter plots show the runoff-forecasted results generated by the TD-CNN-LSTM model compared to the ground truth data, spanning 9 forecast hours at the Tunxi data. In particular, the first five forecasted hours closely match the 1:1 line. Furthermore, the R² value for the runoff forecasting exceeds 0.90 for horizons 1 to 8, whereas the R² value for the 9 h forecasting is 0.8797

Figure 18 depicts the weights of the spatial and temporal patterns from the proposed model across the Tunxi data. The six subfigures represent six moments. The hydrological features are represented on the X-coordinate, while the historical moments are on the Y-coordinate. The weights of the spatial and temporal patterns are comparable to the weight of time, which moves gradually and slowly. Figure 18a illustrates the TD-CNN model’s capability to interpret one moment’s local spatiotemporal pattern weights. However, the overall weight’s changing trend is relatively indistinct and slow due to the existence of the spatial pattern weight. Figure 18b-f illustrates the LSTM model’s capability to interpret global spatiotemporal pattern weights over historical moments. Compared with the TD-CNN model, the changing trend of the temporal weight is noticeable. As can be seen, the temporal pattern weight also gradually advances as the historical moments go forward. The TD-CNN-LSTM model forecasts the delay between rainfall and downstream flow (confluence time) accurately because they identify the
areas near the Tunxi or Changhua rivers that should have greater importance than distant areas, given that rainfall near rivers tends to gather quickly, thereby increasing runoff values.

Fig. 17 Histogram of residuals for different flood forecasting models employed at T + 9. Residuals closer to zero indicate accurate predictions, whereas larger residuals suggest more significant prediction errors. The histogram for the TD-CNN-LSTM model exhibits a relatively narrow distribution centered around zero

Based on Fig. 18, we can preliminarily summarize the following conclusions:

1) The TD-CNN model concentrates on the local spatiotemporal pattern weights of the current moment, while the LSTM model focuses on global temporal pattern weights at the different past moments. The increasing trend of the spatial and temporal pattern weights is almost linear and simulates the confluence time.
2) Depending on the basin area and topography, different basins may have different time delays between rainfall and downstream flow.
3) Inaccuracies in the weights of spatiotemporal features may lead to inaccurate forecast results.
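The "local" and "global" scopes in these conclusions follow from how the input is arranged. As a rough illustration, the sketch below applies the min-max normalization of Eq. (9), builds sliding-window samples with 12 input steps and 9 output steps, and splits each input into two 6-step subsequences, which are the sizes stated in the parameter settings. The function names and the toy series are hypothetical, and a single variable is used for brevity although the study uses multivariate data. Each subsequence is what a time-distributed CNN would process on its own (local), while the ordered list of subsequences is what the LSTM would read (global).

```python
def min_max_normalize(values):
    """Scale values to [0, 1] as in Eq. (9); assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def sliding_windows(series, n_in=12, n_out=9):
    """Return (input, output) pairs: n_in past steps followed by n_out future steps."""
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        samples.append((x, y))
    return samples


def split_subsequences(x, n_sub=2):
    """Split one input window into n_sub equal, consecutive subsequences."""
    step = len(x) // n_sub
    return [x[i * step:(i + 1) * step] for i in range(n_sub)]


# Hypothetical univariate streamflow series with 30 hourly values.
flow = [float(v) for v in range(30)]
norm = min_max_normalize(flow)
samples = sliding_windows(norm)

x0, y0 = samples[0]
subs = split_subsequences(x0)  # two 6-step blocks for the time-distributed CNN
```

A 90%/10% chronological split of `samples` would then give the training and test sets described in the parameter settings.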
Fig. 18 Local and global spatiotemporal weights of the TD-CNN-LSTM model on the Tunxi dataset. The six subfigures represent six moments. The hydrological features are represented on the X-coordinate, while the historical moments are on the Y-coordinate. Yellow indicates heavier weights, while blue indicates lighter weights

Discussion

The experiment results showed that TD-CNN-LSTM performs better than the baseline models on most of the indicators used, implying that the proposed model has superior performance in most cases. The model’s capability to effectively capture complex spatiotemporal correlation patterns in sequential data is attributed to wrapping the CNN model with the time-distributed layer, which permits the extraction of local spatiotemporal features over specific time steps. Comparing our results with those of previous studies, we observe a significant improvement in flood prediction accuracy with the TD-CNN-LSTM model. Taking the data in Table 2 as an example, compared with the STA-LSTM model, the average RMSE and MAE on the Tunxi dataset decreased by 12.5% and 17.3%, respectively. On the Changhua dataset, the corresponding decreases were 3% and 13.3%. In addition, the TD-CNN-LSTM model also performs well compared with the CNN-LSTM model. The average RMSE and MAE on the Tunxi dataset decreased by 9.4% and 17.3%, respectively. On the Changhua dataset, the corresponding decreases were 2.9% and 15.3%. Despite the significant advantages demonstrated by the TD-CNN-LSTM model in flood peak simulation and in understanding complex spatial and temporal patterns, we recognize the need for further case studies on various basins with different climates and geographical features to determine the extent to which the proposed framework generalizes across different regions. It is important to note that factors such as hyperparameters, correlation relationship patterns, network components, and model precision may vary depending on the characteristics of the multidimensional time series data used in different regions. However, we believe that the proposed TD-CNN-LSTM framework can improve short-term streamflow prediction for basins with inaccurately gauged univariate historical streamflow data. Additionally, the size of the training dataset makes a significant difference in the model’s performance. It is necessary to improve the prediction accuracy on small datasets by enhancing the extraction of local and global features. Accurate flood forecasting is extremely challenging in small and medium basins due to the complexity of the hydrological cycle. Given these difficulties, it is crucial to investigate and develop deep learning methods designed especially for the complexity of small and medium-sized basins. Future studies should concentrate on exploring the viability of adding more hydrological and meteorological data to improve the TD-CNN-LSTM model’s forecast capability.
Conclusion

In this study, we proposed an intelligent flood forecasting model (TD-CNN-LSTM) that effectively captures the local and global spatiotemporal correlation patterns between time steps of flood time series data to improve model accuracy. Wrapping the CNN model with the time-distributed layer (TD) performed well at capturing local spatial features, while the LSTM model did better at capturing global temporal features between subsequences. Therefore, by integrating these features, the TD-CNN-LSTM model showed superior performance at capturing the complex local and global spatial and temporal features within flood time series data. The experiment results showed that the proposed model performs better than the other comparative models across the Tunxi and Changhua basins under the MAE, RMSE, R², and MAPE evaluation metrics. The results also imply that the quality and quantity of data are strongly correlated with the effectiveness of training. In addition, motivated by the performance of the existing model, we plan to apply time-distributed wrappers to different time series model structures in future research and pave the way for scaling our model to larger basins with complex hydrological characteristics.

Author contributions Conceptualization, H.M.; Methodology, H.M.; Investigation, H.M.; Writing – original draft, H.M., J.F.; Validation, J.F.; Supervision, J.F.; Funding acquisition, J.F.; Writing – review & editing, H.M., J.F., P.S., and Z.A. All authors have read and agreed to the published version of the manuscript.

Funding This work was supported in part by the following projects: The National Key R&D Program of China (Grant No. 2021YFB3900601), Water Conservancy Science and Technology Program of Jiangsu (Grant No. 2023044).

Data availability No datasets were generated or analysed during the current study.

Declarations

Competing interests The authors declare no competing interests.

Conflict of interest The authors declare that they have no conflict of interest.

References

Bafitlhile TM, Li Z (2019) Applicability of ε-support vector machine and artificial neural network for flood forecasting in humid, semi-humid and semi-arid basins in China. Water 11(1):85. https://doi.org/10.3390/w11010085
Boutaghane H, Boulmaiz T, Lameche EK, Lefkir A, Hasbaia M, Abdelbaki C, Moulahoum AW, Keblouti M, Bermad A (2022) Flood analysis and mitigation strategies in Algeria. Wadi Flash floods: challenges and Advanced approaches for Disaster Risk Reduction, pp 95–118. https://doi.org/10.1007/978-981-16-2904-4
Chang L-C, Shen H-Y, Chang F-J (2014) Regional flood inundation nowcast using hybrid SOM and dynamic neural networks. J Hydrol 519:476–489. https://doi.org/10.1016/j.jhydrol.2014.07.036
Chen P-A, Chang L-C, Chang F-J (2013) Reinforced recurrent neural networks for multi-step-ahead flood forecasts. J Hydrol 497:71–79. https://doi.org/10.1016/j.jhydrol.2013.05.038
Chen C, Jiang J, Liao Z, Zhou Y, Wang H, Pei Q (2022) A short-term flood prediction based on spatial deep learning network: a case study for Xi County, China. J Hydrol 607:127535. https://doi.org/10.1016/j.jhydrol.2022.127535
Cheng C-t, Niu W-j, Feng Z-k, Shen J-j, Chau K-w (2015) Daily reservoir runoff forecasting method using artificial neural network based on quantum-behaved particle swarm optimization. Water 7(8):4232–4246. https://doi.org/10.3390/w7084232
Cho M, Kim C, Jung K, Jung H (2022) Water level prediction model applying a long short-term memory (lstm)–gated recurrent unit (gru) method for flood prediction. Water 14(14):2221. https://doi.org/10.3390/w14142221
Dang DD, Anh TN (2023) Coupling duo-assimilation to hydrological model to enhance flood forecasting. J Appl Water Eng Res: 1–13. https://doi.org/10.1080/23249676.2023.2201475
Defontaine T, Ricci S, Lapeyre C, Marchandise A, Le Pape E (2023) Flood forecasting with machine learning in a scarce data layout. IOP Conference Series: Earth and Environmental Science. IOP Publishing, pp 012020
Dehghani A, Moazam HMZH, Mortazavizadeh F, Ranjbar V, Mirzaei M, Mortezavi S, Ng JL, Dehghani A (2023) Comparative evaluation of LSTM, CNN, and ConvLSTM for hourly short-term streamflow forecasting using deep learning approaches. Ecol Inf 75:102119. https://doi.org/10.1016/j.ecoinf.2023.102119
De Gregorio L, Callegari M, Mazzoli P, Bagli S, Broccoli D, Pistocchi A, Notarnicola C (2018) Operational river discharge forecasting with support vector regression technique applied to alpine catchments: results, advantages, limits and lesson learned. Water Resour Manage 32:229–242. https://doi.org/10.1007/s11269-017-1806-3
de la Fuente A, Meruane V, Meruane C (2019) Hydrological early warning system based on a deep learning runoff model coupled with a meteorological forecast. Water 11(9):1808. https://doi.org/10.3390/w11091808
Ding Y, Zhu Y, Feng J, Zhang P, Cheng Z (2020) Interpretable spatio-temporal attention LSTM model for flood forecasting. Neurocomputing 403:348–359. https://doi.org/10.1016/j.neucom.2020.04.110
Dwiasnati S, Devianto Y (2022) Optimization of Flood Prediction using SVM Algorithm to determine Flood Prone Areas. J Syst Eng Inform Technol (JOSEIT) 1(2):40–46. https://doi.org/10.29207/joseit.v1i2.1995
Farahmand H, Xu Y, Mostafavi A (2023) A spatial–temporal graph deep learning model for urban flood nowcasting leveraging heterogeneous community features. Sci Rep 13(1):6768. https://doi.org/10.1038/s41598-023-32548-x
Feng J, Sha H, Ding Y, Yan L, Yu Z (2022) Graph convolution based spatial-temporal attention LSTM model for flood forecasting. 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, pp 1–8. https://doi.org/10.1109/IJCNN55064.2022.9892371
Flack DL, Skinner CJ, Hawkness-Smith L, O’Donnell G, Thompson RJ, Waller JA, Chen AS, Moloney J, Largeron C, Xia X (2019) Recommendations for improving integration in national end-to-end flood forecasting systems: an overview of the FFIR (flooding
from intense rainfall) programme. Water 11(4):725. https://doi.org/10.3390/w11040725
Garg S, Krishnamurthi R (2023) A CNN encoder decoder LSTM model for sustainable wind power predictive analytics. Sustain Comput: Inf Syst 38:100869. https://doi.org/10.1016/j.suscom.2023.100869
Gharbia S, Riaz K, Anton I, Makrai G, Gill L, Creedon L, McAfee M, Johnston P, Pilla F (2022) Hybrid data-driven models for hydrological simulation and projection on the catchment scale. Sustainability 14(7):4037. https://doi.org/10.3390/su14074037
Ghobadi F, Kang D (2022) Improving long-term streamflow prediction in a poorly gauged basin using geo-spatiotemporal mesoscale data and attention-based deep learning: a comparative study. J Hydrol 615:128608. https://doi.org/10.1016/j.jhydrol.2022.128608
Grimaldi S, Schumann GP, Shokri A, Walker J, Pauwels V (2019) Challenges, opportunities, and pitfalls for global coupled hydrologic-hydraulic modeling of floods. Water Resour Res 55(7):5277–5300. https://doi.org/10.1029/2018WR024289
Gude V, Corns S, Long S (2020) Flood prediction and uncertainty estimation using deep learning. Water 12(3):884. https://doi.org/10.3390/w12030884
Hamdoun H, Sagheer A, Youness H (2021) Energy time series forecasting-analytical and empirical assessment of conventional and machine learning models. J Intell Fuzzy Syst 40(6):12477–12502. https://doi.org/10.3233/JIFS-201717
Hu R, Fang F, Pain C, Navon I (2019) Rapid spatio-temporal flood prediction and uncertainty quantification using a deep learning method. J Hydrol 575:911–920. https://doi.org/10.1016/j.jhydrol.2019.05.087
Ibrahim UA, Dan’azumi S (2020) An overview of some hydrological models in water resources engineering systems. Arid Zone J Eng Technol Environ 16(2):285–292
Ji S, Xu W, Yang M, Yu K (2012) 3D convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 35(1):221–231. https://doi.org/10.1109/TPAMI.2012.59
Kan G, He X, Ding L, Li J, Lei T, Liang K, Hong Y (2016) An improved hybrid data-driven model and its application in daily rainfall-runoff simulation. IOP Conference Series: Earth and Environmental Science. IOP Publishing, p 012029. https://doi.org/10.1088/1755-1315/46/1/012029
Kisi O (2015) Streamflow forecasting and estimation using least square support vector regression and adaptive neuro-fuzzy embedded fuzzy c-means clustering. Water Resour Manage 29:5109–5127. https://doi.org/10.1007/s11269-015-1107-7
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551. https://doi.org/10.1162/neco.1989.1.4.541
Li S, Ma K, Jin Z, Zhu Y (2016) A new flood forecasting model based on SVM and boosting learning algorithms. 2016 IEEE Congress on Evolutionary Computation (CEC). IEEE, pp 1343–1348. https://doi.org/10.1109/CEC.2016.7743944
Li L, Wang H, Zhang W, Coster A (2024) STG-Mamba: spatial-temporal graph learning via selective state space model. arXiv preprint arXiv:2403.12418. https://doi.org/10.48550/arXiv.2403.12418
memory approach of deep learning for streamflow simulation. Sustainability 13(23):13384. https://doi.org/10.3390/su132313384
Moishin M, Deo RC, Prasad R, Raj N, Abdulla S (2021) Designing deep-based learning flood forecast model with ConvLSTM hybrid algorithm. IEEE Access 9:50982–50993. https://doi.org/10.1109/ACCESS.2021.3065939
Montaha S, Azam S, Rafid ARH, Hasan MZ, Karim A, Islam A (2022) Timedistributed-cnn-lstm: a hybrid approach combining cnn and lstm to classify brain tumor on 3d mri scans performing ablation study. IEEE Access 10:60039–60059. https://doi.org/10.1109/ACCESS.2022.3179577
Özdemir C (2023) Avg-topk: a new pooling method for convolutional neural networks. Expert Syst Appl: 119892. https://doi.org/10.1016/j.eswa.2023.119892
Pareek PK, Srinivas C, Nayana S, Manasa D (2023) Prediction of floods in Kerala using hybrid model of CNN and LSTM. 2023 IEEE International Conference on Integrated Circuits and Communication Systems (ICICACS). IEEE, pp 01–07. https://doi.org/10.1109/ICICACS57338.2023.10099867
Parisouj P, Mokari E, Mohebzadeh H, Goharnejad H, Jun C, Oh J, Bateni SM (2022) Physics-informed data-driven model for predicting streamflow: a case study of the Voshmgir Basin, Iran. Appl Sci 12(15):7464. https://doi.org/10.3390/app12157464
Pierini NA, Vivoni ER, Robles-Morua A, Scott RL, Nearing MA (2014) Using observations and a distributed hydrologic model to explore runoff thresholds linked with mesquite encroachment in the Sonoran Desert. Water Resour Res 50(10):8191–8215. https://doi.org/10.1002/2014WR015781
Qiao H, Wang T, Wang P, Qiao S, Zhang L (2018) A time-distributed spatiotemporal feature learning method for machine health monitoring with multi-sensor time series. Sensors 18(9):2932. https://doi.org/10.3390/s18092932
Rahman T, Syeed MMA, Farzana M, Namir I, Ishrar I, Nushra MH, Khan BM (2023) Flood prediction using ensemble machine learning model. 2023 5th Int Congress Hum Comput Interact Optim Robotic Appl (HORA). IEEE, pp 1–6. https://doi.org/10.1109/HORA58378.2023.10156673
Raj JR, Charless I, Latheef MA, Srinivasulu S (2021) Identifying the flooded area using deep learning model. 2021 2nd International Conference on Intelligent Engineering and Management (ICIEM). IEEE, pp 582–586. https://doi.org/10.1109/ICIEM51511.2021.9445356
Ruma JF, Adnan MSG, Dewan A, Rahman RM (2023) Particle swarm optimization based LSTM networks for water level forecasting: a case study on Bangladesh river network. Results Eng 17:100951. https://doi.org/10.1016/j.rineng.2023.100951
Salman AG, Kanigoro B, Heryadi Y (2015) Weather forecasting using deep learning techniques. 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS). IEEE, pp 281–285. https://doi.org/10.1109/ICACSIS.2015.7415154
Sankaranarayanan S, Prabhakar M, Satish S, Jain P, Ramprasad A, Krishnan A (2020) Flood prediction based on weather parameters using deep learning. J Water Clim Change 11(4):1766–1783. https://doi.org/10.2166/wcc.2019.321
Lim FH, Lee W-K, Osman S, Lee ASP, Khor WS, Ruslan NH, Ghazali Schmidhuber J, Hochreiter S (1997) Long short-term memory. Neural
NHM (2022) Multi-model approach of data-driven flood fore- Comput 9(8):1735–1780. https://doi.org/10.1162/neco.1997.9.8.
casting with error correction for large river basins. Hydrol Sci J 1735
67(8):1253–1271. https://doi.org/10.1080/02626667.2022.20647 Shao P, Feng J, Lu J, Zhang P, Zou C (2024) Data-driven and knowl-
54 edge-guided denoising diffusion model for flood forecasting.
Liu R, Ye C, Yang P, Miao Z, Liu B, Chen Y (2022) Short-term predic- Expert Syst Appl 244:122908. https://doi.org/10.1016/j.eswa.
tion model of water level based on ATT-ConvLSTM. 2022 the 5th 2023.122908
International Conference on Data Storage and Data Engineering, Shu X, Ding W, Peng Y, Wang Z, Wu J, Li M (2021) Monthly
pp 85–90. https://doi.org/10.1145/3528114.3528128 streamflow forecasting using convolutional neural network.
Mirzaei M, Yu H, Dehghani A, Galavi H, Shokri V, Mohsenzadeh Water Resour Manage 35:5089–5104. https://doi.org/10.1007/
Karimi S, Sookhak M (2021) A novel stacked long short-term s11269-021-02961-w
Slimi A, Nicolas H, Zrigui M (2022) Hybrid time distributed CNN-transformer for speech emotion recognition. Proceedings of the 17th International Conference on Software Technologies ICSOFT, Lisbon, pp 11–13
Sm B, Kp S (2023) Investigating the potential of a 1-D hydrodynamic model for flood inundation modeling and hazard mapping. Copernicus Meetings. https://doi.org/10.5194/egusphere-egu23-362
Vilares Ferro M, Mosquera D, Ribadas Pena Y, Darriba Bilbao VM (2023) Early stopping by correlating online indicators in neural networks. Neural Netw 159:109–124. https://doi.org/10.1016/j.neunet.2022.11.035
Walczykiewicz T, Skonieczna M (2020) Rainfall flooding in urban areas in the context of geomorphological aspects. Geosciences 10(11):457. https://doi.org/10.3390/geosciences10110457
Wang J, Cao Y, Li J, Ji C (2021) Flood forecasting method of small and medium-sized watershed based on convolutional neural network. Journal of Physics: Conference Series. IOP Publishing, p 012083. https://doi.org/10.1088/1742-6596/1757/1/012083
Won Y-M, Lee J-H, Moon H-T, Moon Y-I (2022) Development and application of an urban flood forecasting and warning process to reduce urban flood damage: a case study of Dorim river basin, Seoul. Water 14(2):187. https://doi.org/10.3390/w14020187
Yang Y, Chui TFM (2019) Hydrologic Performance Simulation of Green Infrastructures: Why Data-Driven Modelling Can Be Useful? New Trends in Urban Drainage Modelling: UDM 2018 11. Springer, pp 480–484. https://doi.org/10.1007/978-3-319-99867-1_82
Zhang Z, Zhang Q, Singh VP, Shi P (2018) River flow modelling: comparison of performance and evaluation of uncertainty using data-driven models and conceptual hydrological model. Stoch Environ Res Risk Assess 32:2667–2682. https://doi.org/10.1007/s00477-018-1536-y
Zhou Y, Cui Z, Lin K, Sheng S, Chen H, Guo S, Xu C-Y (2022) Short-term flood probability density forecasting using a conceptual hydrological model with machine learning techniques. J Hydrol 604:127255. https://doi.org/10.1016/j.jhydrol.2021.127255
Zhu Y, Feng J, Yan L, Guo T, Li X (2020) Flood prediction using rainfall-flow pattern in data-sparse watersheds. IEEE Access 8:39713–39724. https://doi.org/10.1109/ACCESS.2020.2971264
Zucco G, Tayfur G, Moramarco T (2015) Reverse flood routing in natural channels using genetic algorithm. Water Resour Manage 29:4241–4267. https://doi.org/10.1109/ACCESS.2021.3065939

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.