High Resolution Seismic Waveform Generation Using Denoising Diffusion
Key Points:
• A novel generative latent denoising diffusion model generates realistic synthetic
seismic waveforms with frequency content up to 50 Hz.
• The model predicts peak amplitudes at least as accurately as local ground mo-
tion models, and with the same variability as in real data.
• We introduce an open-source Python library for using the pre-trained model and
for training new generative models.
Abstract
Accurate prediction and synthesis of seismic waveforms are crucial for seismic hazard as-
sessment and earthquake-resistant infrastructure design. Existing prediction methods,
such as Ground Motion Models and physics-based simulations, often fail to capture the
full complexity of seismic wavefields, particularly at higher frequencies. This study in-
troduces a novel, efficient, and scalable generative model for high-frequency seismic wave-
form generation. Our approach leverages a spectrogram representation of seismic wave-
form data, which is reduced to a lower-dimensional submanifold via an autoencoder. A
state-of-the-art diffusion model is trained to generate this latent representation, condi-
tioned on key input parameters: earthquake magnitude, recording distance, site condi-
tions, and faulting type. The model generates waveforms with frequency content up to
50 Hz. Any scalar ground motion statistic, such as peak ground motion amplitudes and
spectral accelerations, can be readily derived from the synthesized waveforms. We val-
idate our model using commonly used seismological metrics, and performance metrics
from image generation studies. Our results demonstrate that our openly available model
can generate distributions of realistic high-frequency seismic waveforms across a wide
range of input parameters, even in data-sparse regions. For the scalar ground motion statis-
tics commonly used in seismic hazard and earthquake engineering studies, we show that
the model accurately reproduces both the median trends of the real data and its vari-
ability. To evaluate and compare the growing number of this and similar ’Generative Wave-
form Models’ (GWM), we argue that they should generally be openly available and that
they should be included in community efforts for ground motion model evaluations.
1 Introduction
The study and prediction of earthquake ground motions are central to seismology.
Wavefield models across scales and frequencies are required to assess seismic hazard and
the response of critical infrastructure to ground motion. State-of-the-art seismic hazard
models use empirical ground motion models (GMMs) to estimate the expected level of
ground shaking (i.e., intensity measures) at a site, given earthquake and site properties.
Other applications require the prediction of full time histories of ground
motion at sites of interest. An important example is nonlinear structural dynamic anal-
ysis and performance-based earthquake engineering (Chopra, 2007; Applied Technology
Council, 2009; Smerzini et al., 2024).
Predicting ground motion is challenging, and existing methods for ground motion
synthesis have specific limitations. GMMs are empirical regression models that best fit
the observed data as functions of first- and second-order predictor variables, such as mag-
nitudes, recording distances, faulting mechanisms, and site conditions (Douglas, 2003;
Boore et al., 2014) and are often developed for specific regions or tectonic settings. They
reduce the full wavefield to scalar properties like peak amplitudes or spectral accelera-
tions, and are data-driven rather than physics-based, although the design of GMMs can
be guided by physical considerations, e.g. via the functional form of distance-attenuation
terms (Baker et al., 2021). While the different predicted variables in seismic waveforms
are inherently correlated and physically linked, traditional GMMs have often considered
them independently during the optimization of regression model parameters. Some more
recent models account for these correlations by employing multivariate statistical tech-
niques (Baker & Bradley, 2017; Baker et al., 2021).
Jayalakshmi et al. (2021)). Classic hybrid methods can suffer from parameterization
issues at the transition between the two models, where the amplitude spectra of the two
signals are easily matched but the phase spectrum is not (Mai & Beroza, 2003; Graves & Pitarka, 2010). Hybrid mod-
els provide the opportunity to generate earthquake scenarios for a limited number of rup-
tures in relatively well-studied regions, typically targeting large-magnitude events and
near-fault distances (e.g. Paolucci, Smerzini, and Vanini (2021)). However, they face sim-
ilar challenges to those inherent to deterministic simulation, namely high computational
cost and uncertainties in source characterization and seismic velocity models (Hartzell
et al., 1999; Douglas & Aochi, 2008).
Some of the current limitations can potentially be overcome with machine learn-
ing techniques. In earthquake engineering, machine learning models have been used for
the prediction of peak ground acceleration and response or Fourier spectra (e.g. Derras
et al. (2012); Esfahani et al. (2021); Jozinović et al. (2022); Lilienkamp et al. (2022)).
Alternatively, neural networks can be used to enhance simulated waveforms. For instance,
Paolucci et al. (2018) generated band-limited waveforms with physics-based earthquake
simulations and then trained an artificial neural network to predict and add higher fre-
quency content via their amplitude spectra. Similarly, Gatti and Clouteau (2020) used
generative adversarial networks (GAN) to extract high-frequency features from ground-
motion records and use them to enhance low-frequency physics-based simulated time se-
ries, while Aquib and Mai (2024) used a combination of GAN and Fourier Neural Op-
erator (FNO) for a similar purpose. GANs have also been used for seismic data augmen-
tation in earthquake detection problems (Y. Li et al., 2020; Wang et al., 2021), and to
train signal/noise discriminators for Earthquake Early Warning algorithms (Z. Li et al.,
2018).
In this study, we present a novel latent denoising diffusion model for the conditional
synthesis of seismic waveforms. The model uses an autoencoder to map spectrogram representations of the waveforms into a lower-dimensional latent space, in which a conditional diffusion model is trained (see Section 2).
2 Methods
Our approach to generating high-resolution seismic waveforms with a latent dif-
fusion model comprises three primary components. First, we transform the seismic
waveforms into spectrogram representations, which are more amenable to generative mod-
eling than time-domain signals. Second, we employ a convolutional variational au-
toencoder to compress these high-dimensional spectrograms into a lower-dimensional la-
tent space. Third, we train a denoising diffusion model based on a U-Net architecture
to generate samples within this latent space, which are then mapped back to the spec-
trogram representation and converted to waveforms during inference. In this section we
provide a comprehensive description of each component of the generative pipeline (Fig-
ure 1). Detailed explanations of the neural network architectures, which we omit here
for brevity, are given in Appendix A1.
The stochastic nature of the training objective in denoising diffusion models (see
Section 2.3), along with their iterative sampling process, makes both training and infer-
ence computationally intensive, particularly with high-dimensional data like high-frequency
seismic waveforms. To mitigate this, we adopt the two-stage approach of Rombach et
al. (2022). First, we train an autoencoder to compress the data into a more manageable,
lower-dimensional latent space, then use denoising diffusion to generate the latent vari-
ables. This combines the autoencoder’s data compression efficiency with the generative
capabilities of denoising diffusion models.
Formally, let pdata denote the data distribution density. We model the distribution
over latent variables z ∈ Rm as a mixture of Gaussians:
\[ p(z) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[ p_{\mathrm{enc}}(z \mid x) \right], \qquad p_{\mathrm{enc}}(z \mid x) = \mathcal{N}\!\left(z \mid \mu_\phi(x), \operatorname{diag}\!\left(\sigma_\phi(x)^2\right)\right). \tag{1} \]
Here, the encoder is defined by its mean and standard deviation functions µϕ , σϕ : Rn →
Rm , parameterized by a neural network with parameters ϕ. The network has two out-
put heads: one for the mean and one for the standard deviation. A deterministic decoder,
Gψ : Rm → Rn , maps latent variables back to the original data space. The encoder
and decoder are trained jointly to minimize the reconstruction loss
\[ \mathbb{E}_{p_{\mathrm{enc}}(z \mid x)} \left\| x - G_\psi(z) \right\|^{2} \tag{2} \]
over the data distribution. Additionally, we regularize the latent space by the Kullback-
Leibler divergence between the encoder distribution and a standard normal distribution.
Since we employ the autoencoder solely for data compression rather than as a genera-
tive model, we set the regularization strength to a small value ($10^{-6}$) to ensure high re-
construction quality.
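The training objective can be summarized in a few lines. The following is a minimal sketch under stated assumptions: the encoder/decoder interfaces and names are illustrative, not the library's actual API.

```python
import torch

def vae_step(x, encoder, decoder, kl_weight=1e-6):
    """Reconstruction loss (Eq. 2) plus a weakly weighted KL regularizer.

    Assumed interfaces: `encoder(x)` returns the posterior mean and
    log-variance (two output heads); `decoder(z)` maps latents back to
    the data space.
    """
    mu, logvar = encoder(x)
    std = torch.exp(0.5 * logvar)
    z = mu + std * torch.randn_like(std)          # reparameterization trick
    recon = torch.mean((x - decoder(z)) ** 2)     # Eq. (2)
    # KL divergence between N(mu, std^2) and a standard normal
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl
```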
the backward process commences from this noise distribution and aims to invert the ef-
fect of the forward process, recovering samples that approximately follow the data dis-
tribution. For conditional generation, both processes are modeled as conditional processes.
We parameterize the backward process by a neural network, which is trained to predict
the original sample, or equivalently the added noise, from the perturbed sample and the
conditioning parameters.
Specifically, our data distribution is a mixture of distributions conditioned on the
first-order properties of the waveforms (earthquake magnitude, hypocentral distance, site
conditions, and faulting type), represented as a vector c ∈ Rc :
\[ p_{\mathrm{data}}(x) = \mathbb{E}_{p(c)}\left[ p_{\mathrm{data}}(x \mid c) \right]. \tag{3} \]
We focus on modeling the conditional distributions pdata (x|c), assuming the conditional
parameters are given at inference. For a fixed conditioning vector c, we obtain a latent
sample $z \sim \mathbb{E}_{p_{\mathrm{data}}(x \mid c)}\left[ p_{\mathrm{enc}}(z \mid x) \right]$ by passing a sample from the conditional data distri-
bution pdata (x|c) through the stochastic encoder. We denote the corresponding noise-
perturbed samples at time 0 < t ≤ T as zt . These samples are obtained by evolving
z = z0 through the forward process described by the Itô stochastic differential equa-
tion
\[ \mathrm{d}z_t = \mu_t(z_t)\, \mathrm{d}t + \sigma_t\, \mathrm{d}w, \tag{4} \]
where µt : Rm → Rm is a time-dependent drift function, σt > 0 is a time-varying dif-
fusion coefficient, and w denotes the standard Wiener process. For sufficiently large T ,
this process converges to a Gaussian distribution. Synthetic data can then be generated
by sampling from this Gaussian distribution and evolving the sample back to the latent
representation distribution using the backward process
\[ \mathrm{d}z_t = \left[ \mu_t(z_t) - \sigma_t^{2} \nabla_{z_t} \log p_t(z_t \mid c) \right] \mathrm{d}t + \sigma_t\, \mathrm{d}\bar{w}, \tag{5} \]
where w̄ denotes the standard Wiener process with reversed time direction, and pt (zt |c)
is the density of the sample zt given the conditioning c. The score function ∇zt log pt (zt |c)
is generally intractable but can be approximated using a denoising score matching loss (Vincent,
2011; Song et al., 2021).
Song et al. (2021) also identify a deterministic process with the same marginal dis-
tribution as the reverse process, characterized by the probability flow ordinary differen-
tial equation:
\[ \mathrm{d}z_t = \left[ \mu_t(z_t) - \tfrac{1}{2} \sigma_t^{2} \nabla_{z_t} \log p_t(z_t \mid c) \right] \mathrm{d}t. \tag{6} \]
In practice, integrating this deterministic process enables efficient sampling, as simulat-
ing the stochastic process typically requires more time discretization steps.
For our GWM, we adopt the parameterizations proposed by Karras et al. (2022),
which excel in image generation tasks. Specifically, we utilize their variance-exploding
version by setting $\mu_t = 0$ and $\sigma_t = \sqrt{2t}$. The forward and backward processes are then
given by
\[ \mathrm{d}z_t = \sqrt{2t}\, \mathrm{d}w, \qquad \mathrm{d}z_t = -2t\, \nabla_{z_t} \log p_t(z_t \mid c)\, \mathrm{d}t + \sqrt{2t}\, \mathrm{d}\bar{w}. \tag{7} \]
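Combining this variance-exploding choice with the probability flow ODE (Eq. 6), whose drift then reduces to $-t\,\nabla_{z_t}\log p_t(z_t \mid c)$, yields a simple Euler sampler. The sketch below is illustrative; `score_fn` is an assumed interface standing in for the trained denoiser.

```python
import torch

@torch.no_grad()
def sample_probability_flow(score_fn, c, shape, ts):
    """Euler integration of the probability-flow ODE (Eq. 6) under the
    variance-exploding choice mu_t = 0, sigma_t = sqrt(2t), where it
    reduces to dz_t = -t * score(z_t, t, c) dt.

    `score_fn(z, t, c)` approximates the conditional score (assumed
    interface); `ts` is a decreasing grid of times from T down to ~0.
    """
    z = ts[0] * torch.randn(shape)                 # z_T ~ N(0, T^2 I)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        dz_dt = -t_cur * score_fn(z, t_cur, c)     # drift of the flow ODE
        z = z + dz_dt * (t_next - t_cur)           # Euler step (dt < 0)
    return z
```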
Integrating the forward process from time 0 to t yields the transition distribution
$p_t(z_t \mid z_0, c) = p_t(z_t \mid z_0) = \mathcal{N}(z_t \mid z_0, t^2 I)$. We can sample from this distribution by adding Gaussian
noise with a variance of $t^2$ to the original sample $z_0$. For normalized data, as T becomes
4 Results
The design goal of the Generative Waveform Model (GWM) is to synthesize ground
motion records that are statistically indistinguishable from real records, across a wide
range of frequencies and conditioning parameters, namely source magnitudes, hypocen-
tral distances, VS30 , and faulting type. In the following section, we discuss the extent
to which the Denoising Diffusion Model can achieve that goal and compare its outputs
(i) to real data and (ii) to commonly used Ground Motion Models (GMMs). First, we
compare the distributions of time-domain signal envelopes (Section 4.1) and of Fourier
Amplitude Spectra (Section 4.2) between the real data and the GWM synthetics. Next,
we evaluate how well the GWM predicts scalar ground motion intensity measures, in terms
of prediction accuracy (Section 4.3.1), and variability (Section 4.3.2). Then we compare
shaking durations of real data and GWM synthetics (Section 4.3.3), and the scaling be-
tween peak amplitude statistics and the conditioning predictor variables (Sections 4.3.5
- 4.3.7). Finally, we evaluate relative and absolute model performances for different mag-
nitude and hypocentral distance ranges, by computing average model probabilities (Sec-
tion 4.4), and performance metrics from the image generation domain (Section 4.5).
For each real seismogram, we use the trained model to produce a number of cor-
responding synthetic seismograms with the conditioning parameters of the real record,
i.e. their magnitude, hypocentral distance, VS30 value and faulting type. We can then
directly compare the real seismograms to their corresponding synthetic realizations (Fig-
ure 2). The GWM synthetics appear to capture the first-order characteristics of the real
seismograms: they have clear P- and S-phases with realistic phase arrival time differences,
as well as realistic coda wave decays. For the same conditioning parameter choices, there
is a considerable amount of variability between individual realizations, for example in
terms of peak amplitudes. In Section 4.3.2 we show that this variability closely matches
the variability observed in the real data.
To compare the real and synthetic waveforms quantitatively, we compute signal en-
velope time series for both sets. The signal envelopes are obtained by taking the mov-
ing average of the absolute waveform signals with a kernel size of 128, followed by a log-
arithmic transformation. This comparison shows that the GWM synthetics closely match
the first-order time-domain shapes of the real seismograms, across
the entire range of magnitudes and recording distances for which the model was trained
(Figures 3a and 3b). The low-noise amplitudes before the P-wave onsets are followed by
an impulsive P-wave amplitude increase. This amplitude growth is similar for both small
and large magnitudes until the smaller magnitude records reach their maximum P-wave
amplitude, whereas the large magnitude records continue to grow. The later-arriving S-
and surface waves cause additional amplitude growth in the real waveforms, which is ac-
curately mimicked by the GWM synthetics. The variability of the envelopes in each bin
is symmetric around the median in log-space and is of the same order for both real and
synthetic data. Additional figures in the supplementary material show different bins, and
separate evaluations of North-South and vertical components (supplementary Figures
S1 to S7).
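As an illustration, the envelope computation described above can be sketched in a few lines; the small epsilon guarding the logarithm is an assumption.

```python
import numpy as np

def log_envelope(x, kernel=128):
    """Log signal envelope: moving average of |x| with a kernel of 128
    samples, followed by a logarithmic transformation. The epsilon (an
    assumption) avoids log(0) in quiet segments.
    """
    env = np.convolve(np.abs(x), np.ones(kernel) / kernel, mode="same")
    return np.log10(env + 1e-10)
```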
Figure 3: First order seismogram characteristics. (a) and (b) Distribution of time-domain
envelopes for East-West component seismograms in different magnitude bins in terms of
the mean (solid line) and the standard deviation (shaded areas) for real data and GWM
synthetics. (c) and (d) Distribution of Fourier spectra log-amplitudes for East-West com-
ponent seismograms in different magnitude bins. The sample counts for each bin, in
ascending order of magnitude, are 58,302, 42,264, 39,033, 44,232, and 13,539.
For earthquake engineering and seismic hazard applications, peak ground motion
amplitude statistics are of particular importance. Here, we evaluate how various peak
amplitude statistics of the GWM synthetics compare with the real data, how they cor-
relate with the conditioning predictor variables, and how they compare with predictions
from widely used Ground Motion Models (GMMs). Specifically, we compute peak ground
acceleration (PGA), peak ground velocity (PGV), and pseudo-spectral acceleration (SA)
for both real data and GWM synthetics. We use the orientation-independent GMRotD50
statistic (Boore et al., 2006), which represents the median of the horizontal components,
rotated over all possible rotation angles. We utilize GMMs from Boore et al. (2014), op-
timized for a global database of shallow crustal earthquakes in active tectonic regions
with M 3.0–7.9 events, and from Kanno et al. (2006), which used a database of strong
ground motion records from shallow crustal earthquakes in Japan between 1963 and 2003.
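For reference, a minimal sketch of the GMRotD50 peak-amplitude computation follows the definition of Boore et al. (2006); the implementation details here are our own.

```python
import numpy as np

def gmrotd50(acc_ns, acc_ew):
    """GMRotD50 peak amplitude (Boore et al., 2006): the median, over all
    non-redundant rotation angles, of the geometric mean of the peak
    amplitudes of the two rotated horizontal components.
    """
    peaks = []
    for deg in range(90):                        # 0-89 degrees
        a = np.deg2rad(deg)
        r1 = acc_ns * np.cos(a) + acc_ew * np.sin(a)
        r2 = -acc_ns * np.sin(a) + acc_ew * np.cos(a)
        peaks.append(np.sqrt(np.max(np.abs(r1)) * np.max(np.abs(r2))))
    return np.median(peaks)
```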
Figure 4: Model bias as a function of hypocentral distance for the generative waveform
model (blue), GMMs by Boore et al. (2014) (violet), and Kanno et al. (2006) (red) for
PGA (a, c, and e) and PGV (b, d, and f), with respect to real data. Colored lines repre-
sent the mean of the ratio in 50 distance bins of 3.67 km width. The bars represent the
standard deviation in each bin.
Another important criterion for ground motion prediction methods is that the vari-
ability in predicted peak amplitudes is accurately characterized. To evaluate the vari-
ability of the GWM predictions, we compare their total standard deviation to the vari-
ability in the real data, and to the predictions of the two GMMs.
To measure prediction residuals, we fit a simple, custom GMM to the PGA and
PGV of the real data, as a function of magnitude M , hypocentral distance R, and VS30 ,
using ordinary least squares (equations 10 and 11).
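A minimal sketch of such an ordinary-least-squares fit is given below; the functional form is an assumption for illustration, since the fitted equations (10 and 11) are not reproduced here.

```python
import numpy as np

def fit_simple_gmm(mag, r_hyp, vs30, log10_amp):
    """OLS fit of a simple GMM. The functional form is an assumption:
        log10(amp) = a + b*M + c*log10(R) + d*log10(VS30)
    Returns the coefficients and the total standard deviation of the
    residuals (the quantity compared in Figure 5).
    """
    X = np.column_stack([np.ones_like(mag), mag,
                         np.log10(r_hyp), np.log10(vs30)])
    coef, *_ = np.linalg.lstsq(X, log10_amp, rcond=None)
    resid = log10_amp - X @ coef
    return coef, resid.std()
```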
Figure 5: Histogram of (a) PGA and (b) PGV residuals showing the spread of the real
data (black), of the GWM synthetics (red), of the Boore et al. (2014) GMM (blue), and of
the Kanno et al. (2006) GMM (green), with respect to the simple fitted ground motion
models (equations 10 and 11) on a log10 scale. The box plot shows the median values,
quantiles, and extreme values.
Figure 6: Shaking duration estimated using cumulative Arias Intensity (cAI). (a) cAI for
a real example waveform (black line) and 100 GWM synthetics (red lines). Triangle-right
and triangle-down symbols represent 5% and 95% of the maximum cAI for the real data
(white) and GWM synthetics (red), respectively. (b) Shaking duration for real data (grey
circles) and one GWM synthetic per real record (red triangles), with corresponding condi-
tioning parameters. For each magnitude bin (every 0.08) from M 4.5 - 7.4, grey dots and
lines show the mean and standard deviation of the real data, while blue triangles and lines
show the mean and standard deviation of the GWM synthetics.
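The shaking duration shown in Figure 6 follows from the cumulative Arias Intensity (Arias, 1970); a minimal sketch of the 5%-95% duration computation:

```python
import numpy as np

def significant_duration(acc, dt, g=9.81):
    """Shaking duration from cumulative Arias Intensity (cAI):
    cAI(t) = pi/(2g) * cumsum(acc^2) * dt; the duration is the time
    between 5% and 95% of the final cAI, as in Figure 6.
    """
    cai = np.pi / (2 * g) * np.cumsum(acc ** 2) * dt
    t05 = np.searchsorted(cai, 0.05 * cai[-1]) * dt   # cAI is monotone
    t95 = np.searchsorted(cai, 0.95 * cai[-1]) * dt
    return t95 - t05
```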
with GMMs. For instance, we can generate n synthetic waveforms for a set of condition-
ing parameters, and then compute the median and standard deviation of, e.g., PGA. It
takes on the order of 60 GWM realizations for the median and the standard deviation
estimates to stabilize (Figure 7a). To establish this we generate between 1 and 100 re-
alizations using M = 5.5, R = 50 km, VS30 = 500 m/s, and faulting type = 1 as con-
ditioning parameters, and analyze the median and standard deviation of the correspond-
ing PGA values. Furthermore, we compute the Shapiro-Wilk test statistic (Figure 7b),
to confirm that the peak amplitude predictions from the GWM are indeed log-normally
distributed, as is the case for real data. This test statistic compares a data distribution
to a normal distribution by evaluating
\[ W = \left( \sum_{i=1}^{n} a_i x_{(i)} \right)^{2} \Big/ \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^{2}, \]
i.e., the ratio of the squared weighted sum of the sorted sample values $x_{(i)}$ to the sum
of squared deviations from the sample mean, where the $a_i$ are the weights. A value close
to 1 indicates normally distributed data and values close to 0 imply non-normal distri-
bution. The test values for this particular set of conditioning parameters exceed 98% at
n > 40 and then remain stable, suggesting that the model has correctly learned to gen-
erate waveforms with log-normally distributed peak amplitudes. This is a first-order char-
acteristic of real data, and it also implies that we can accurately represent the ampli-
tude distributions with only a mean and a standard deviation.
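In practice, this check is a one-liner with scipy; the values below are placeholders, not data from this study.

```python
import numpy as np
from scipy import stats

# Hypothetical check: are the peak amplitudes of GWM realizations at fixed
# conditioning parameters log-normally distributed?
rng = np.random.default_rng(0)
pga = rng.lognormal(mean=-4.0, sigma=0.6, size=60)  # placeholder values

w_stat, p_value = stats.shapiro(np.log10(pga))      # test the log-amplitudes
print(f"Shapiro-Wilk W = {w_stat:.3f}")             # W near 1: consistent
                                                    # with (log-)normality
```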
In a similar sense, we can investigate how the GWM predictions grow with mag-
nitude. We produce 100 GWM synthetics for a vector of magnitudes, evenly spaced ev-
ery 0.035 from M 4.5 to 8, for fixed distance and VS30 values. The predictions for SA
at T = 0.1 s show relatively smooth, monotonic growth, up to saturation at M 7.0
(Figure 9a), consistent with the (very sparse) real data. The same trend is observed for
other conditioning parameter combinations (supplementary Figures S19 - S24). It is in-
teresting, and encouraging, that the GWM predictions are well-behaved in condition-
ing parameter ranges where the training data are very sparse, such as at M 6.5 - 7.5,
or for R < 20 km.
Figure 7: Statistics of the GWM realizations. a) Median and standard deviation of PGA
values with different numbers of GWM realizations. b) Shapiro-Wilk test statistic for dif-
ferent numbers of realizations.
150 m/s to 1500 m/s (Figure 9b). Generally, the GWM predictions follow the real data
distribution, with SA values decreasing with increasing VS30 . Towards very low VS30 , the
GWM synthetics somewhat underpredict the strong growth of SA. Interestingly, for SA
at T ≥ 1.0 s, the SA values decrease up to VS30 = 800 m/s, then remain stable up to
approximately VS30 = 1200 m/s, and then decrease again for larger VS30 values. This
is observed for a wide range of magnitude and distance bins (supplementary Figures S25-
S30). To what extent we expect SA to correlate with VS30 somewhat depends on the known
limitations of VS30 as a site response proxy (Bergamo et al., 2019).
where $\mu_k$ and $\sigma_k$ are the predicted mean and standard deviation for the k-th bin. This
probability is high only if the model accurately predicts both the mean and standard de-
viation of the data. Therefore, with these probabilities, we can readily assess the agree-
ment between the data and model predictions, for a large number of conditioning pa-
rameter combinations.
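A hedged sketch of this bin-wise evaluation is given below; the exact form of Eq. (12) is not reproduced in this excerpt, and the Gaussian density mirrors its described intent.

```python
import numpy as np
from scipy import stats

def average_model_probability(log_sa_obs, mu_k, sigma_k):
    """Average probability of observed log-SA values in one magnitude-
    distance bin under the model's Gaussian prediction N(mu_k, sigma_k^2).
    The expression is an assumption mirroring the intent of Eq. (12).
    """
    return float(np.mean(stats.norm.pdf(log_sa_obs, loc=mu_k, scale=sigma_k)))
```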
We compute these model probabilities for the GWM (PGW M ) and for the two GMMs
(PGM M ). We use the GMMs to compute the mean SA for the center of each bin, and
use their reported standard deviations. For the GWM, we use 100 GWM realizations
to compute the mean and standard deviation for each bin, likewise using the magnitude
and distance of each bin center. We use faulting type = 1, and repeat the computations
for a number of VS30 values, to compute the average probabilities for each bin following
Eq. 12. Figure 10 shows the average model probabilities for the GMM by Kanno et al. (2006)
(top row) and the GWM (middle row). To compare the two models we compute the prob-
ability ratio r = PGM M /PGW M (bottom row). Equivalent figures for the Boore et al.
(2014) GMM are shown in supplementary Figures S31, S35, and S39.
In general, for the GWM and both GMMs, the average probabilities are highest
for low magnitudes and large hypocentral distances, and lowest for large magnitudes and
short recording distances. For VS30 = 240 m/s and SA with T = 1.0 s (Figure 10a),
the GMM by Kanno et al. (2006) shows comparable accuracy to the GWM in most bins,
except for R > 100 km. The GMM by Boore et al. (2014) is comparable to the GWM
for small magnitudes and short hypocentral distances, or large magnitudes and large hypocen-
tral distances (supplementary Figure S31a). At the period T = 1.0 s, the GWM performs
better than any of the GMMs considered in this study (supplementary Figures S31 and
S32). These patterns are also observed for higher VS30 values (Figures 10b,10c and sup-
plementary Figures S33 - S40).
Figure 10: Average model probabilities given the SA data at T = 1.0 s, in bins of magni-
tude and R, for VS30 = 240 m/s (a), 800 m/s (b), and 1360 m/s (c), for the Kanno et al.
(2006) GMM (top row) and the GWM (middle row). The ratio between the two probabil-
ities (bottom row) shows which model explains the observed SA data better.
performance metrics from the image generation community, and could play an impor-
tant role for a systematic and quantitative comparison of various proposed GWMs, e.g.
in the framework of future community efforts for GWM evaluations.
Computing the FD for the entire data set, we find that it ranges from 64 to 78 for
the three spatial components of the seismograms (Table 1). To provide a baseline for these
FD values, we also compute the FD between the training and test sets (Section 3), i.e.
between sub-sets of the real data. These resulting FDs are 3 to 4 times smaller. This pro-
vides a baseline for ideal model performance, and suggests that, despite the good per-
formance shown in sections 4.1 - 4.4, there are significant differences between real and
synthetic spectra and that there is, therefore, room for model improvement. Furthermore,
we also use this FD metric to show that the model performance decreases substantially
if we leave out the auto-encoder, or if we choose a signal representation other than spec-
trograms (see ablation studies in Appendix B).
In addition, we can use the FD to compare the waveform generation quality for differ-
ent magnitude and distance bins (Figure 11a and supplementary Figure S42). The FDs
are systematically higher for larger magnitude and shorter distance recordings, indicat-
ing a poorer fit between real and GWM synthetic data. This may be due to the relative
scarcity of training data, and/or due to the inherently higher complexity of these records,
making them more challenging for a model to mimic.
Table 1: Fréchet Distances (FD) of the Fourier amplitude spectra, classifier accuracies
and FDs of classifier embeddings, between real data and GWM synthetics. The metrics
are computed between the real data and corresponding GWM synthetics (first row), and
between the training and test sets (second row).
Figure 11: Fréchet Distance (FD) between real data and GWM synthetics in bins of
magnitude and hypocentral distance, for the Fourier amplitude spectra of the East-West
component seismograms (a), classifier accuracy on the GWM synthetics (b), and FD of
the classifier embeddings between real data and GWM synthetics, in bins of magnitude
and hypocentral distance (c).
Inspired by the common practice in image synthesis to evaluate the quality of gen-
erated images with a pre-trained classifier (Heusel et al., 2017), we adopt a similar ap-
proach: we train a classifier to categorize seismic data into bins of magnitude and dis-
tance. We divide our dataset into five magnitude and five distance bins, resulting in 25
classes, with each class containing a similar number of samples (Appendix A3). We then
train a classifier to predict the magnitude-distance bin for each record. For the classi-
fier, we use a convolutional neural network (CNN) architecture that, like the GWM, op-
erates on spectrogram representations of the waveforms (Appendix A1). The classifier
achieves a test accuracy of 57.67%, correctly predicting the magnitude-distance bin for
this fraction of real waveforms. When applied to synthetic waveforms, the classifier’s ac-
curacy is 44.48% (Table 1). Ideally, if synthetic waveforms were indistinguishable from
real ones, the classifier would maintain the same accuracy for both. Although not per-
fect, the classifier’s performance on synthetic waveforms strongly exceeds the 4% = 1/25
accuracy expected from random guessing, indicating that the synthetic waveforms en-
capsulate substantial information about first-order statistics, enabling the classifier to
make informed predictions. Figure 11b illustrates the accuracy achieved on the gener-
ated dataset across different magnitude-distance bins. Notably, some of the most diffi-
cult parameter ranges have high classification accuracy.
For similar input data, not only should the classifier's performance be compara-
ble, but its internal representations should also align. Based on this intuition, we use the
FD to measure the similarity of the classifier's hidden representations between the real
and synthetic waveforms. We collect the 256-dimensional hidden representations from
the penultimate layer of the classifier for both real and synthetic waveforms. Unlike the
FD of Fourier amplitude spectra (Section 4.5.1) where each dimension was treated in-
dependently, the reduced dimensionality of this representation (256 instead of 2033) al-
lows us to compute correlations between dimensions. Thus, we calculate the entire co-
variance matrix Σ of the hidden representations for both the real and generated wave-
forms. The Fréchet Distance D between the two sets of hidden representations is then
given by:
\[ D^{2} = \left\| \mu_{\mathrm{obs}} - \mu_{\mathrm{gen}} \right\|^{2} + \mathrm{Tr}\!\left( \Sigma_{\mathrm{obs}} + \Sigma_{\mathrm{gen}} - 2 \left( \Sigma_{\mathrm{obs}} \Sigma_{\mathrm{gen}} \right)^{1/2} \right). \tag{14} \]
This is a generalization of the Fréchet Distance in Eq. 13 to general multivariate (non-
isotropic) Gaussians. In the image generation literature, this metric is known as the Fréchet
Inception Distance (FID) (Heusel et al., 2017), with "Inception" referring to the clas-
sifier architecture employed.
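A minimal sketch of this computation, following a common implementation pattern rather than the paper's exact code:

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_obs, feats_gen):
    """Fréchet Distance between Gaussian fits of two embedding sets
    (Eq. 14). `feats_*` are (n_samples, d) arrays, e.g. d = 256
    penultimate-layer activations of the classifier.
    """
    mu1, mu2 = feats_obs.mean(axis=0), feats_gen.mean(axis=0)
    s1 = np.cov(feats_obs, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    covmean = linalg.sqrtm(s1 @ s2)      # matrix square root of the product
    if np.iscomplexobj(covmean):         # drop negligible imaginary parts
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.trace(s1 + s2 - 2.0 * covmean))
```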
This FD of classifier embeddings is about 20 times larger between the GWM syn-
thetics and the real data, than it is between test and training data (Table 1). When com-
puted separately for the magnitude-distance bins, we see again how the model performs
worse for larger magnitude records (Figure 11c), and how the presented model performs
significantly better than the ablated models evaluated in Appendix B.
In summary, all three metrics provide an objective, relative measure of synthesis
quality. As such, they can readily be used to compare different proposed GWMs or model
versions, or to evaluate model performance on a particular subset of the data. Im-
portantly, ideal lower bounds for the FD, and ideal upper bounds for the classifier can
be estimated by computing the metrics using just real data (second row in Table 1).
5 Discussion
Generative Waveform Models (GWMs) are rapidly advancing and have the poten-
tial to significantly improve earthquake hazard assessment and earthquake engineering
studies (Florez et al., 2022; Esfahani et al., 2023; Shi et al., 2024; Matsumoto et al., 2024).
Unlike GMMs, which predict scalar ground motion metrics, GWMs can synthesize fully
realistic waveforms, complete with realistic frequency- and time-domain properties.
The ability to predict full waveforms enables studies that rely on waveform con-
tent, such as building response simulations (Bommer & Acevedo, 2004). Any scalar ground
motion metric can be derived from the predicted waveforms. A key advantage of this ap-
proach is that the waveforms and their derivatives are equally realistic across the entire
frequency range (1 - 50 Hz). This may contrast with hybrid methods, which add high-
frequency spectra in a separate second stage, either using stochastic methods (Saikia &
Somerville, 1997; Mai et al., 2010; Graves & Pitarka, 2010) or neural networks (Paolucci
et al., 2018; Gatti & Clouteau, 2020; Okazaki et al., 2021). The former can introduce
artifacts at the transition between the deterministic and stochastic parts of the predic-
tions (Mai & Beroza, 2003; Graves & Pitarka, 2010; Tang & Mai, 2023).
Another advantage of GWMs is their ability to accurately learn the correlation be-
tween scalar statistics of the same waveform (e.g., spectral accelerations at different fre-
quency ranges). This factor has been considered in only a few GMMs to date (Baker &
Bradley, 2017; Baker et al., 2021).
This study introduces a new GWM that builds on the recent successes of Denois-
ing Diffusion Models in image, audio, and video generation (Song & Ermon, 2019; Ho
et al., 2020; Song et al., 2021; Dhariwal & Nichol, 2021; Nichol & Dhariwal, 2021; Kong
et al., 2021; Rombach et al., 2022; Ho et al., 2022). Diffusion models are capable of gen-
erating high-resolution and diverse samples while being simple to implement and train.
Unlike GANs (Florez et al., 2022; Esfahani et al., 2023; Matsumoto et al., 2024) used
for synthesizing waveforms, denoising diffusion models do not suffer from mode collapse
or related training challenges. Additionally, no dedicated neural network architecture
is required, as with neural operators (Shi et al., 2024).
GWMs could eventually eliminate the need to re-scale limited data pools of observed
seismic waveforms to match expected target spectra, or to use hybrid methods for en-
riching simulated waveforms with high-frequency seismic energy. Instead, a set of fully
realistic broadband waveforms can be generated from a single, self-consistent synthesis
process.
One important question in this context will be which among existing and future
GWMs generates the most realistic waveforms for a particular application. To facilitate
quantitative and representative model comparisons, we propose that the emerging GWM
community embrace open model and code standards and participate in existing commu-
nity efforts for the comparison and evaluation of ground motion models (e.g., Maechling
et al. (2015)). Performance metrics like the ones introduced in Section 4.5 could facil-
itate meaningful comparison between models and model versions. Such model compar-
ison efforts could also include blind signal classification exercises, where trained classi-
fier models would attempt to distinguish between real and synthetic waveforms.
5.2 Limitations
While the presented GWM arguably achieves high seismic waveform synthesis per-
formance, there are several limitations that future models can aspire to overcome.
Limited training data As is the case for all models of strong ground motion, the
limited number of short distance recordings of large magnitude earthquakes is a bottle-
neck. This limitation affects the synthesis performance in this crucial data regime. GWMs
can in principle be used to augment such data sets, but it is currently an open question
how well the models extrapolate beyond the parameter ranges for which they have been
trained, and how well they perform at the data-scarce edges of the parameter ranges.
Point source assumption Our model assumes that the earthquake source is a point
source and neglects finite fault source characteristics such as fault geometry and distance,
source roughness, directivity, and unilateral or bilateral rupture modes.
Uncorrelated stations The current model does not explicitly take into account
the correlation of observations across different records of the same quake. Each gener-
ated waveform is an independent realization of the reverse (denoising) process. This may
lead to an underestimation of the correlation of observed ground motions from the same
quake, and might limit the ability of the model to generalize to new stations.
P-wave onset times The current model has been trained with a data set of wave-
forms that have been aligned with a simple STA/LTA onset detector, which can be in-
accurate. As a consequence, the GWM synthetics also have some variability in the P-
wave onset times that is not physically meaningful.
Signal length The 40-second long seismograms are sufficient to describe the ground
motion from quakes with magnitudes of up to ∼ 7.5. For even larger quakes, the source
duration alone may exceed this signal length. Producing longer sequences without com-
promising temporal resolution presents some challenges, even for our efficient model. Ad-
dressing this issue may necessitate an approach with more favorable asymptotic behav-
ior, which is a subject for future research.
Lower spectral amplitude Our model slightly underestimates the spectral ampli-
tude of the ground motion compared to the real data (Figures 3b and 4a). This discrep-
ancy is observed exclusively in the model operating on the spectrogram representation.
We hypothesize that this is due to the model generating a slightly blurred version of the
encoded spectrograms, similar to the effect of a Gaussian filter. While this blurring is
inconsequential for image generation tasks, as it is imperceptible to the human eye, it
may result in a lower spectral amplitude (e.g., averaging 0.04 m/s² Hz⁻¹ at frequencies
below 30 Hz; supplementary Figure S41). Partially, the underestimation may also stem from
the spectrogram autoencoder (Figure B1c). A potential solution could involve incorpo-
rating additional loss terms, such as adversarial loss, to encourage the model to gener-
ate sharper spectrograms. These could for instance be included in the autoencoder stage.
Alternatively, exploring different, potentially smoother spectral representations that are
less sensitive to blurring may also be beneficial.
6 Conclusion
We present a data-driven, conditional generative model for synthesizing three-component
strong motion seismograms. Our generative waveform model (GWM) combines a con-
volutional auto-encoder with a state-of-the-art latent denoising diffusion model, which
generates encoded, rather than raw, spectrogram representations of the seismic signals.
We trained the openly available model on Japanese strong motion data with hypocen-
tral distances of 1–180 km, moment magnitudes ≥ 4.5, and VS30 values of 76–2100 m/s.
Using a variety of commonly used and novel evaluation metrics, we demonstrate that the
GWM synthetics accurately capture the statistical properties of the observed data in both
the time and frequency domains, across a wide range of conditioning parameters, and
up to the highest hazard-relevant frequencies.
With GWMs, hazard models can potentially expand their scope to include appli-
cations that require full waveform representations, rather than just scalar amplitude statis-
tics. Future community efforts to benchmark and compare GWMs would provide guid-
ance for which models to best use in practical and scientific applications, and may ac-
celerate GWM innovation.
pre-layer group normalization and SiLU activation functions. In addition, the downsam-
pling component uses a convolutional layer to reduce the dimensionality of an input be-
tween each pair of residual blocks, while the upsampling component uses upsampling op-
erations to double the dimensionality of an input between each pair of residual blocks.
An overview of the architectures is given in Table A1.
Denoising Diffusion: The neural network for the diffusion model uses four resid-
ual blocks on encoder and decoder components, with an additional residual block in the
middle. After the first three levels, we include a downsampling operation on the encoder
side and an upsampling operation on the decoder side. In addition, the central blocks incorpo-
rate a self-attention module. As per convention, conditioning information is injected within
each residual block by concatenating projections of the conditioning vector c and the time
embedding t to the intermediate representations of an input zt (see Figure A1 and Song
et al. (2021); Karras et al. (2022)).
Figure A1: Using low-dimensional features for conditioning the diffusion model. (a) For
both the scalar value t and the four-dimensional feature vector c, we first compute 256-
dimensional Fourier features and embed both via separate MLP neural networks. The
two embeddings are combined by simply adding them elementwise. (b) For each residual
block, we condition the synthesized spectrogram (inputst ) using the combined time-
feature embedding by adding them to the hidden representation of the residual layer. For
that, we take the 256-dimensional embedding vector, transform it through a linear layer
with K output neurons, and reshape it to match the size of the hidden representation.
Concretely, if the hidden representation has dimensionality N × H × W × K, where N is
the batch size and H × W is the spectrogram size, we repeat the K-dimensional embedding
across the batch and spatial dimensions and add it to the hidden representation elementwise
(in deep-learning libraries like PyTorch, this is handled efficiently via broadcasting).
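A sketch of this conditioning path is given below; layer sizes and the fixed random Fourier projection are assumptions for illustration.

```python
import math
import torch
import torch.nn as nn

class ConditioningEmbedding(nn.Module):
    """Sketch of the conditioning path in Figure A1: random Fourier
    features of a low-dimensional input, followed by a small MLP,
    producing a 256-dimensional embedding. Layer sizes and the fixed
    random projection are assumptions.
    """
    def __init__(self, in_dim, dim=256):
        super().__init__()
        self.register_buffer("freqs", torch.randn(in_dim, dim // 2))
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(),
                                 nn.Linear(dim, dim))

    def forward(self, x):                          # x: (batch, in_dim)
        proj = 2 * math.pi * x @ self.freqs        # Fourier projection
        feats = torch.cat([proj.sin(), proj.cos()], dim=-1)
        return self.mlp(feats)

# The time and conditioning embeddings are combined elementwise, e.g.:
# emb = ConditioningEmbedding(1)(t) + ConditioningEmbedding(4)(c)
```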
Model                                   | Kernel size | Hidden channels     | Attention levels | Dropout | KL weight | Optimizer | Learning rate | EMA decay | Batch size | Epochs
Moving average diffusion                | 5           | [64, 128, 256, 256] | [4]              | 0.1     | -         | Adam      | 10⁻⁴          | 0.999     | 320        | 300
Moving average latent: autoencoder      | 5           | [64, 128, 256]      | -                | 0.1     | 10⁻⁶      | Adam      | 10⁻⁴          | 0.999     | 64         | 200
Moving average latent: diffusion model  | 5           | [64, 128, 256, 256] | [4]              | 0.1     | -         | Adam      | 10⁻⁴          | 0.999     | 1536       | 300
Spectrogram diffusion                   | 3×3         | [64, 128, 256, 256] | [4]              | 0.1     | -         | Adam      | 10⁻⁴          | 0.999     | 320        | 300
Spectrogram latent: autoencoder         | 3×3         | [64, 128, 256]      | -                | 0.1     | 10⁻⁶      | Adam      | 10⁻⁴          | 0.999     | 64         | 200
Spectrogram latent: diffusion model     | 3×3         | [64, 128, 256, 256] | [4]              | 0.1     | -         | Adam      | 10⁻⁴          | 0.999     | 2048       | 300
Classifier                              | 3×3         | [64, 128, 256, 256] | [4]              | 0.1     | -         | Adam      | 10⁻⁴          | 0.999     | 128        | 100
Table A1: Hyperparameters for the various models used in the experiments.
A2 Representations
We experiment with two different representations of the seismic data: spectrogram
and moving average envelope.
Spectrogram Representation: To transform each of the three channels in the
original waveform into a spectrogram, we utilize a Short-Time Fourier Transform (STFT)
with 256 frequency bins and a hop length of 32 samples. Due to the symmetry of the
spectrogram, only half of the frequency bins are used. To prevent padding issues, the
original waveform is truncated to 4064 samples, resulting in a complex-valued matrix of
size 128×128. We then take the magnitude of this matrix and apply a logarithmic trans-
formation to obtain the spectrogram, discarding the phase information due to its high-
frequency nature, which is challenging to model accurately. To reconstruct the original
waveform, we employ the Griffin-Lim algorithm (Griffin & Lim, 1984; Perraudin et al.,
2013), which reliably estimates the phase from the magnitude spectrogram.
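A minimal sketch of this transform using librosa follows; the framing conventions here are assumptions and may differ from the paper's code.

```python
import numpy as np
import librosa

# Parameters following the text: 256-point STFT with a hop length of 32
# samples, applied to a 4064-sample waveform (~128 time frames).
x = np.random.randn(4064).astype(np.float32)   # placeholder waveform

S = librosa.stft(x, n_fft=256, hop_length=32)
mag = np.abs(S)                                # discard the phase
log_spec = np.log(mag + 1e-8)                  # logarithmic transformation

# Griffin-Lim estimates a phase consistent with the magnitude spectrogram
x_rec = librosa.griffinlim(mag, n_fft=256, hop_length=32, n_iter=32)
```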
Figure A2: Example waveform (amplitude vs. time) and its magnitude and phase spectrograms (frequency bins vs. time bins).
of length 128. The final representation is the concatenation of the original waveform di-
vided by the envelope and the logarithm of the envelope.
Figure A3: Binning of the data into classes for the classifier, based on magnitude and
distance.
lights the effectiveness of the spectrogram representation for the task of ground motion
synthesis.
Finally, we assess the performance of the diffusion model trained on different rep-
resentations and the impact of incorporating the autoencoder stage. Table B1 summa-
rizes our findings, showing that the spectrogram representation significantly outperforms
the envelope representation across all metrics. Additionally, the autoencoder stage im-
proves the spectral fit. Overall, the latent diffusion approach with the spectrogram rep-
resentation is the most effective configuration.
Figure B1: Fourier spectra log-amplitude comparison between the input and output of
an autoencoder trained on different representations. The East-West component of the
three-channel signal is used for visualization.
Table B1: Ablation study comparing the performance of the diffusion model when trained
on the moving average envelope and the spectrogram representation. Results are shown
for both direct training on representations and training on the latent space of an autoen-
coder. The Fréchet Distance (FD) for the log-amplitude Fourier spectra and classifier
embeddings is reported between the full data distribution and the generated samples.
Classifier accuracy is reported for the generated samples.
The code for our generative waveform model is available on GitHub at https://
github.com/highfem/tqdne/tags (Bergmeister et al., 2024). All online pages were last
accessed on October 16, 2024. The supplementary material provides additional infor-
mation and figures that complement the main text and offer further validation of the
presented results.
Acknowledgments
This work was supported by grant number C22-10 (HighFEM) of the Swiss Data Sci-
ence Center (SDSC), Ecole Polytechnique Fédérale de Lausanne and ETH Zürich awarded
to M-A. Meier, L. Ermert and M. Koroni. L. Ermert is supported by Swiss National Sci-
ence Foundation grant 209941. M. Koroni is supported by the Swiss Federal Nuclear Safety
Inspectorate (ENSI) under contract number CTR00830. We thank Donat Fäh and Paolo
Bergamo for useful discussions on ground motion models and earthquake engineering.
We thank CSCS Swiss National Computing Center (Piz Daint under projects sd28 and
s1165) and Swiss Seismological Service “Bigstar” Cluster for providing computational
resources for this research.
References
Applied Technology Council. (2009). Quantification of building seismic performance
factors. US Department of Homeland Security, FEMA.
Aquib, A. T., & Mai, P. M. (2024). Broadband ground-motion simulations
with machine-learning-based high-frequency waves from Fourier neural
operators. Bulletin of the Seismological Society of America.
Arias, A. (1970). A measure of earthquake intensity. Seismic design for nuclear
plants, 438–483.
Baker, J., & Cornell, C. A. (2006). Spectral shape, epsilon and record selection.
Earthquake Engineering & Structural Dynamics, 35 (9), 1077–1095.
Baker, J., & Bradley, B. (2017). Intensity measure correlations observed in the
NGA-West2 database, and dependence of correlations on rupture and site param-
eters. Earthquake Spectra, 33 (1), 145–156.
Baker, J., Bradley, B., & Stafford, P. (2021). Seismic hazard and risk analysis. Cam-
bridge University Press.
Bayless, J., & Abrahamson, N. A. (2019). An empirical model for the interfrequency
correlation of epsilon for Fourier amplitude spectra. Bulletin of the Seismologi-
cal Society of America, 109 (3), 1058–1070.
Bergamo, P., Hammer, C., & Fäh, D. (2019). SERA WP7/NA5 - Deliverable 7.4:
Towards improvement of site condition indicators (Report). Zurich: ETH
Zurich. doi: 10.3929/ethz-b-000467564
Bergmeister, A., Palgunadi, K. H., Bosisio, A., Ermert, L., Koroni, M., Perraudin,
N., . . . Meier, M.-A. (2024). Software package ”tqdne” for paper titled ”High
Resolution Seismic Waveform Generation using Denoising Diffusion”. Zenodo.
doi: 10.5281/zenodo.13952381
Bommer, J., & Acevedo, A. (2004). The use of real earthquake accelerograms as
input to dynamic analysis. Journal of Earthquake Engineering, 8 (spec01), 43–
91.
Boore, D. M. (2003). Simulation of ground motion using the stochastic method.
Pure and Applied Geophysics, 160 , 635–676.
Boore, D. M., & Joyner, W. B. (1997). Site amplifications for generic rock sites. Bul-
letin of the Seismological Society of America, 87 (2), 327–341.
Boore, D. M., Stewart, J. P., Seyhan, E., & Atkinson, G. M. (2014). NGA-West2
equations for predicting PGA, PGV, and 5% damped PSA for shallow crustal
earthquakes. Earthquake Spectra, 30 (3), 1057–1085.
Boore, D. M., Watson-Lamprey, J., & Abrahamson, N. A. (2006). Orientation-
independent measures of ground motion. Bulletin of the Seismological Society
of America, 96 (4A), 1502–1511.
Chopra, A. K. (2007). Dynamics of structures. Pearson Education India.
Défossez, A., Copet, J., Synnaeve, G., & Adi, Y. (2023). High fidelity neural audio
compression. Transactions on Machine Learning Research.
Derras, B., Bard, P.-Y., Cotton, F., & Bekkouche, A. (2012). Adapting the neural
network approach to PGA prediction: An example based on the KiK-net data.
Bulletin of the Seismological Society of America, 102 (4), 1446–1461.
Dhariwal, P., & Nichol, A. (2021). Diffusion models beat GANs on image synthesis.
Advances in Neural Information Processing Systems, 34 , 8780–8794.
Douglas, J. (2003). Earthquake ground motion estimation using strong-motion
records: a review of equations for the estimation of peak ground acceleration
and response spectral ordinates. Earth-Science Reviews, 61 (1-2), 43–104.
Douglas, J., & Aochi, H. (2008). A survey of techniques for predicting earthquake
ground motions for engineering purposes. Surveys in geophysics, 29 , 187–220.
Esfahani, R. D., Cotton, F., Ohrnberger, M., & Scherbaum, F. (2023). TFCGAN:
Nonstationary ground-motion simulation in the time-frequency domain using a
conditional generative adversarial network (CGAN) and phase retrieval methods.
Bulletin of the Seismological Society of America, 113 (1), 453–467.
Esfahani, R. D., Vogel, K., Cotton, F., Ohrnberger, M., Scherbaum, F., &
Kriegerowski, M. (2021). Exploring the dimensionality of ground-motion
data by applying autoencoder techniques. Bulletin of the Seismological Society
of America, 111 (3), 1563–1576.
Florez, M. A., Caporale, M., Buabthong, P., Ross, Z. E., Asimaki, D., & Meier, M.-
A. (2022). Data-driven synthesis of broadband earthquake ground motions
using artificial intelligence. Bulletin of the Seismological Society of America,
112 (4), 1979–1996.
Gatti, F., & Clouteau, D. (2020). Towards blending physics-based numerical sim-
ulations and seismic databases using generative adversarial network. Computer
Methods in Applied Mechanics and Engineering, 372 , 113421.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
. . . Bengio, Y. (2014). Generative adversarial nets. In Advances in neural
information processing systems.
Graves, R., & Pitarka, A. (2010). Broadband ground-motion simulation using a
hybrid approach. Bull Seismol Soc Am, 100 (5A), 2095–2123. doi: 10.1785/
0120100057
Griffin, D., & Lim, J. (1984). Signal estimation from modified short-time Fourier
transform. IEEE Transactions on Acoustics, Speech, and Signal Processing,
32 (2), 236–243.
Hartzell, S., Harmsen, S., Frankel, A., & Larsen, S. (1999). Calculation of broad-
band time histories of ground motion: Comparison of methods and validation
using strong-ground motion from the 1994 northridge earthquake. Bulletin of
the Seismological Society of America, 89 (6), 1484–1504.
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., & Hochreiter, S. (2017).
GANs trained by a two time-scale update rule converge to a local Nash equi-
librium. In Advances in neural information processing systems.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. In
Advances in neural information processing systems.
Ho, J., Salimans, T., Gritsenko, A., Chan, W., Norouzi, M., & Fleet, D. J. (2022).
Video diffusion models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave,
K. Cho, & A. Oh (Eds.), Advances in neural information processing systems
(Vol. 35, pp. 8633–8646). Curran Associates, Inc.
Jayalakshmi, S., Dhanya, J., Raghukanth, S., & Mai, P. M. (2021). Hybrid broad-
band ground motion simulations in the Indo-Gangetic basin for great Himalayan
earthquake scenarios. Bulletin of Earthquake Engineering, 19 , 3319–3348.
Jozinović, D., Lomax, A., Štajduhar, I., & Michelini, A. (2022). Transfer learning:
Improving neural network based prediction of earthquake ground shaking for
an area with insufficient training data. Geophysical Journal International ,
229 (1), 704–718.
Kanno, T., Narita, A., Morikawa, N., Fujiwara, H., & Fukushima, Y. (2006). A new
attenuation relation for strong ground motion in Japan based on recorded data.
Bulletin of the Seismological Society of America, 96 (3), 879–897.
Karras, T., Aittala, M., Aila, T., & Laine, S. (2022). Elucidating the design space of
diffusion-based generative models. In Advances in neural information process-
ing systems.
Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for
generative adversarial networks. In Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition (pp. 4401–4410).
Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020).
Analyzing and improving the image quality of StyleGAN. In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8110–
8119).
Katsanos, E., Sextos, A., & Manolis, G. (2010). Selection of earthquake ground
motion records: A state-of-the-art review from a structural engineering per-
spective. Soil dynamics and earthquake engineering, 30 (4), 157–169.
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In Interna-
tional conference on learning representations.
Kong, Z., Ping, W., Huang, J., Zhao, K., & Catanzaro, B. (2021). DiffWave: A ver-
satile diffusion model for audio synthesis. In International conference on learn-
ing representations.
Li, Y., Ku, B., Zhang, S., Ahn, J.-K., & Ko, H. (2020). Seismic data augmentation
based on conditional generative adversarial networks. Sensors, 20 (23), 6850.
Li, Z., Meier, M.-A., Hauksson, E., Zhan, Z., & Andrews, J. (2018). Machine learn-
ing seismic wave discrimination: Application to earthquake early warning.
Geophysical Research Letters, 45 (10), 4773–4779.
Lilienkamp, H., von Specht, S., Weatherill, G., Caire, G., & Cotton, F. (2022).
Ground-motion modeling as an image processing task: Introducing a neural
network based, fully data-driven, and nonergodic approach. Bulletin of the
Seismological Society of America, 112 (3), 1565–1582.
Luco, N., & Bazzurro, P. (2007). Does amplitude scaling of ground motion records
result in biased nonlinear structural drift responses? Earthquake Engineering
& Structural Dynamics, 36 (13), 1813–1835.
Maechling, P. J., Silva, F., Callaghan, S., & Jordan, T. H. (2015). Scec broadband
platform: System architecture and software implementation. Seismological Re-
search Letters, 86 (1), 27–38.
Mai, P. M., & Beroza, G. (2002). A spatial random field model to characterize
complexity in earthquake slip. Journal of Geophysical Research: Solid Earth,
107 (B11), ESE–10.
Mai, P. M., & Beroza, G. (2003). A hybrid method for calculating near-source,
broadband seismograms: Application to strong motion prediction. Physics of the
Earth and Planetary Interiors, 137 (1-4), 183–199.
Saikia, C. K., & Somerville, P. (1997). Simulated hard-rock motions in Saint Louis,
Missouri, from large New Madrid earthquakes (Mw ≥ 6.5). Bulletin of the Seis-
mological Society of America, 87 (1), 123–139.
Savran, W., & Olsen, K. (2019). Ground motion simulation and validation of the
2008 Chino Hills earthquake in scattering media. Geophysical Journal Interna-
tional , 219 (3), 1836–1850.
Shi, Y., Lavrentiadis, G., Asimaki, D., Ross, Z. E., & Azizzadenesheli, K. (2024).
Broadband ground-motion synthesis via generative adversarial neural oper-
ators: Development and validation. Bulletin of the Seismological Society of
America, 114(4), 2151–2171.
Smerzini, C., Amendola, C., Paolucci, R., & Bazrafshan, A. (2024). Engineering
validation of BB-SPEEDset, a data set of near-source physics-based simulated
accelerograms. Earthquake Spectra, 40(1), 420–445. doi: 10.1177/87552930231206766
Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the
data distribution. In Advances in Neural Information Processing Systems.
Song, Y., Sohl-Dickstein, J., Kingma, D. P., Kumar, A., Ermon, S., & Poole, B.
(2021). Score-based generative modeling through stochastic differential equa-
tions. In International Conference on Learning Representations.
Tang, Y., & Mai, P. M. (2023). Stochastic ground-motion simulation of the 2021
Mw 5.9 Woods Point earthquake: Facilitating local probabilistic seismic hazard
analysis in Australia. Bulletin of the Seismological Society of America, 113(5),
2119–2143.
thispersondoesnotexist.com. (2023). This person does not exist.
https://thispersondoesnotexist.com (Accessed: 2024-10-16)
Touhami, S., Gatti, F., Lopez-Caballero, F., Cottereau, R., de Abreu Corrêa, L.,
Aubry, L., & Clouteau, D. (2022). SEM3D: A 3D high-fidelity numerical
earthquake simulator for broadband (0–10 Hz) seismic response prediction at a
regional scale. Geosciences, 12(3), 112.
van Ede, M. C., Molinari, I., Imperatori, W., Kissling, E., Baron, J., & Morelli, A.
(2020). Hybrid broadband seismograms for seismic shaking scenarios: An ap-
plication to the Po Plain sedimentary basin (northern Italy). Pure and Applied
Geophysics, 177(5), 2181–2198.
Vincent, P. (2011). A connection between score matching and denoising autoen-
coders. Neural Computation, 23 (7), 1661–1674.
Wang, T., Trugman, D., & Lin, Y. (2021). SeismoGen: Seismic waveform synthesis
using GAN with application to seismic data augmentation. Journal of Geophysi-
cal Research: Solid Earth, 126(4), e2020JB020077.
Woollam, J., Münchmeyer, J., Tilmann, F., Rietbrock, A., Lange, D., Bornstein, T.,
. . . others (2022). SeisBench—A toolbox for machine learning in seismology.
Seismological Research Letters, 93(3), 1695–1709.
1. Figures S1 to S42
Introduction
This supplementary material provides additional figures that complement the main
text and further validate the presented results.
The following figures show evaluation metrics for the generative waveform model
(GWM) and the real data in different bins of magnitude, hypocentral distance,
faulting type, and VS30.
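For orientation, the sketch below shows how records can be assigned to the magnitude–distance cells used throughout this supplement. It is a minimal illustration in Python: the helper and variable names are hypothetical, and the bin edges are read off Figure S1 and the magnitude legends of the following figures.

```python
import numpy as np

# Illustrative bin edges taken from Figure S1 and the magnitude legends
# of Figures S2-S7 (assumed, not the library's actual constants).
MAG_EDGES = np.array([4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5])
DIST_EDGES = np.array([0.0, 40.0, 60.0, 80.0, 120.0, 150.0, 200.0])

def count_per_bin(mags, dists):
    """Count records falling in each magnitude-distance cell."""
    mi = np.digitize(mags, MAG_EDGES) - 1    # 0-based magnitude bin index
    di = np.digitize(dists, DIST_EDGES) - 1  # 0-based distance bin index
    ok = (mi >= 0) & (mi < len(MAG_EDGES) - 1) & \
         (di >= 0) & (di < len(DIST_EDGES) - 1)
    counts = np.zeros((len(MAG_EDGES) - 1, len(DIST_EDGES) - 1), dtype=int)
    np.add.at(counts, (mi[ok], di[ok]), 1)   # accumulate counts per cell
    return counts

# Hypothetical usage with a metadata catalog:
# counts = count_per_bin(catalog["magnitude"], catalog["distance_km"])
```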
Figure S1: Number of samples in each magnitude–distance bin for all of the following bin
plots. Predicted and target denote the generative waveform model (GWM) and the real
data, respectively.
[Figures S2–S4: Predicted (GWM) versus target (real data) log-amplitude envelopes
[m/s²] over time (0–40 s), presumably one figure per waveform component. Each figure
contains one panel pair per hypocentral distance bin (0–40, 40–60, 60–80, 80–120,
120–150, and 150–200 km), with curves colored by magnitude bin (4.5–5, 5–5.5, 5.5–6,
6–6.5, 6.5–7, 7–7.5).]
[Figures S5–S7: Predicted (GWM) versus target (real data) log-amplitude Fourier
spectra [m/s² Hz⁻¹] over frequency (0–50 Hz), presumably one figure per waveform
component, with the same distance-bin panels and magnitude-bin coloring as
Figures S2–S4.]
Figure S8: Shaking duration of all GWM synthetics (red triangles; one realization per
record) and of the real data (grey circles) with corresponding conditioning parameters.
For each magnitude bin (width 0.08) from M 4.5–9.0, grey dots and lines show the mean
and standard deviation of the real data, while blue triangles and lines show the mean
and standard deviation of the GWM synthetics.
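The exact duration definition used for Figure S8 is specified in the main text; as a hedged illustration, the sketch below computes the commonly used 5–95% significant duration from the normalized cumulative Arias intensity.

```python
import numpy as np

def significant_duration(acc, dt, lo=0.05, hi=0.95):
    """5-95% significant duration [s] from normalized cumulative Arias intensity.

    acc : 1-D acceleration time series [m/s^2]; dt : sampling interval [s].
    """
    energy = np.cumsum(np.asarray(acc) ** 2) * dt  # pi/(2g) Arias factor cancels
    energy /= energy[-1]                           # normalize to [0, 1]
    i_lo = np.searchsorted(energy, lo)             # first sample reaching 5% energy
    i_hi = np.searchsorted(energy, hi)             # first sample reaching 95% energy
    return (i_hi - i_lo) * dt
```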
Figure S9: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% versus
hypocentral distance for periods (T) of 0.1 s and 1.0 s. Median prediction of the GWM
(black line) and standard deviation (yellow shaded area), along with the median predictions
(solid lines) and standard deviations (dashed lines) of the Boore et al. (2014) GMM (vio-
let) and the Kanno et al. (2006) GMM (red), evaluated at M 5 and VS30 = 150 m/s. The
data are sampled from narrow magnitude and VS30 bins, as written in the figure titles, and
shown by their medians (green squares) and standard deviations (green lines).
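As background for Figures S9–S18, RotD50 is the median, over all non-redundant horizontal rotation angles, of the pseudo-spectral acceleration of the rotated motion. A minimal sketch follows; integrating the oscillator with scipy.signal.lsim is a convenience choice here, not necessarily the implementation behind these figures.

```python
import numpy as np
from scipy import signal

def psa(acc, dt, period, damping=0.05):
    """Pseudo-spectral acceleration of a damped SDOF oscillator.

    Solves x'' + 2*zeta*w*x' + w^2*x = -acc(t) and returns w^2 * max|x|.
    """
    w = 2.0 * np.pi / period
    sdof = signal.TransferFunction([-1.0], [1.0, 2.0 * damping * w, w ** 2])
    t = np.arange(len(acc)) * dt
    _, x, _ = signal.lsim(sdof, U=acc, T=t)
    return w ** 2 * np.max(np.abs(x))

def rotd50(acc1, acc2, dt, period, damping=0.05, dtheta_deg=1.0):
    """Median PSA over non-redundant rotation angles (0-180 deg)."""
    angles = np.deg2rad(np.arange(0.0, 180.0, dtheta_deg))
    sas = [psa(acc1 * np.cos(a) + acc2 * np.sin(a), dt, period, damping)
           for a in angles]
    return np.median(sas)
```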
Figure S10: Same as Figure S9 but for the magnitude bin M5 and VS30 = 600 m/s.
Figure S11: Same as Figure S9 but for the magnitude bin M5.5 and VS30 = 150 m/s.
Figure S12: Same as Figure S9 but for the magnitude bin M5.5 and VS30 = 600 m/s.
Figure S13: Same as Figure S9 but for the magnitude bin M6.0 and VS30 = 150 m/s.
Figure S14: Same as Figure S9 but for the magnitude bin M6.0 and VS30 = 600 m/s.
Figure S15: Same as Figure S9 but for the magnitude bin M6.5 and VS30 = 150 m/s.
Figure S16: Same as Figure S9 but for the magnitude bin M6.5 and VS30 = 600 m/s.
Figure S17: Same as Figure S9 but for the magnitude bin M7.0 and VS30 = 150 m/s.
Figure S18: Same as Figure S9 but for the magnitude bin M7.0 and VS30 = 600 m/s.
Figure S19: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% ver-
sus magnitude. Median of the GWM prediction (black lines) and its standard deviation
(yellow shaded areas), using R = 15 km. Panels a) and c) show RotD50 pseudo-spectral
acceleration for VS30 = 150 m/s at periods of 0.1 s and 1.0 s, respectively. Panels b) and
d) show RotD50 pseudo-spectral acceleration for VS30 = 600 m/s at periods of 0.1 s and
1.0 s, respectively. The data (grey dots) are sampled from narrow magnitude, R, and VS30
bins, as written in the figure titles.
Figure S25: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% ver-
sus VS30. Median of the GWM prediction (black lines) and its standard deviation
(yellow shaded areas), using R = 15 km. Panels a) and d) show RotD50 pseudo-spectral
acceleration for M 5 at periods of 0.1 s and 1.0 s, respectively. Panels b) and e)
show RotD50 pseudo-spectral acceleration for M 5.5 at periods of 0.1 s and 1.0 s,
respectively. Panels c) and f) show RotD50 pseudo-spectral acceleration for M 6.0
at periods of 0.1 s and 1.0 s, respectively. The data (grey dots) are sampled from narrow
magnitude and R bins, as written in the figure titles.
Figure S26: Same as Figure S25 but with a distance bin of 40 km.
Figure S27: Same as Figure S25 but with a distance bin of 60 km.
Figure S28: Same as Figure S25 but with a distance bin of 80 km.
Figure S29: Same as Figure S25 but with a distance bin of 100 km.
Figure S30: Same as Figure S25 but with a distance bin of 130 km.
Figure S31: Average model probabilities given the SA data for the ground motion model
(GMM) of Boore et al. (2014) and the generative waveform model (GWM), and the ratio
between the two distributions given the data, as a function of magnitude and recording
distance for VS30 = 240 m/s. Panels a), b), and c) show the model likelihoods and their
ratios at T = 0.1 s, T = 0.3 s, and T = 1.0 s, respectively.
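One plausible reading of these model probabilities (the precise definition is given in the main text) is the average likelihood of the observed ln SA under each model's lognormal predictive distribution; all names in the sketch below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def average_probability(ln_sa_obs, ln_sa_pred_median, sigma_ln):
    """Average likelihood of observed ln(SA) under a lognormal model prediction."""
    return float(np.mean(norm.pdf(ln_sa_obs, loc=ln_sa_pred_median, scale=sigma_ln)))

# Hypothetical usage for one magnitude-distance cell and one period:
# p_gmm = average_probability(ln_sa, mu_gmm, sigma_gmm)  # e.g., Boore et al. (2014)
# p_gwm = average_probability(ln_sa, mu_gwm, sigma_gwm)  # from many GWM realizations
# ratio = p_gwm / p_gmm  # > 1 means the data favor the GWM in this cell
```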
Figure S32: Same as Figure S31 but for the GMM of Kanno et al. (2006) for VS30 =
240 m/s. Panels a), b), and c) show the model likelihoods and their ratios at T = 0.1 s,
T = 0.3 s, and T = 1.0 s, respectively.
Figure S33: Same as Figure S31 but for VS30 = 520 m/s.
Figure S34: Same as Figure S32 but for VS30 = 520 m/s.
Figure S35: Same as Figure S31 but for VS30 = 800 m/s.
Figure S36: Same as Figure S32 but for VS30 = 800 m/s.
Figure S37: Same as Figure S31 but for VS30 = 1080 m/s.
Figure S38: Same as Figure S32 but for VS30 = 1080 m/s.
Figure S39: Same as Figure S31 but for VS30 = 1360 m/s.
Figure S40: Same as Figure S32 but for VS30 = 1360 m/s.
Figure S41: Residuals of the mean spectral amplitude between the real data and the
GWM for all records.
Figure S42: Log-amplitude Fourier spectra Fréchet distance heatmaps for all three com-
ponents in different magnitude bins (edges at M 4.5, 4.75, 5, 5.25, 6, 9.1) and distance
bins (75, 100, 125, 150, 200 km).
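For reference, the Fréchet distance underlying Figure S42 is the same Gaussian-fit distance used by the FID metric in image generation studies. A minimal sketch, assuming the features are per-record log-amplitude Fourier spectra within one magnitude–distance bin:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(x, y):
    """Frechet distance between Gaussian fits of two feature sets.

    x, y : (n_samples, n_features) arrays, e.g., per-record log-amplitude
    Fourier spectra of real and GWM waveforms within one bin.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    covmean = sqrtm(cov_x @ cov_y)   # matrix square root of the covariance product
    if np.iscomplexobj(covmean):     # drop numerical imaginary residue
        covmean = covmean.real
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x + cov_y - 2.0 * covmean))
```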