manuscript submitted to JGR: Machine Learning and Computation

High Resolution Seismic Waveform Generation using Denoising Diffusion

Andreas Bergmeister1,†,∗, Kadek Hendrawan Palgunadi2,∗, Andrea Bosisio3,
Laura Ermert2, Maria Koroni2, Nathanaël Perraudin1,†, Simon Dirmeier1,
Men-Andrin Meier4

1 Swiss Data Science Center (SDSC), ETH Zürich, Switzerland
2 Swiss Seismological Service (SED), ETH Zürich, Switzerland
3 Politecnico di Milano, Italy
4 Earth and Planetary Science Department, ETH Zürich, Switzerland
† Work conducted while being employed at SDSC
∗ equal contributions

arXiv:2410.19343v1 [physics.geo-ph] 25 Oct 2024

This manuscript is an arXiv preprint that has been submitted to a peer-reviewed
journal and has not yet undergone peer review.

Key Points:
• A novel generative latent denoising diffusion model generates realistic synthetic
seismic waveforms with frequency content up to 50 Hz.
• The model predicts peak amplitudes at least as accurately as local ground mo-
tion models, and with the same variability as in real data.
• We introduce an open-source Python library for using the pre-trained model and
for training new generative models.

Corresponding author: Men-Andrin Meier, [email protected]

Abstract
Accurate prediction and synthesis of seismic waveforms are crucial for seismic hazard as-
sessment and earthquake-resistant infrastructure design. Existing prediction methods,
such as Ground Motion Models and physics-based simulations, often fail to capture the
full complexity of seismic wavefields, particularly at higher frequencies. This study in-
troduces a novel, efficient, and scalable generative model for high-frequency seismic wave-
form generation. Our approach leverages a spectrogram representation of seismic wave-
form data, which is reduced to a lower-dimensional submanifold via an autoencoder. A
state-of-the-art diffusion model is trained to generate this latent representation, condi-
tioned on key input parameters: earthquake magnitude, recording distance, site condi-
tions, and faulting type. The model generates waveforms with frequency content up to
50 Hz. Any scalar ground motion statistic, such as peak ground motion amplitudes and
spectral accelerations, can be readily derived from the synthesized waveforms. We val-
idate our model using commonly used seismological metrics, and performance metrics
from image generation studies. Our results demonstrate that our openly available model
can generate distributions of realistic high-frequency seismic waveforms across a wide
range of input parameters, even in data-sparse regions. For the scalar ground motion statis-
tics commonly used in seismic hazard and earthquake engineering studies, we show that
the model accurately reproduces both the median trends of the real data and its vari-
ability. To evaluate and compare the growing number of this and similar ’Generative Wave-
form Models’ (GWM), we argue that they should generally be openly available and that
they should be included in community efforts for ground motion model evaluations.

Plain Language Summary


Predicting how the ground shakes during an earthquake is crucial for understand-
ing earthquake hazards and for designing earthquake-resistant buildings. In this study,
we use a recently developed artificial intelligence (AI) method to generate realistic, syn-
thetic earthquake seismograms. After transforming the training seismograms from a time-
domain into a time-frequency representation, we use a special type of AI model called
a diffusion model—originally successful in generating images—to create synthetic seis-
mograms. Our model takes four input parameters (earthquake magnitude, recording dis-
tance, site condition, and faulting type) and can produce any number of realistic syn-
thetic seismograms for these parameter choices, with high-frequency details up to 50 Hz.
Our study shows that the open-source model we present can create realistic seismograms
in a wide variety of settings, even in areas with limited training data. For a suite of per-
formance metrics commonly used in assessing earthquake risk and for designing safe build-
ings, the model closely matches the average trends and variations shown in real earth-
quake records. To help evaluate and compare the increasing number of such Generative
Waveform Models (GWMs), we argue that such models should generally be made pub-
licly available and included in community efforts to assess ground motion prediction mod-
els.

1 Introduction
The study and prediction of earthquake ground motions are central to seismology.
Wavefield models across scales and frequencies are required to assess seismic hazard and
the response of critical infrastructure to ground motion. State-of-the-art seismic hazard
models use empirical ground motion models (GMMs) to estimate the expected level of
ground shaking (i.e., intensity measures) at a site given an earthquake and site proper-
ties information. Other applications require the prediction of full-time histories of ground
motion at sites of interest. An important example is nonlinear structural dynamic anal-
ysis and performance-based earthquake engineering (Chopra, 2007; Applied Technology
Council, 2009; Smerzini et al., 2024).

Predicting ground motion is challenging, and existing methods for ground motion
synthesis have specific limitations. GMMs are empirical regression models that best fit
the observed data as functions of first- and second-order predictor variables, such as mag-
nitudes, recording distances, faulting mechanisms, and site conditions (Douglas, 2003;
Boore et al., 2014) and are often developed for specific regions or tectonic settings. They
reduce the full wavefield to scalar properties like peak amplitudes or spectral accelera-
tions, and are data-driven rather than physics-based, although the design of GMMs can
be guided by physical considerations, e.g. via the functional form of distance-attenuation
terms (Baker et al., 2021). While the different predicted variables in seismic waveforms
are inherently correlated and physically linked, traditional GMMs have often considered
them independently during the optimization of regression model parameters. Some more
recent models account for these correlations by employing multivariate statistical tech-
niques (Baker & Bradley, 2017; Baker et al., 2021).

State-of-the-art methods for nonlinear structural dynamic analysis and performance-based
earthquake engineering require full waveforms rather than just static scalar ground
motion features (Bommer & Acevedo, 2004; Luco & Bazzurro, 2007). Owing to the scarcity
of available short-distance records of large-magnitude events, a common practice is to
scale the amplitude spectra of existing strong-motion databases representing event mag-
nitude, source-to-site distance, and site conditions until they meet an expected target
spectrum (Baker & Allin Cornell, 2006; Katsanos et al., 2010; Aquib & Mai, 2024). These
records are often sampled from relatively limited amounts of data. Notably, it is not en-
tirely clear whether the scaling process leads to realistic waveforms and whether the true
ground motion variability is accurately represented in such scaled datasets.

Wavefields can also be modeled deterministically using physics-based numerical simulations
of the wave equation. However, the accuracy and spatiotemporal variability of
the resulting ground motions are limited by the accuracy and resolution with which seis-
mic wave speeds and other relevant parameters are known. Together with high compu-
tational cost, this puts modeling the full range of hazard-relevant frequencies up to 10
Hz currently out of reach, except for applications of exceptionally well-studied regions
on high-performance computing systems (e.g., Rodgers et al. (2020); Paolucci, Mazzieri,
et al. (2021); Touhami et al. (2022); Palgunadi et al. (2024)). As a consequence, impor-
tant hazard-relevant wave propagation phenomena such as scattering, site amplification,
and sedimentary basin edge effects cannot be fully accounted for, peak accelerations may
be outside of the resolved frequency range, and dominant modes of structures, includ-
ing the high-frequency fundamental modes of low-rise buildings, may not be captured.
Computational cost and lack of detailed subsurface models also make it challenging to
meaningfully account for the uncertainties in path and site effects in physics-based sim-
ulations (e.g. Mai and Beroza (2002); Savran and Olsen (2019)).
Alternatively, broadband waveforms can be modeled stochastically (Boore, 2003).
The stochastic ground-motion simulation approach can generate ground motions for
engineering purposes across a range of earthquakes rather than focusing solely on a
single scenario. However, this method has several limitations: (i) it is based on
simplified statistical representations of source, path, and site effects, incorporating
some physical concepts but lacking detailed physics of wave propagation and source
mechanisms, which may result in less accurate synthesis of ground motions, especially
for large-magnitude earthquakes (Boore, 2003); (ii) it depends heavily on empirical
parameters derived from historical earthquake data (Boore & Joyner, 1997); (iii) it
assumes the phase of the seismic waves to be random (Boore, 2003; Graves & Pitarka,
2010); and (iv) it may underestimate the correlation between amplitudes at different
frequencies (Bayless & Abrahamson, 2019).
Hybrid methods combine physics-based deterministic simulation at low frequen-
cies (< 1 Hz) with stochastic simulations at high frequencies (≥ 1 Hz) (e.g., Mai et al.
(2010); Graves and Pitarka (2010); Olsen and Takedatsu (2015); van Ede et al. (2020);
Jayalakshmi et al. (2021)). Classic hybrid methods can suffer from parameterization is-
sues when extrapolating between models where the two signals are easily matched but
the phase spectrum is not (Mai & Beroza, 2003; Graves & Pitarka, 2010). Hybrid mod-
els provide the opportunity to generate earthquake scenarios for a limited number of rup-
tures in relatively well-studied regions, typically targeting large-magnitude events and
near-fault distances (e.g. Paolucci, Smerzini, and Vanini (2021)). However, they face sim-
ilar challenges to those inherent to deterministic simulation, namely high computational
cost and uncertainties in source characterization and seismic velocity models (Hartzell
et al., 1999; Douglas & Aochi, 2008).
Some of the current limitations can potentially be overcome with machine learn-
ing techniques. In earthquake engineering, machine learning models have been used for
the prediction of peak ground acceleration and response or Fourier spectra (e.g. Derras
et al. (2012); Esfahani et al. (2021); Jozinović et al. (2022); Lilienkamp et al. (2022)).
Alternatively, neural networks can be used to enhance simulated waveforms. For instance,
Paolucci et al. (2018) generated band-limited waveforms with physics-based earthquake
simulations and then trained an artificial neural network to predict and add higher fre-
quency content via their amplitude spectra. Similarly, Gatti and Clouteau (2020) used
generative adversarial networks (GAN) to extract high-frequency features from ground-
motion records and use them to enhance low-frequency physics-based simulated time se-
ries, while Aquib and Mai (2024) used a combination of GAN and Fourier Neural Op-
erator (FNO) for a similar purpose. GANs have also been used for seismic data augmen-
tation in earthquake detection problems (Y. Li et al., 2020; Wang et al., 2021), and to
train signal/noise discriminators for Earthquake Early Warning algorithms (Z. Li et al.,
2018).

Conditional generative models can be designed to synthesize three-component seismic
waveforms, conditional on ground motion predictor variables. Florez et al. (2022)
used a Wasserstein GAN to generate realistic broadband seismograms conditional on
magnitude (M 4.5–7.5), hypocentral distance (R = 0–180 km), and VS30 (0–1100 m/s). Among
other things, their study showed that the model accurately interpolates for conditional
parameter ranges where no data may be available, but is not quite accurate for very large
magnitude events (M > 8). Inspired by Florez et al. (2022), Esfahani et al. (2023) trained
a GAN to synthesize time-frequency representations of seismograms (TFCGAN) con-
sidering similar conditioning parameters (M 3.8–7.5, R < 120 km, VS30 < 1200 m/s).
Obtaining only the amplitude part of the time-frequency response from the GAN, they
utilized a phase retrieval algorithm to reconstruct the phase before transforming back
to time-domain seismograms. A study by Shi et al. (2024) developed a combination of
GAN and neural operators, presenting the conditional ground-motion synthesis algorithm
(cGM-GANO). Their algorithm is conditioned on moment magnitude (M 4.5–8.0), closest-
point-on-the-rupture to site distance (R = 0–300 km), style of faulting, VS30 (100–1100
m/s), and tectonic environment type (subduction or shallow crustal event). The most
recent study by Matsumoto et al. (2024) introduced additional input parameters focused
on site-specific conditions consisting of the shear-wave velocities in the top 5, 10, and 20
m and the depth to the layers with shear-wave velocities of 1000 and 1400 m/s. They also
used a modified GAN based on the pre-existing StyleGAN of Karras et al. (2020).
The algorithm was trained on events with M > 5, rupture distances of ≤ 100 km, and
hypocentral depths of less than 30 km. For a detailed overview of the use of GANs in
earthquake-related engineering fields, the reader is referred to Marano et al. (2024).
While these advancements show the potential of generative waveform modeling in seismology,
GAN-based models can suffer from mode collapse and other training instabilities (Goodfellow
et al., 2014), and neural operators require specialized neural architectures which are not
currently included in standard deep learning libraries like PyTorch.

In this study, we present a novel latent denoising diffusion model for the conditional
synthesis of seismic waveforms. The model uses an autoencoder to map spectrogram
representations of observed ground motions into a lower-dimensional submanifold. We train
the diffusion model to generate these latent representations, rather than raw spectrograms
or seismograms. Our model is capable of synthesizing realistic seismic waveforms, is
straightforward and computationally economical to train, and is efficient during inference.
We release the code and pre-trained models as part of the 'This-Quake-Does-Not-Exist'
open-source Python library and hope to instigate a community effort on open generative
seismic waveform models.

The manuscript is structured as follows: in Section 2, we describe the model and
training process. Data descriptions and preprocessing steps are presented in Section 3.
In Section 4, we evaluate to what extent the model can generate realistic seismic wave-
forms, using metrics from both seismological and machine learning communities. In Sec-
tion 5, we discuss the potential of our and other ’Generative Waveform Models’ (GWM)
for practical and scientific applications.

2 Methods
Our approach to generating high-resolution seismic waveforms with a latent diffusion
model comprises three primary components. First, we transform the seismic waveforms
into spectrogram representations, which are more amenable to generative modeling than
time-domain signals. Second, we employ a convolutional variational autoencoder to
compress these high-dimensional spectrograms into a lower-dimensional latent space.
Third, we train a denoising diffusion model based on a U-Net architecture to generate
samples within this latent space, which are then mapped back to the spectrogram
representation and converted to waveforms during inference. In this section, we
provide a comprehensive description of each component of the generative pipeline
(Figure 1). Detailed explanations of the neural network architectures, which we omit
here for brevity, are given in Appendix A1.

Figure 1: Generative pipeline: Waveforms are converted to spectrograms and an
autoencoder is trained to compress them. A conditional denoising diffusion model is
then trained to generate the autoencoder’s latent representations conditional on a
low-dimensional parameter vector. During inference, samples are drawn from the denoising
diffusion model for given conditioning parameters, mapped back to spectrograms using
the decoder, and finally converted to waveforms.

2.1 Spectrogram Representation

Processing seismograms directly with neural networks is challenging due to their
high-frequency content and amplitude variance, both within and across samples. This
issue is particularly problematic in generative modeling, where high-amplitude areas dom-
inate the loss function, compromising reconstruction accuracy in low-amplitude regions.
To address these challenges, we transform the waveform into a spectrogram represen-
tation, a technique commonly used in seismology and in audio signal generation where
typically waveforms of much higher frequency content are modeled (Kong et al., 2021;
Défossez et al., 2023). Additional details on the spectrogram transformation and inver-
sion process are provided in Appendix A2.
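
As an illustration of the kind of transformation described above, the following minimal Python sketch computes a log-magnitude short-time Fourier transform of a single-component waveform. The window length, hop size, Hann window, small offset `eps`, and the `log_spectrogram` helper itself are assumptions made for illustration only; the transformation actually used in this study is specified in Appendix A2 and may differ.

```python
import numpy as np

def log_spectrogram(waveform, n_fft=256, hop=32, eps=1e-10):
    """Log-magnitude STFT of a 1-D waveform (frame sizes are assumed values)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(waveform) - n_fft) // hop
    # Cut the waveform into overlapping, windowed frames.
    frames = np.stack([waveform[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)      # (n_frames, n_fft // 2 + 1)
    # The logarithm compresses the large amplitude range of seismograms.
    return np.log(np.abs(spectrum) + eps).T     # (n_freq, n_frames)

fs = 100.0                                      # sampling rate in Hz
t = np.arange(4000) / fs                        # 40 s record
waveform = np.sin(2 * np.pi * 10.0 * t) * np.exp(-0.1 * t)  # toy 10 Hz signal
S = log_spectrogram(waveform)
freqs = np.fft.rfftfreq(256, d=1.0 / fs)        # frequency axis of S
```

Working with log-amplitudes keeps weak pre-event noise and strong S-wave arrivals within a comparable numerical range, which is the motivation given in the text for moving away from raw time-domain amplitudes.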

2.2 Departure to Latent Space

The stochastic nature of the training objective in denoising diffusion models (see
Section 2.3), along with their iterative sampling process, makes both training and infer-
ence computationally intensive, particularly with high-dimensional data like high-frequency
seismic waveforms. To mitigate this, we adopt the two-stage approach of Rombach et
al. (2022). First, we train an autoencoder to compress the data into a more manageable,
lower-dimensional latent space, then use denoising diffusion to generate the latent vari-
ables. This combines the autoencoder’s data compression efficiency with the generative
capabilities of denoising diffusion models.
Formally, let $p_{\mathrm{data}}$ denote the data distribution density. We model the distribution
over latent variables $z \in \mathbb{R}^m$ as a mixture of Gaussians:

$$p_{\mathrm{enc}}(z) = \mathbb{E}_{p_{\mathrm{data}}(x)}\, p_{\mathrm{enc}}(z \mid x) = \mathbb{E}_{p_{\mathrm{data}}(x)}\, \mathcal{N}\big(z \mid \mu_\phi(x), \operatorname{diag}(\sigma_\phi(x))\big). \tag{1}$$

Here, the encoder is defined by its mean and standard deviation functions
$\mu_\phi, \sigma_\phi : \mathbb{R}^n \to \mathbb{R}^m$, parameterized by a neural network with parameters $\phi$. The
network has two output heads: one for the mean and one for the standard deviation. A
deterministic decoder, $G_\psi : \mathbb{R}^m \to \mathbb{R}^n$, maps latent variables back to the original data
space. The encoder and decoder are trained jointly to minimize the reconstruction loss

$$\mathbb{E}_{p_{\mathrm{enc}}(z \mid x)}\, \lVert x - G_\psi(z) \rVert^2 \tag{2}$$

over the data distribution. Additionally, we regularize the latent space by the Kullback-
Leibler divergence between the encoder distribution and a standard normal distribution.
Since we employ the autoencoder solely for data compression rather than as a genera-
tive model, we set the regularization strength to a tiny value (1e−6) to ensure high re-
construction quality.
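
The autoencoder objective described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the released implementation: the function names are hypothetical, $\sigma_\phi$ is treated as a per-dimension standard deviation, and the reduction (sum over dimensions, mean over the batch) is an assumed choice. Only the regularization strength of 1e-6 is taken from the text; the closed-form Kullback-Leibler divergence between a diagonal Gaussian and a standard normal is the standard expression.

```python
import numpy as np

def sample_latent(mu, sigma, rng):
    """Draw z ~ N(mu, diag(sigma^2)) via the reparameterization z = mu + sigma * eps."""
    return mu + sigma * rng.standard_normal(mu.shape)

def autoencoder_loss(x, x_rec, mu, sigma, kl_weight=1e-6):
    """Reconstruction loss plus a weakly weighted KL regularizer.

    x, x_rec : (batch, n) inputs and decoder outputs G_psi(z)
    mu, sigma: (batch, m) encoder mean and standard deviation
    """
    rec = np.mean(np.sum((x - x_rec) ** 2, axis=-1))
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions.
    kl = np.mean(0.5 * np.sum(sigma ** 2 + mu ** 2 - 1.0 - 2.0 * np.log(sigma), axis=-1))
    return rec + kl_weight * kl, rec, kl

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))        # toy data batch
mu = np.zeros((8, 4))                   # toy encoder outputs
sigma = np.ones((8, 4))
z = sample_latent(mu, sigma, rng)
loss, rec, kl = autoencoder_loss(x, x, mu, sigma)   # perfect reconstruction
```

With such a small `kl_weight`, the total loss is dominated by the reconstruction term, which matches the stated intent of using the autoencoder for compression rather than as a generative model.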

2.3 Denoising Diffusion


After compressing the data into a compact latent space, we employ denoising dif-
fusion models (DDMs; Song and Ermon (2019); Ho et al. (2020); Song et al. (2021)) to
generate the latent representations. DDMs have gained popularity due to their outstand-
ing performance in tasks such as image, audio, or video generation. Unlike previous tech-
niques, DDMs do not rely on approximate variational inference, which can produce blurry
samples as in variational autoencoders (Kingma & Welling, 2014), or adversarial train-
ing, which can be unstable and suffer from mode collapse (Goodfellow et al., 2014). Ad-
ditionally, they do not require restricted architectures like normalizing flows (Rezende
& Mohamed, 2015).

DDMs are characterized by a pair of stochastic processes: a fixed forward noising
process and a learnable backward denoising process. The forward process progressively
adds noise to the data distribution until it resembles an isotropic Gaussian. Conversely,
the backward process commences from this noise distribution and aims to invert the
effect of the forward process, recovering samples that approximately follow the data
distribution. For conditional generation, both processes are modeled as conditional processes.
We parameterize the backward process by a neural network, which is trained to predict
the original sample, or equivalently the added noise, from the perturbed sample and the
conditioning parameters.
Specifically, our data distribution is a mixture of distributions conditioned on the
first-order properties of the waveforms (earthquake magnitude, hypocentral distance, site
conditions, and faulting type), represented as a vector $c \in \mathbb{R}^c$:

$$p_{\mathrm{data}}(x) = \mathbb{E}_{p_{\mathrm{cond}}(c)}\, p_{\mathrm{data}}(x \mid c). \tag{3}$$

We focus on modeling the conditional distributions $p_{\mathrm{data}}(x \mid c)$, assuming the conditional
parameters are given at inference. For a fixed conditioning vector $c$, we obtain a latent
sample $z \sim \mathbb{E}_{p_{\mathrm{data}}(x \mid c)}\, p_{\mathrm{enc}}(z \mid x)$ by passing a sample from the conditional data
distribution $p_{\mathrm{data}}(x \mid c)$ through the stochastic encoder. We denote the corresponding
noise-perturbed samples at time $0 < t \le T$ as $z_t$. These samples are obtained by evolving
$z = z_0$ through the forward process described by the Itô stochastic differential equation

$$\mathrm{d}z_t = \mu_t(z_t)\,\mathrm{d}t + \sigma_t\,\mathrm{d}w, \tag{4}$$

where $\mu_t : \mathbb{R}^m \to \mathbb{R}^m$ is a time-dependent drift function, $\sigma_t > 0$ is a time-varying
diffusion coefficient, and $w$ denotes the standard Wiener process. For sufficiently large $T$,
this process converges to a Gaussian distribution. Synthetic data can then be generated
by sampling from this Gaussian distribution and evolving the sample back to the latent
representation distribution using the backward process

$$\mathrm{d}z_t = \big[\mu_t(z_t) - \sigma_t^2\, \nabla_{z_t} \log p_t(z_t \mid c)\big]\,\mathrm{d}t + \sigma_t\,\mathrm{d}\bar{w}, \tag{5}$$

where $\bar{w}$ denotes the standard Wiener process with reversed time direction, and $p_t(z_t \mid c)$
is the density of the sample $z_t$ given the conditioning $c$. The score function $\nabla_{z_t} \log p_t(z_t \mid c)$
is generally intractable but can be approximated using a denoising score matching loss (Vincent,
2011; Song et al., 2021).

Song et al. (2021) also identify a deterministic process with the same marginal
distribution as the reverse process, characterized by the probability flow ordinary
differential equation:

$$\mathrm{d}z_t = \big[\mu_t(z_t) - \tfrac{1}{2}\sigma_t^2\, \nabla_{z_t} \log p_t(z_t \mid c)\big]\,\mathrm{d}t. \tag{6}$$

In practice, integrating this deterministic process enables efficient sampling as simulating
a stochastic process typically requires more time discretization steps.
For our GWM, we adopt the parameterizations proposed by Karras et al. (2022),
which excel in image generation tasks. Specifically, we utilize their variance exploding
version by setting $\mu_t = 0$ and $\sigma_t = \sqrt{2t}$. The forward and backward processes are then
given by

$$\mathrm{d}z_t = \sqrt{2t}\,\mathrm{d}w, \qquad \mathrm{d}z_t = -2t\, \nabla_{z_t} \log p_t(z_t \mid c)\,\mathrm{d}t + \sqrt{2t}\,\mathrm{d}\bar{w}, \tag{7}$$

respectively, and the probability flow ODE becomes

$$\mathrm{d}z_t = -t\, \nabla_{z_t} \log p_t(z_t \mid c)\,\mathrm{d}t. \tag{8}$$
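
For readers who wish to verify Eqs. (7) and (8), the substitution of the variance exploding choice into Eqs. (4) to (6) can be written out explicitly. The following short derivation uses only quantities defined in the text:

```latex
\begin{align*}
\mu_t = 0,\quad \sigma_t = \sqrt{2t}
  \;&\Longrightarrow\; \sigma_t^2 = 2t,\qquad \tfrac{1}{2}\sigma_t^2 = t, \\
\text{Eq. (4):}\quad \mathrm{d}z_t
  &= 0\cdot\mathrm{d}t + \sqrt{2t}\,\mathrm{d}w
   = \sqrt{2t}\,\mathrm{d}w, \\
\text{Eq. (5):}\quad \mathrm{d}z_t
  &= \big[0 - 2t\,\nabla_{z_t}\log p_t(z_t \mid c)\big]\,\mathrm{d}t + \sqrt{2t}\,\mathrm{d}\bar{w}, \\
\text{Eq. (6):}\quad \mathrm{d}z_t
  &= \big[0 - t\,\nabla_{z_t}\log p_t(z_t \mid c)\big]\,\mathrm{d}t
   = -t\,\nabla_{z_t}\log p_t(z_t \mid c)\,\mathrm{d}t.
\end{align*}
```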

Integrating the forward process from time 0 to $t$ yields the transition distribution
$p_t(z_t \mid z_0, c) = p_t(z_t \mid z_0) = \mathcal{N}(z_t \mid z_0, t^2 I)$. We can sample from this distribution
by adding Gaussian noise with a variance of $t^2$ to the original sample $z_0$. For normalized
data, as $T$ becomes sufficiently large, this distribution converges to an isotropic Gaussian
with a variance of $T^2$, which is a simple starting distribution for the reverse process.
Utilizing the relationship

$$\nabla_{z_t} \log p_t(z_t \mid z_0, c) = \frac{z_0 - z_t}{t^2}$$

for the score of the transition distribution, we train a denoising model, denoted as
$D_\theta : \mathbb{R}^m \times [0, T] \times \mathbb{R}^c \to \mathbb{R}^m$, to estimate the original sample $z_0$ from its perturbed
version $z_t$ given the conditioning $c$. The score matching objective with a time-dependent
weighting function $\lambda : [0, T] \to \mathbb{R}_+$ is then given by

$$L(\theta) = \mathbb{E}\big[\lambda(t)\, \lVert D_\theta(z_t, t, c) - z_0 \rVert^2\big], \tag{9}$$

where the expectation is taken over the time $t$, the conditional distribution $p_{\mathrm{cond}}(c)$, the
distribution over the encoded latent sample $z_0$ and its perturbed version $z_t$. When the
timestep $t$ is sampled with positive probability for all $t \in [0, T]$, the optimal denoising
model $D_{\theta^*}$ satisfies $\nabla_{z_t} \log p_t(z_t \mid c) = (D_{\theta^*}(z_t, t, c) - z_t)/t^2$ almost surely for all
$t \in [0, T]$ (Vincent, 2011). We follow Karras et al. (2022) in parameterizing the loss
weighting function and network preconditioning. To sample from the conditional diffusion
model, we utilize a second-order Heun method using 25 steps, corresponding to a total of
50 model evaluations.
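
The noise perturbation, the denoising objective, and the Heun sampler described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the released implementation. The toy denoiser is the analytically optimal denoiser for zero-mean Gaussian data with standard deviation `S_DATA` and merely stands in for the trained network. The time discretization (`t_min`, `t_max`, `rho`) follows commonly used defaults from Karras et al. (2022) and is an assumption here, as is the constant loss weight. All function names are hypothetical. The sketch takes 25 Heun steps between 26 noise levels and stops at `t_min`, so that each step uses exactly two denoiser evaluations.

```python
import numpy as np

S_DATA = 0.5  # standard deviation of the toy "latent" data (assumption)

def toy_denoiser(z_t, t, c):
    """Optimal denoiser E[z_0 | z_t] for z_0 ~ N(0, S_DATA^2 I); ignores c."""
    return z_t * S_DATA ** 2 / (S_DATA ** 2 + t ** 2)

def perturb(z0, t, rng):
    """Sample z_t ~ N(z_0, t^2 I) from the transition distribution."""
    return z0 + t * rng.standard_normal(z0.shape)

def denoising_loss(denoiser, z0, t, c, rng, weight=1.0):
    """Monte Carlo estimate of the weighted denoising objective at one time t."""
    z_t = perturb(z0, t, rng)
    return weight * np.mean(np.sum((denoiser(z_t, t, c) - z0) ** 2, axis=-1))

def time_steps(n_steps=25, t_min=0.002, t_max=80.0, rho=7.0):
    """n_steps + 1 decreasing noise levels from t_max to t_min (assumed schedule)."""
    i = np.arange(n_steps + 1) / n_steps
    return (t_max ** (1 / rho) + i * (t_min ** (1 / rho) - t_max ** (1 / rho))) ** rho

def heun_sample(denoiser, c, shape, rng, n_steps=25):
    """Integrate dz/dt = -t * score = (z - D(z, t, c)) / t with Heun's method."""
    ts = time_steps(n_steps)
    z = ts[0] * rng.standard_normal(shape)           # start from N(0, t_max^2 I)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        d_cur = (z - denoiser(z, t_cur, c)) / t_cur              # slope at t_cur
        z_euler = z + (t_next - t_cur) * d_cur                   # Euler predictor
        d_next = (z_euler - denoiser(z_euler, t_next, c)) / t_next
        z = z + (t_next - t_cur) * 0.5 * (d_cur + d_next)        # trapezoidal corrector
    return z

calls = {"n": 0}
def counted_denoiser(z_t, t, c):
    calls["n"] += 1                                  # count model evaluations
    return toy_denoiser(z_t, t, c)

rng = np.random.default_rng(0)
samples = heun_sample(counted_denoiser, c=None, shape=(10000, 4), rng=rng)
n_evals = calls["n"]                                 # 25 steps x 2 evaluations

z0 = S_DATA * rng.standard_normal((1000, 4))
loss = denoising_loss(toy_denoiser, z0, t=1.0, c=None, rng=rng)
```

In this toy setting the sampled values should have a standard deviation close to `S_DATA`, which gives a quick sanity check that the sign and scaling of the probability flow ODE are implemented consistently with Eq. (8).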
We design the model pipeline to synthesize three-component seismograms of 40 seconds
in length, with 100 samples per second. Each record has a corresponding 4-element
vector of meta-data, which we use as conditioning parameters. This includes event mag-
nitude, hypocentral distance, the VS30 site condition parameter, and faulting type.
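
The resulting array sizes can be made concrete with a short sketch. The ordering of the conditioning vector, the absence of any normalization, and the `conditioning_vector` helper are assumptions for illustration; only the record length, sampling rate, number of components, and the four parameter names come from the text.

```python
import numpy as np

FS = 100            # samples per second (from the text)
DURATION = 40       # record length in seconds (from the text)
N_COMPONENTS = 3    # three-component seismograms (from the text)

def conditioning_vector(magnitude, hypocentral_distance_km, vs30, faulting_type):
    """Pack the four conditioning parameters into one vector (ordering assumed)."""
    return np.array([magnitude, hypocentral_distance_km, vs30, faulting_type],
                    dtype=np.float32)

waveform_shape = (N_COMPONENTS, FS * DURATION)   # one generated record
c = conditioning_vector(6.0, 50.0, 400.0, 1)     # hypothetical example values
```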

3 Data and Model Training


We use three-component strong motion waveforms recorded from 1996 to 2022 by
K-NET and KiK-net stations in Japan, provided by the National Research Institute for
Earth Science and Disaster Resilience (NIED, 2019). Before training our model, we pre-
process the raw data by removing the scalar gain factor. We apply a causal 1 Hz But-
terworth high-pass filter of order 2 and resample the data by interpolation to a common
time vector to achieve a uniform sampling rate of 100 Hz. The waveforms are then trun-
cated to 40 seconds in length, with the P-wave arrival (P-pick onset) set to approximately
5 seconds after the starting time of the trimmed waveform. We consider shallow crustal
events with hypocentral depths of ≤ 25 km (classified as faulting type 1), and subduc-
tion events with depths > 25 km (classified as faulting type 0). We use all available events
with magnitudes M ≥ 4.5. The minimum and maximum station distances to the hypocen-
ter are 1 km and 180 km, respectively. VS30 is available for most station metadata; we
exclude all stations without VS30 values. The VS30 values range from 76 to 2100 m/s.
In total, we use 197,370 three-component records.
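
The preprocessing and selection steps described above can be sketched as follows. This is a hedged illustration rather than the pipeline used to build the dataset: the function names are hypothetical, the gain is assumed to be removed by division, linear interpolation via `np.interp` is an assumed choice for the resampling step, and samples requested outside the recorded time span are filled with the edge values. The filter corner and order, target sampling rate, record length, P-pick offset, depth threshold, and selection bounds are taken from the text.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 100.0        # target sampling rate in Hz
DURATION = 40.0   # record length in seconds
P_OFFSET = 5.0    # seconds between trace start and P-pick

def preprocess(trace, fs_in, p_pick_time, gain=1.0):
    """trace: (3, n) raw record; p_pick_time: seconds from trace start."""
    data = trace / gain                                   # remove scalar gain (assumed division)
    sos = butter(2, 1.0, btype="highpass", fs=fs_in, output="sos")
    data = sosfilt(sos, data, axis=-1)                    # single forward pass, hence causal
    t_in = np.arange(data.shape[-1]) / fs_in
    # Common time vector: starts 5 s before the P-pick, 40 s long, 100 Hz.
    t_out = p_pick_time - P_OFFSET + np.arange(int(DURATION * FS)) / FS
    return np.stack([np.interp(t_out, t_in, comp) for comp in data])

def faulting_type(depth_km):
    """1 for shallow crustal events (depth <= 25 km), 0 for deeper subduction events."""
    return 1 if depth_km <= 25.0 else 0

def keep_record(magnitude, distance_km, vs30):
    """Selection criteria stated in the text; vs30 is None when unavailable."""
    return magnitude >= 4.5 and 1.0 <= distance_km <= 180.0 and vs30 is not None

rng = np.random.default_rng(0)
raw = rng.standard_normal((3, 12000))                     # toy 60 s record at 200 Hz
out = preprocess(raw, fs_in=200.0, p_pick_time=10.0)
```

A zero-phase filter (for example forward-backward filtering) would smear energy to times before the P-wave onset, which is presumably why a causal filter is used here.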
We divide the available data into 90% for training and 10% for testing to facilitate
hyperparameter tuning. Because of the limited number of observations—especially for
large-magnitude earthquakes and short-distance recordings—we use all available data
for model performance evaluation, except when noted otherwise. Training of our model
was conducted on a single NVIDIA A100 GPU, requiring approximately 38 hours for the
first-stage autoencoder and an additional 15 hours for the second-stage diffusion model.

4 Results
The design goal of the Generative Waveform Model (GWM) is to synthesize ground
motion records that are statistically indistinguishable from real records, across a wide
range of frequencies and conditioning parameters, namely source magnitudes, hypocen-
tral distances, VS30 , and faulting type. In the following section, we discuss the extent
to which the Denoising Diffusion Model can achieve that goal and compare its outputs
(i) to real data and (ii) to commonly used Ground Motion Models (GMMs). First, we
compare the distributions of time-domain signal envelopes (Section 4.1) and of Fourier
Amplitude Spectra (Section 4.2) between the real data and the GWM synthetics. Next,
we evaluate how well the GWM predicts scalar ground motion intensity measures, in terms
of prediction accuracy (Section 4.3.1), and variability (Section 4.3.2). Then we compare
shaking durations of real data and GWM synthetics (Section 4.3.3), and the scaling
between peak amplitude statistics and the conditioning predictor variables (Sections
4.3.5–4.3.7). Finally, we evaluate relative and absolute model performances for different
magnitude and hypocentral distance ranges, by computing average model probabilities
(Section 4.4), and performance metrics from the image generation domain (Section 4.5).
For each real seismogram, we use the trained model to produce a number of cor-
responding synthetic seismograms with the conditioning parameters of the real record,
i.e. their magnitude, hypocentral distance, VS30 value and faulting type. We can then
directly compare the real seismograms to their corresponding synthetic realizations (Fig-
ure 2). The GWM synthetics appear to capture the first-order characteristics of the real
seismograms: they have clear P- and S-phases with realistic phase arrival time differences,
as well as realistic coda wave decays. For the same conditioning parameter choices, there
is a considerable amount of variability between individual realizations, for example in
terms of peak amplitudes. In Section 4.3.2 we show that this variability closely matches
the variability observed in the real data.

Figure 2: Real three-component acceleration seismograms (grey) and 6 randomly selected
examples of GWM synthetics (red), for three sets of conditioning parameters: magnitude,
hypocentral distance, VS30, and faulting type values.

4.1 Time domain signal envelopes

To compare the real and synthetic waveforms quantitatively, we compute signal en-
velope time series for both sets. The signal envelopes are obtained by taking the mov-
ing average of the absolute waveform signals with a kernel size of 128, followed by a log-
arithmic transformation. This comparison shows that the GWM synthetics have very
similar first-order shapes in the time domain compared to the real seismograms, across
the entire range of magnitudes and recording distances for which the model was trained
(Figures 3a and 3b). The low-noise amplitudes before the P-wave onsets are followed by
an impulsive P-wave amplitude increase. This amplitude growth is similar for both small
and large magnitudes until the smaller magnitude records reach their maximum P-wave
amplitude, whereas the large magnitude records continue to grow. The later-arriving S-
and surface waves cause additional amplitude growth in the real waveforms, which is ac-
curately mimicked by the GWM synthetics. The variability of the envelopes in each bin
is symmetric around the median in log-space and is of the same order for both real and
synthetic data. Additional figures in the supplementary material show different bins, and
separate evaluations of North-South and vertical components (supplementary Figures
S1 to S7).
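
The envelope computation described above (moving average of the absolute signal with a kernel size of 128, followed by a logarithm) can be sketched as follows. The base-10 logarithm, the small offset `eps`, and the zero padding implied by `mode="same"` are assumptions; the last of these lowers the envelope within roughly half a kernel length of the trace ends.

```python
import numpy as np

def log_envelope(waveform, kernel_size=128, eps=1e-10):
    """Log of the moving average of |waveform| (log base and padding are assumed)."""
    kernel = np.ones(kernel_size) / kernel_size
    smoothed = np.convolve(np.abs(waveform), kernel, mode="same")
    return np.log10(smoothed + eps)

constant = 2.0 * np.ones(4000)        # toy signal with constant amplitude 2
env = log_envelope(constant)
```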

Figure 3: First order seismogram characteristics. (a) and (b) Distribution of time-domain
envelopes for East-West component seismograms in different magnitude bins in terms of
the mean (solid line) and the standard deviation (shaded areas) for real data and GWM
synthetics. (c) and (d) Distribution of Fourier spectra log-amplitudes for East-West com-
ponent seismograms in different magnitude bins. The sample counts for each bin, in
ascending order of magnitude, are 58,302; 42,264; 39,033; 44,232; and 13,539.


4.2 Fourier amplitude spectra

Similarly, we compare the logarithmic Fourier amplitude spectra of GWM synthetic
data with those of real data. These spectra are obtained by performing a Fourier transform
of the time-domain signals, calculating the magnitudes of the resulting complex values,
and then applying a logarithmic transformation. Figures 3c and 3d illustrate these
comparisons: the real and synthetic spectra have similar distributions in terms of mean
log-amplitudes and variability. Equivalent evaluations for specific parameter bins are
shown in supplementary Figures S2 to S7. Additionally, Section 4.5 discusses the use of
the Fréchet distance to compare distributions of log-amplitude spectra for real data and
GWM synthetics.
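As a hedged sketch of the three steps just listed (Fourier transform, magnitude, logarithm), the following uses NumPy's real-input FFT. The base-10 logarithm, the `eps` guard, and the sinusoidal test signal are assumptions for illustration; the 4064-sample length is chosen so that the one-sided spectrum has the 2033 values quoted in Section 4.5.1.

```python
import numpy as np

def log_amplitude_spectrum(waveform, eps=1e-10):
    """Log10 of the one-sided Fourier amplitude spectrum of a real signal."""
    spectrum = np.fft.rfft(waveform)          # complex values, length n // 2 + 1
    return np.log10(np.abs(spectrum) + eps)   # eps is an assumed guard against log(0)

# Hypothetical test signal: a sinusoid with exactly 100 cycles in the window.
n_samples = 4064
t = np.arange(n_samples)
trace = np.sin(2.0 * np.pi * 100.0 * t / n_samples)
log_spec = log_amplitude_spectrum(trace)
```

For a signal of 4064 samples, `np.fft.rfft` returns 4064 / 2 + 1 = 2033 values, and the test sinusoid produces a single dominant peak in frequency bin 100.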

4.3 Scalar peak amplitude statistics

For earthquake engineering and seismic hazard applications, peak ground motion
amplitude statistics are of particular importance. Here, we examine how various peak
amplitude statistics of the GWM synthetics compare with the real data, how they correlate
with the conditioning predictor variables, and how they compare with predictions
from widely used Ground Motion Models (GMMs). Specifically, we compute peak ground
acceleration (PGA), peak ground velocity (PGV), and pseudo-spectral acceleration (SA)
for both real data and GWM synthetics. We use the orientation-independent GMRotD50
statistic (Boore et al., 2006), which represents the median, over all rotation angles,
of the geometric mean of the two rotated horizontal components. We utilize GMMs from
Boore et al. (2014), optimized for a global database of shallow crustal earthquakes in
active tectonic regions with M 3.0–7.9 events, and from Kanno et al. (2006), which used
a database of strong ground motion records from shallow crustal earthquakes in Japan
between 1963 and 2003.
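To make the orientation-independent peak measure concrete, the sketch below computes a GMRotD50-style peak amplitude from two horizontal components. It assumes the definition as the median, over rotation angles between 0 and 90 degrees, of the geometric mean of the peak absolute amplitudes of the two rotated components. The 1-degree angle step and the random test traces are illustrative assumptions, and the function name is hypothetical.

```python
import numpy as np

def gmrotd50_peak(east, north, angles_deg=None):
    """Median over rotation angles of the geometric mean of the two rotated peaks."""
    if angles_deg is None:
        angles_deg = np.arange(0, 90)  # assumed 1-degree steps over the non-redundant range
    geometric_means = []
    for theta in np.deg2rad(angles_deg):
        # Rotate the horizontal components by theta.
        rot1 = east * np.cos(theta) + north * np.sin(theta)
        rot2 = -east * np.sin(theta) + north * np.cos(theta)
        peak1 = np.max(np.abs(rot1))
        peak2 = np.max(np.abs(rot2))
        geometric_means.append(np.sqrt(peak1 * peak2))
    return float(np.median(geometric_means))

# Hypothetical horizontal components (random noise stands in for accelerograms).
rng = np.random.default_rng(0)
east = rng.normal(size=4064)
north = rng.normal(size=4064)
pga = gmrotd50_peak(east, north)
```

Because a rotation by 90 degrees only swaps the two components, the result does not depend on the original sensor orientation when it is rotated by a multiple of the angle step. For spectral accelerations, the same procedure would be applied to oscillator responses at each period instead of the raw traces.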

4.3.1 Accuracy of predicted peak amplitudes


For each record in the real waveform dataset, we compute a single GWM synthetic
waveform, using the conditioning parameters of the real data (magnitude, hypocentral
distance, VS30, and faulting type). Comparing the PGA and PGV values measured on
the GWM synthetics with the real data shows that the GWM predictions are at least
as accurate as those from the GMMs and have similar prediction variability (Figure 4).
Specifically, we compute the logarithm of the ratio between observed and predicted
peak amplitudes. For the GWM synthetics, the mean of the distribution of this log-ratio
(the model bias) is close to zero for both PGA and PGV: the mean model bias across
all distances is 0.08 log-units for PGA and 0.07 log-units for PGV. This corresponds to
an underprediction by 20% and 17%, respectively. At very short hypocentral distances
(< 20 km), the GWM tends to underpredict the real data more strongly, by about 45%
(0.16 logarithmic units, Figures 4a and 4b).
In comparison, the GMM by Boore et al. (2014) underestimates the real PGA for
distances > 50 km by 21%, and by 48% at < 20 km (Figure 4c). For PGV, this GMM
has a low average model bias of only 17%, except at very short distances, where it is
similar to the GWM (Figure 4d). Similarly, the GMM by Kanno et al. (2006) underpre-
dicts PGA by an average of 78% (Figure 4e) across all distances and by 151% for dis-
tances < 20 km. It very accurately predicts PGV (1% under-prediction across all dis-
tances) (Figure 4f), except at short distances (< 20 km), where the under-prediction
amounts to 151%. These comparisons indicate that the peak amplitude predictions of
the GWM are largely unbiased, except at very short hypocentral distances where all mod-
els exhibit reduced performance.
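The conversion between a bias in log-units and a percentage underprediction, used throughout this section, can be checked with a short sketch. It assumes base-10 logarithms, which is consistent with the quoted pairs (0.08 log-units and 20%, 0.07 and 17%, 0.16 and 45%); the function names are hypothetical.

```python
import numpy as np

def log_bias(observed, predicted):
    """Mean log10 ratio between observed and predicted peak amplitudes."""
    return float(np.mean(np.log10(observed) - np.log10(predicted)))

def bias_to_percent(bias_log10):
    """Percentage by which observations exceed predictions for a given log10 bias."""
    return (10.0 ** bias_log10 - 1.0) * 100.0

# Reproduce the percentages quoted in the text from the log-unit biases.
pga_percent = bias_to_percent(0.08)
pgv_percent = bias_to_percent(0.07)
short_distance_percent = bias_to_percent(0.16)
```

A positive bias therefore means that the observed amplitudes are larger than the predicted ones, that is, the model underpredicts.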


Figure 4: Model bias as a function of hypocentral distance for the generative waveform
model (blue), GMMs by Boore et al. (2014) (violet), and Kanno et al. (2006) (red) for
PGA (a, c, and e) and PGV (b, d, and f), with respect to real data. Colored lines repre-
sent the mean of the ratio in 50 distance bins of 3.67 km width. The bars represent the
standard deviation in each bin.


4.3.2 Variability of predicted peak amplitudes

Another important criterion for ground motion prediction methods is that the vari-
ability in predicted peak amplitudes is accurately characterized. To evaluate the vari-
ability of the GWM predictions, we compare their total standard deviation to the vari-
ability in the real data, and to the predictions of the two GMMs.
To measure prediction residuals, we fit a simple, custom GMM to the PGA and
PGV of the real data, as a function of magnitude M , hypocentral distance R, and VS30 .
We use ordinary least squares and find

$$ \log_{10}(\mathrm{PGA}) = 0.4840 + 0.4274 \times M - 0.3642 \times \log_{10}(V_{S30}) - 1.3972 \times \log_{10}(R) \tag{10} $$

$$ \log_{10}(\mathrm{PGV}) = -0.9914 + 0.5392 \times M - 0.6481 \times \log_{10}(V_{S30}) - 1.3190 \times \log_{10}(R) . \tag{11} $$
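Equations 10 and 11 can be evaluated directly with the fitted coefficients. The sketch below is only a convenience wrapper around those two formulas; the function names are hypothetical, and the example inputs (M 5.5, R = 50 km, VS30 = 500 m/s) are borrowed from the conditioning parameters used elsewhere in this section.

```python
import math

def reference_log10_pga(magnitude, distance_km, vs30):
    """Equation 10: simple reference model for log10(PGA)."""
    return (0.4840 + 0.4274 * magnitude
            - 0.3642 * math.log10(vs30)
            - 1.3972 * math.log10(distance_km))

def reference_log10_pgv(magnitude, distance_km, vs30):
    """Equation 11: simple reference model for log10(PGV)."""
    return (-0.9914 + 0.5392 * magnitude
            - 0.6481 * math.log10(vs30)
            - 1.3190 * math.log10(distance_km))

# Example evaluation for one set of conditioning parameters.
log10_pga = reference_log10_pga(5.5, 50.0, 500.0)
log10_pgv = reference_log10_pgv(5.5, 50.0, 500.0)
```

A residual for any observed or predicted amplitude is then its log10 value minus the corresponding reference value.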

We then calculate residuals by subtracting the predictions of equations 10 and 11
(i) from the real data ($\epsilon_i^{\mathrm{Data}}$), (ii) from the GWM ($\epsilon_i^{\mathrm{GWM}}$), (iii) from the GMM of Boore
et al. (2014) ($\epsilon_i^{\mathrm{Boore14}}$), and (iv) from the GMM of Kanno et al. (2006) ($\epsilon_i^{\mathrm{Kanno06}}$). The
distribution of these residuals (Figure 5) is very similar between the real data and the
GWM for both PGA and PGV, including extreme values. This indicates that the GWM
synthetics have a similar number of waveforms with above- or below-average peak am-
plitudes as the real data set. The PGA residuals for the two published GMMs likewise
have similar variances to the real data but are slightly shifted toward negative values.
This shift reflects the larger average bias of the GMMs, as previously shown in Figure
4.
The standard deviations of the distribution of residuals are very similar for all four
cases: for PGA they are 0.398, 0.399, 0.428, and 0.404 log-units, for the real data, the
GWM synthetics, the Boore et al. (2014) GMM and the Kanno et al. (2006) GMM, re-
spectively. For PGV they are 0.356, 0.357, 0.387, and 0.368, respectively. That is, for
all four cases, about 68% of amplitudes fall between 40% ($10^{-0.4}$) and 251% ($10^{0.4}$) of
the median predicted amplitude value.
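The quoted 40% to 251% range follows from converting a standard deviation of about 0.4 log10-units into multiplicative factors around the median. A minimal check, with a hypothetical helper name:

```python
def one_sigma_band(sigma_log10):
    """Multiplicative factors bounding the central ~68% of a log-normal distribution."""
    return 10.0 ** (-sigma_log10), 10.0 ** sigma_log10

# A standard deviation of 0.4 log10-units, as found for all four residual sets.
lower_factor, upper_factor = one_sigma_band(0.4)
```

The factors are reciprocal, which reflects the symmetry of the residual distribution in log-space.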

4.3.3 Shaking durations


Shaking duration is another signal characteristic that is important for earthquake
engineering applications. We use the cumulative Arias Intensity (cAI) metric (Arias, 1970)
to compare the significant shaking duration of the GWM synthetics and of the real data.
The significant shaking duration is defined by the times corresponding to 5% and 95%
of the maximum cAI. Figure 6a shows the cAI curve for an example record from the real
data set with M 5.5, recorded at R = 50 km, on a site with VS30 = 500 m/s, and fault-
ing type 1, along with the cAI curves from 100 GWM realizations with the same con-
ditioning parameters. When we compute 1 GWM realization for each real record of the
entire data set (Figure 6b), we find very similar duration distributions between real data
and GWM synthetics, for magnitudes ≤ 7.4. For larger magnitudes, the 40-second length
of the generated seismograms is not sufficient to capture the full ground motion time his-
tory. An equivalent figure that includes the largest magnitudes is given in supplemen-
tary Figure S8.
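The significant shaking duration can be sketched as follows. The Arias Intensity is proportional to the time integral of the squared acceleration; since only the normalized cumulative curve is needed, constant factors and the time step cancel in the normalization. The 0.01 s sampling interval and the constant-amplitude test signal are assumptions for illustration.

```python
import numpy as np

def significant_duration(acceleration, dt, lower=0.05, upper=0.95):
    """Time between the 5% and 95% levels of the normalized cumulative Arias Intensity."""
    cumulative = np.cumsum(acceleration ** 2)   # proportional to the cAI
    normalized = cumulative / cumulative[-1]    # runs from ~0 to 1
    i_lower = np.searchsorted(normalized, lower)
    i_upper = np.searchsorted(normalized, upper)
    return (i_upper - i_lower) * dt

# Hypothetical test signal: constant amplitude, so energy accumulates linearly in time.
dt = 0.01
acceleration = np.ones(1000)
duration = significant_duration(acceleration, dt)
```

For the constant-amplitude test signal of 10 s length, the 5% to 95% window covers 90% of the record, that is, about 9 s.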

4.3.4 Predicting distributions of peak amplitudes


Because we can generate any number of synthetics with the GWM, we can use the
model to predict distributions of ground motion statistics, much like we commonly would


Figure 5: Histogram of (a) PGA and (b) PGV residuals showing the spread of the real
data (black), of the GWM synthetics (red), of the Boore et al. (2014) GMM (blue), and of
the Kanno et al. (2006) GMM (green), with respect to the simple fitted ground motion
models (equations 10 and 11) on a log10 scale. The box plot shows the median values,
quantiles, and extreme values.

Figure 6: Shaking duration estimated using cumulative Arias Intensity (cAI). (a) cAI for
a real example waveform (black line) and 100 GWM synthetics (red lines). Triangle-right
and triangle-down symbols represent 5% and 95% of the maximum cAI for the real data
(white) and GWM synthetics (red), respectively. (b) Shaking duration for real data (grey
circles) and one GWM synthetic per real record (red triangles), with corresponding condi-
tioning parameters. For each magnitude bin (every 0.08) from M 4.5 - 7.4, grey dots and
lines show the mean and standard deviation of the real data, while blue triangles and lines
show the mean and standard deviation of the GWM synthetics.


with GMMs. For instance, we can generate n synthetic waveforms for a set of condition-
ing parameters, and then compute the median and standard deviation of, e.g., PGA. It
takes on the order of 60 GWM realizations for the median and the standard deviation
estimates to stabilize (Figure 7a). To establish this we generate between 1 and 100 re-
alizations using M = 5.5, R = 50 km, VS30 = 500 m/s, and faulting type = 1 as con-
ditioning parameters, and analyze the median and standard deviation of the correspond-
ing PGA values. Furthermore, we compute the Shapiro-Wilk test statistic (Figure 7b),
to confirm that the peak amplitude predictions from the GWM are indeed log-normally
distributed, as is the case for real data. This test statistic compares a data distribution
to a normal distribution by evaluating
$W = \left( \sum_{i=1}^{n} a_i x_{(i)} \right)^2 / \sum_{i=1}^{n} (x_i - \bar{x})^2$,
i.e., the ratio of the squared weighted sum of the sorted sample values to the sum of the
squared sample deviations from the mean, where $a_i$ are the weights, $x_{(i)}$ is the
$i$-th smallest sample value, and $\bar{x}$ is the sample mean. A value close
to 1 indicates normally distributed data, and values close to 0 imply a non-normal
distribution. The test values for this particular set of conditioning parameters exceed 0.98 at
n > 40 and then remain stable, suggesting that the model has correctly learned to generate
waveforms with log-normally distributed peak amplitudes. This is a first-order characteristic
of real data, and it also implies that we can accurately represent the amplitude
distributions with only a mean and a standard deviation.
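The stabilization analysis can be illustrated by tracking the median and standard deviation as the number of realizations grows. The sketch below does not call the GWM: the log10 PGA values are drawn from a normal distribution as a hypothetical stand-in for 100 model realizations, and the location and scale of that distribution are arbitrary assumptions.

```python
import numpy as np

def running_statistics(log10_values):
    """Median and sample standard deviation of the first n values, for n = 2..N."""
    counts = np.arange(2, len(log10_values) + 1)
    medians = np.array([np.median(log10_values[:n]) for n in counts])
    stds = np.array([np.std(log10_values[:n], ddof=1) for n in counts])
    return counts, medians, stds

# Hypothetical stand-in for the log10 PGA of 100 GWM realizations.
rng = np.random.default_rng(42)
log10_pga = rng.normal(loc=-0.5, scale=0.4, size=100)
counts, medians, stds = running_statistics(log10_pga)
```

Plotting `medians` and `stds` against `counts` gives curves analogous to Figure 7a. The Shapiro-Wilk statistic itself is available in SciPy as `scipy.stats.shapiro`, which returns the statistic together with a p-value.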

4.3.5 Pseudo-spectral acceleration versus hypocentral distance


We can use the GWM in this sense to directly compare the distributions it predicts
with predicted distributions from the GMMs. To study, for instance, how peak ampli-
tudes decay with distance, we compute 100 GWM synthetics for a vector of evenly spaced
distances, every 0.8 km between 1 km and 180 km, and calculate the median and standard
deviation of the peak amplitudes in each distance bin, for fixed magnitude, VS30
values, and faulting type (Figure 8).
For pseudo-spectral acceleration at 0.1 and 1.0 second periods with 5% damping,
the resulting GWM predictions decay smoothly, similar to the GMM predictions. This
includes very short distances, where both GWM and GMMs are under-constrained by
available data. For T = 0.1 s, both the GWM and the two GMMs underestimate the
real data in the VS30 = 550 − 650 m/s bin and for magnitudes in the ranges of M =
5.4 − 5.6 (Figure 8a) and M = 5.9 − 6.1 (Figure 8c), although the real data mean re-
mains within the standard deviation of the GWM predictions. For longer periods, such
as T = 1.0 s (Figures 8b and 8d), the GWM generally performs better than the GMMs.
The GWM also matches the real data rather well in cases when the data diverge from
the typical GMM decay, which is sometimes observed for distances greater than 100 km
(supplementary Figures S9-S18).

4.3.6 Pseudo-spectral acceleration versus magnitude

In a similar sense, we can investigate how the GWM predictions grow with mag-
nitude. We produce 100 GWM synthetics for a vector of magnitudes, evenly spaced ev-
ery 0.035 from M 4.5 to 8, for a fixed distance and VS30 values. The predictions for SA
at T = 0.1 s show relatively smooth, monotonic growth, up to a saturation at M 7.0
(Figure 9a), consistent with the (very sparse) real data. The same trend is observed for
other conditioning parameter combinations (supplementary Figures S19 - S24). It is in-
teresting, and encouraging, that the GWM predictions are well-behaved in condition-
ing parameter ranges where the training data are very sparse, such as at M 6.5 - 7.5,
or for R < 20 km.

4.3.7 Pseudo-spectral acceleration versus VS30

To assess the scaling of pseudo-spectral acceleration with VS30 , we generate another
100 GWM realizations for a vector of VS30 values, evenly spaced every 13.63 m/s from


Figure 7: Statistics of the GWM realizations. a) Median and standard deviation of PGA
values with different numbers of GWM realizations. b) Shapiro-Wilk test statistic for dif-
ferent numbers of realizations.


Figure 8: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% versus
hypocentral distance for periods (T) of 0.1 s and 1.0 s. Median prediction of the GWM
(black line) and standard deviation (yellow shaded area), along with median prediction
(solid lines) and standard deviation (dashed lines) of the Boore et al. (2014) GMM (vi-
olet), and the Kanno et al. (2006) GMM (red), using M = 5.5 and VS30 = 600 m/s (a
and b), and M = 6.0 and VS30 = 600 m/s (c and d). The data are sampled from narrow
magnitude and VS30 bins, as written in the figure titles, and shown by their median (green
squares) and standard deviations (green lines).

Figure 9: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% versus
magnitude (a) and versus VS30 (b) at T = 1.0 s. Median of the GWM prediction (black
lines) and its standard deviation (yellow shaded areas), using R = 60 km, for VS30 = 600
m/s in (a) and M = 5.5 in (b). The data (grey dots) are sampled from narrow magnitude,
R, and VS30 bins, as written in the figure titles.


150 m/s to 1500 m/s (Figure 9b). Generally, the GWM predictions follow the real data
distribution, with SA values decreasing with increasing VS30 . Towards very low VS30 , the
GWM synthetics somewhat underpredict the strong growth of SA. Interestingly, for SA
at T ≥ 1.0 s, the SA values decrease up to VS30 = 800 m/s, then remain stable up to
approximately VS30 = 1200 m/s, and then decrease again for larger VS30 values. This
is observed for a wide range of magnitude and distance bins (supplementary Figures S25-
S30). To what extent we expect SA to correlate with VS30 somewhat depends on the known
limitations of VS30 as a site response proxy (Bergamo et al., 2019).

4.4 Model probabilities given the data


To assess the GWM and the GMMs across a wide range of magnitude and distance
combinations, we compute cumulative probabilities of the models, given the observed data.
For each observed spectral acceleration value $\mathrm{SA}_i$, the GWM and GMMs predict a Gaussian
distribution of expected SA amplitudes, with a predicted mean $\mu$ and a standard
deviation $\sigma$. We can then evaluate the probability of each data point under the predicted
distribution. Assuming a uniform prior distribution, this probability is equivalent
to the probability of the model, given the data. By summing up these probabilities
across all data points in a magnitude and distance bin, and then normalizing by the
number of data points in the bin, $N$, we can compute an average model probability

$$ P_k = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left( -\frac{(\mathrm{SA}_i - \mu_k)^2}{2\sigma_k^2} \right), \tag{12} $$

where $\mu_k$ and $\sigma_k$ are the predicted mean and standard deviation for the $k$-th bin. This
probability is high only if the model accurately predicts both the mean and standard deviation
of the data. Therefore, with these probabilities, we can readily assess the agreement
between the data and model predictions, for a large number of conditioning parameter
combinations.
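Equation 12 amounts to averaging a Gaussian density over the data points of a bin. A minimal sketch, in which the function name is hypothetical and the data values are assumed to be expressed in the same units (for example log-amplitudes) as the predicted mean and standard deviation:

```python
import math

def average_model_probability(sa_values, mu, sigma):
    """Equation 12: mean Gaussian density of the observed values under N(mu, sigma^2)."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    densities = [norm * math.exp(-((sa - mu) ** 2) / (2.0 * sigma ** 2))
                 for sa in sa_values]
    return sum(densities) / len(densities)

# Hypothetical bin with two observations, evaluated for an unbiased and a biased model.
observations = [0.1, -0.1]
p_unbiased = average_model_probability(observations, mu=0.0, sigma=0.4)
p_biased = average_model_probability(observations, mu=0.5, sigma=0.4)
ratio = p_biased / p_unbiased
```

A model whose predicted mean is offset from the data receives a lower average probability, so a ratio below 1 favors the model in the denominator.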
We compute these model probabilities for the GWM ($P_{\mathrm{GWM}}$) and for the two GMMs
($P_{\mathrm{GMM}}$). We use the GMMs to compute the mean SA for the center of each bin, and
use their reported standard deviations. For the GWM, we use 100 GWM realizations
to compute the mean and standard deviation for each bin, likewise using the magnitude
and distance of each bin center. We use faulting type = 1, and repeat the computations
for a number of VS30 values, to compute the average probabilities for each bin following Eq.
12. Figure 10 shows the average model probabilities for the GMM by Kanno et al. (2006)
(top row) and the GWM (middle row). To compare the two models, we compute the probability
ratio $r = P_{\mathrm{GMM}} / P_{\mathrm{GWM}}$ (bottom row). Equivalent figures for the Boore et al.
(2014) GMM are shown in supplementary Figures S31, S35, and S39.
In general, for the GWM and both GMMs, the average probabilities are highest
for low magnitudes and large hypocentral distances, and lowest for large magnitudes and
short recording distances. For VS30 = 240 m/s and SA with T = 1.0 s (Figure 10a),
the GMM by Kanno et al. (2006) shows comparable accuracy to the GWM in most bins,
except for R > 100 km. The GMM by Boore et al. (2014) is comparable to the GWM
for small magnitudes and short hypocentral distances, or large magnitudes and large hypocen-
tral distances (supplementary Figure S31a). At periods T = 1.0 s, the GWM performs
better than any of the GMMs considered in this study (supplementary Figures S31 and
S32). These patterns are also observed for higher VS30 values (Figures 10b, 10c and
supplementary Figures S33 - S40).

4.5 Fréchet Distances and classifier accuracy

In addition to the commonly used seismological performance metrics, we introduce
additional, novel metrics that quantify the similarity between real waveforms and
synthetic waveforms from the generative waveform model (GWM). The metrics are inspired by well-established


Figure 10: Average model probabilities given the SA data at T = 1.0 s, in bins of magni-
tude and R, for VS30 = 240 m/s (a), 800 m/s (b), and 1360 m/s (c), for the Kanno et al.
(2006) GMM (top row) and the GWM (middle row). The ratio between the two probabil-
ities (bottom row) shows which model explains the observed SA data better.

performance metrics from the image generation community, and could play an impor-
tant role for a systematic and quantitative comparison of various proposed GWMs, e.g.
in the framework of future community efforts for GWM evaluations.

4.5.1 Fréchet Distance of Fourier amplitude spectra


We use the Fréchet Distance (FD) to measure the distance between the Fourier am-
plitude spectra of the observed and synthetic waveforms. Ideally, realistic GWM syn-
thetic waveforms should have a Fourier spectrum that is statistically indistinguishable
from that of the real data. Larger FD values would indicate differences between the GWM
synthetics and real signals. The FD is equivalent to the Wasserstein-2 distance, and mea-
sures the minimum effort required, in the L2 sense, to transform one distribution into
another. As in Section 4.3.1, we compute a GWM synthetic waveform for each real waveform
in the data set, using the conditioning parameters corresponding to the real data.
The amplitude spectrum of each waveform is represented by 2033 amplitude values (1/2
signal length + 1). We treat each vector element as an independent Gaussian, and com-
pute the element-wise mean µ and standard deviation σ of the log-amplitudes for each
vector element, across all waveforms. The Fréchet Distance d between observed and syn-
thetic waveforms is then given by:

$$ d^2 = \lVert \mu_{\mathrm{obs}} - \mu_{\mathrm{syn}} \rVert^2 + \lVert \sigma_{\mathrm{obs}} - \sigma_{\mathrm{syn}} \rVert^2 . \tag{13} $$
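Equation 13 can be evaluated with a few array operations. In this sketch, each row of the input arrays is assumed to hold the 2033 log-amplitude values of one waveform; the random arrays are hypothetical placeholders for real and synthetic spectra, and the function name is an assumption.

```python
import numpy as np

def frechet_distance_diagonal(log_spec_obs, log_spec_syn):
    """Equation 13: Fréchet Distance for independent Gaussians per spectral element."""
    mu_obs, mu_syn = log_spec_obs.mean(axis=0), log_spec_syn.mean(axis=0)
    sd_obs, sd_syn = log_spec_obs.std(axis=0), log_spec_syn.std(axis=0)
    d2 = np.sum((mu_obs - mu_syn) ** 2) + np.sum((sd_obs - sd_syn) ** 2)
    return float(np.sqrt(d2))

# Hypothetical log-amplitude spectra: 50 waveforms with 2033 spectral values each.
rng = np.random.default_rng(0)
obs = rng.normal(size=(50, 2033))
syn = obs + 1.0  # same spread, mean shifted by one log-unit in every element
fd_identical = frechet_distance_diagonal(obs, obs)
fd_shifted = frechet_distance_diagonal(obs, syn)
```

Because the distance sums over all 2033 elements, a uniform shift of one log-unit already yields a distance of the square root of 2033, about 45, which helps to put the values in Table 1 into perspective.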

Computing the FD for the entire data set, we find that it ranges from 64 to 78 for
the three spatial components of the seismograms (Table 1). To provide a baseline for these
FD values, we also compute the FD between the training and test sets (Section 3), i.e.
between sub-sets of the real data. These resulting FDs are 3 to 4 times smaller. This pro-


vides a baseline for ideal model performance, and suggests that, despite the good per-
formance shown in sections 4.1 - 4.4, there are significant differences between real and
synthetic spectra and that there is, therefore, room for model improvement. Furthermore,
we also use this FD metric to show that the model performance decreases substantially
if we leave out the auto-encoder, or if we choose a signal representation other than spec-
trograms (see ablation studies in Appendix B).
We can also use the FD to compare the waveform generation for different
magnitude and distance bins (Figure 11a and supplementary Figure S42). The FDs
are systematically higher for larger magnitude and shorter distance recordings, indicat-
ing a poorer fit between real and GWM synthetic data. This may be due to the relative
scarcity of training data, and/or due to the inherently higher complexity of these records,
making them more challenging for a model to mimic.

                                 Fourier spectra FD ↓                    Classifier
Distribution                     East-West   North-South   Vertical     Accuracy (%) ↑   Embedding FD ↓
GWM synthetics vs real data      77.38       64.32         71.87        44.48            5.51
Test vs training data            19.80       19.70         21.47        57.67            0.24

Table 1: Fréchet Distances (FD) of the Fourier amplitude spectra, classifier accuracies
and FDs of classifier embeddings, between real data and GWM synthetics. The metrics
are computed between the real data and corresponding GWM synthetics (first row), and
between the training and test sets (second row).

[Figure 11 panels: (a) Fourier Spectra FD, (b) Classification Accuracy, (c) Embedding FD]

Figure 11: Fréchet Distance (FD) between real data and GWM synthetics in bins of
magnitude and hypocentral distance, for the Fourier amplitude spectra of the East-West
component seismograms (a), classifier accuracy on the GWM synthetics (b), and FD of
the classifier embeddings between real data and GWM synthetics, in bins of magnitude
and hypocentral distance (c).

4.5.2 Classification accuracy

Inspired by the common practice in image synthesis to evaluate the quality of gen-
erated images with a pre-trained classifier (Heusel et al., 2017), we adopt a similar ap-
proach: we train a classifier to categorize seismic data into bins of magnitude and dis-
tance. We divide our dataset into five magnitude and five distance bins, resulting in 25
classes, with each class containing a similar number of samples (Appendix A3). We then
train a classifier to predict the magnitude-distance bin for each record. For the classi-
fier, we use a convolutional neural network (CNN) architecture that, like the GWM, op-
erates on spectrogram representations of the waveforms (Appendix A1). The classifier


achieves a test accuracy of 57.67%, correctly predicting the magnitude-distance bin for
this fraction of real waveforms. When applied to synthetic waveforms, the classifier’s ac-
curacy is 44.48% (Table 1). Ideally, if synthetic waveforms were indistinguishable from
real ones, the classifier would maintain the same accuracy for both. Although not per-
fect, the classifier’s performance on synthetic waveforms strongly exceeds the 4% = 1/25
accuracy expected from random guessing, indicating that the synthetic waveforms en-
capsulate substantial information about first-order statistics, enabling the classifier to
make informed predictions. Figure 11b illustrates the accuracy achieved on the gener-
ated dataset across different magnitude-distance bins. Notably, some of the most diffi-
cult parameter ranges have high classification accuracy.
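The construction of the 25 magnitude-distance classes and the 4% chance level can be illustrated as follows. The actual binning is described in Appendix A3; here, bin edges at the 20%, 40%, 60%, and 80% quantiles of each variable are an assumption, and the uniformly distributed magnitudes and distances are hypothetical stand-ins for the catalog values.

```python
import numpy as np

def magnitude_distance_class(magnitude, distance, mag_edges, dist_edges):
    """Map each record to one class index from its magnitude bin and distance bin."""
    mag_bin = np.digitize(magnitude, mag_edges)    # 0 .. len(mag_edges)
    dist_bin = np.digitize(distance, dist_edges)   # 0 .. len(dist_edges)
    return mag_bin * (len(dist_edges) + 1) + dist_bin

# Hypothetical catalog values.
rng = np.random.default_rng(1)
magnitude = rng.uniform(4.5, 8.0, size=5000)
distance = rng.uniform(1.0, 180.0, size=5000)

# Assumed quantile-based edges: four edges per variable give five bins each.
quantiles = [0.2, 0.4, 0.6, 0.8]
mag_edges = np.quantile(magnitude, quantiles)
dist_edges = np.quantile(distance, quantiles)
labels = magnitude_distance_class(magnitude, distance, mag_edges, dist_edges)

# Accuracy of uniformly random guessing, for comparison with the 1/25 chance level.
random_guess = rng.integers(0, 25, size=labels.size)
chance_accuracy = float(np.mean(random_guess == labels))
```

A classifier's accuracy on real and on synthetic waveforms is then the fraction of records for which the predicted class index equals `labels`.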

4.5.3 Fréchet Distance of classifier embeddings

For similar input data, not only should the classifier’s performance be comparable,
but its internal representations should also align. Based on this intuition, we use the
FD to measure the similarity of the classifier’s hidden representations between the real
and synthetic waveforms. We collect the 256-dimensional hidden representations from
the penultimate layer of the classifier for both real and synthetic waveforms. Unlike the
FD of Fourier amplitude spectra (Section 4.5.1) where each dimension was treated in-
dependently, the reduced dimensionality of this representation (256 instead of 2033) al-
lows us to compute correlations between dimensions. Thus, we calculate the entire co-
variance matrix Σ of the hidden representations for both the real and generated wave-
forms. The Fréchet Distance D between the two sets of hidden representations is then
given by:
$$ D^2 = \lVert \mu_{\mathrm{obs}} - \mu_{\mathrm{gen}} \rVert^2 + \mathrm{Tr}\left( \Sigma_{\mathrm{obs}} + \Sigma_{\mathrm{gen}} - 2 (\Sigma_{\mathrm{obs}} \Sigma_{\mathrm{gen}})^{1/2} \right). \tag{14} $$
This is a generalization of the Fréchet Distance in Eq. 13 to general multivariate Gaussians
with full (non-diagonal) covariance matrices. In the image generation literature, this metric
is known as the Fréchet Inception Distance (FID) (Heusel et al., 2017), with "Inception"
referring to the classifier architecture employed.
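A NumPy-only sketch of Eq. 14 is given below. Instead of forming the matrix square root explicitly, it uses the fact that the trace of the square root of the product of two positive semi-definite covariance matrices equals the sum of the square roots of the eigenvalues of that product. The 8-dimensional random embeddings are hypothetical placeholders for the 256-dimensional classifier representations, and the function name is an assumption.

```python
import numpy as np

def frechet_distance_full(emb_obs, emb_gen):
    """Equation 14: Fréchet Distance between Gaussians with full covariance matrices."""
    mu_obs, mu_gen = emb_obs.mean(axis=0), emb_gen.mean(axis=0)
    cov_obs = np.cov(emb_obs, rowvar=False)
    cov_gen = np.cov(emb_gen, rowvar=False)
    # Tr((cov_obs cov_gen)^(1/2)) as the sum of square roots of the product's eigenvalues.
    eigenvalues = np.linalg.eigvals(cov_obs @ cov_gen)
    trace_sqrt = np.sum(np.sqrt(np.clip(eigenvalues.real, 0.0, None)))
    d2 = (np.sum((mu_obs - mu_gen) ** 2)
          + np.trace(cov_obs) + np.trace(cov_gen) - 2.0 * trace_sqrt)
    return float(np.sqrt(max(d2, 0.0)))  # clip tiny negative values from round-off

# Hypothetical embeddings: 500 samples in 8 dimensions.
rng = np.random.default_rng(0)
emb_obs = rng.normal(size=(500, 8))
emb_gen = emb_obs + 1.0  # identical covariance, mean shifted by 1 in every dimension
fd_identical = frechet_distance_full(emb_obs, emb_obs)
fd_shifted = frechet_distance_full(emb_obs, emb_gen)
```

For identical covariances, the trace term vanishes and the distance reduces to the Euclidean distance between the means, here the square root of 8.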
This FD of classifier embeddings is about 23 times larger between the GWM synthetics
and the real data than it is between test and training data (Table 1). When computed
separately for the magnitude-distance bins, we see again how the model performs
worse for larger magnitude records (Figure 11c), and how the presented model performs
significantly better than the ablated models evaluated in Appendix B.
In summary, all three metrics provide an objective, relative measure of synthesis
quality. As such, they can readily be used, e.g., to compare different proposed GWMs or model
versions, or to evaluate the model performance on a particular subset of the data. Importantly,
ideal lower bounds for the FD and ideal upper bounds for the classifier accuracy can
be estimated by computing the metrics using just real data (second row in Table 1).

5 Discussion
Generative Waveform Models (GWMs) are rapidly advancing and have the poten-
tial to significantly improve earthquake hazard assessment and earthquake engineering
studies (Florez et al., 2022; Esfahani et al., 2023; Shi et al., 2024; Matsumoto et al., 2024).
Unlike GMMs, which predict scalar ground motion metrics, GWMs can synthesize fully
realistic waveforms, complete with realistic frequency- and time-domain properties.
The ability to predict full waveforms enables studies that rely on waveform con-
tent, such as building response simulations (Bommer & Acevedo, 2004). Any scalar ground
motion metric can be derived from the predicted waveforms. A key advantage of this ap-
proach is that the waveforms and their derivatives are equally realistic across the entire
frequency range (1 - 50 Hz). This may contrast with hybrid methods, which add high-
frequency spectra in a separate second stage, either using stochastic methods (Saikia &


Somerville, 1997; Mai et al., 2010; Graves & Pitarka, 2010) or neural networks (Paolucci
et al., 2018; Gatti & Clouteau, 2020; Okazaki et al., 2021). The former can introduce
artifacts at the transition between the deterministic and stochastic parts of the predic-
tions (Mai & Beroza, 2003; Graves & Pitarka, 2010; Tang & Mai, 2023).
Another advantage of GWMs is their ability to accurately learn the correlation be-
tween scalar statistics of the same waveform (e.g., spectral accelerations at different fre-
quency ranges). This factor has been considered in only a few GMMs to date (Baker &
Bradley, 2017; Baker et al., 2021).
This study introduces a new GWM that builds on the recent successes of Denois-
ing Diffusion Models in image, audio, and video generation (Song & Ermon, 2019; Ho
et al., 2020; Song et al., 2021; Dhariwal & Nichol, 2021; Nichol & Dhariwal, 2021; Kong
et al., 2021; Rombach et al., 2022; Ho et al., 2022). Diffusion models are capable of gen-
erating high-resolution and diverse samples while being simple to implement and train.
Unlike GANs (Florez et al., 2022; Esfahani et al., 2023; Matsumoto et al., 2024) used
for synthesizing waveforms, denoising diffusion models do not suffer from mode collapse
or related training challenges. Additionally, no dedicated neural network architecture
is required, as with neural operators (Shi et al., 2024).

The proposed GWM operates on the spectrogram representation of waveform data
to address scaling issues associated with the high variance of seismic waveforms. It in-
corporates an autoencoder to compress these spectrograms into a more compact form,
thereby enhancing the efficiency of both the training and generation processes while si-
multaneously improving the resolved frequency of the generated waveforms. This effi-
ciency may also lead to better model performance in scenarios where data scarcity is a
concern, a common issue in many earthquake seismology problems. Moreover, it is im-
portant for applications where large numbers of forward computations are required, such
as in probabilistic seismic hazard assessment.
The proposed generative latent denoising diffusion model generates highly realis-
tic waveforms across the entire hazard-relevant frequency range. The GWM synthetics
have realistic time domain shapes (Figure 3a), Fourier amplitude spectra (Figure 3c),
and shaking durations (Figure 6). The predicted peak amplitude statistics are largely
unbiased (Figure 4) and have log-normally distributed amplitude variation (Figure 7),
with the same amount of variability as the real data (Figure 5).
With repeated inference for the same conditioning parameter sets, GWMs can be
used to predict distributions of scalar amplitude statistics, as is commonly done with
GMMs. We show that our model predicts peak amplitudes of strong motion seismic data
in Japan at least as accurately and as precisely as two prominent and adequate GMMs.
While GWM-based predictions are computationally more expensive than GMM com-
putations, they offer new possibilities for both practical and scientific applications. For
instance, next-generation seismic hazard models could include model branches contain-
ing a catalog of representative waveforms - rather than just amplitude statistics - which
are expected over the hazard target period. This would provide a much more detailed
description of the anticipated ground motion in a target region. It would also expand
the applicability of hazard calculations to use cases where full waveforms are required
or desirable, such as non-linear structural dynamic analyses.

GWMs could eventually eliminate the need to re-scale limited data pools of observed
seismic waveforms to match expected target spectra, or to use hybrid methods for en-
riching simulated waveforms with high-frequency seismic energy. Instead, a set of fully
realistic broadband waveforms can be generated from a single, self-consistent synthesis
process.


One important question in this context will be which among existing and future
GWMs generates the most realistic waveforms for a particular application. To facilitate
quantitative and representative model comparisons, we propose that the emerging GWM
community embrace open model and code standards and participate in existing commu-
nity efforts for the comparison and evaluation of ground motion models (e.g., Maechling
et al. (2015)). Performance metrics like the ones introduced in Section 4.5 could facil-
itate meaningful comparison between models and model versions. Such model compar-
ison efforts could also include blind signal classification exercises, where trained classi-
fier models would attempt to distinguish between real and synthetic waveforms.

5.1 This-Quake-Does-Not-Exist (’tqdne’) Python Library


For the model presented in this study, we introduce an openly available and user-
friendly Python library that can be used to generate waveforms using the pre-trained
GWM from this study or to train custom GWMs. Generating a three-component wave-
form with the pre-trained GWMs takes a fraction of a second on a standard personal com-
puter. The library facilitates saving of the waveforms in SeisBench format (Woollam et
al., 2022). The library’s name is inspired by the popular thispersondoesnotexist.com
(2023) application, which uses the StyleGAN algorithm (Karras et al., 2019, 2020) to gen-
erate human portrait images.

5.2 Limitations
While the presented GWM arguably achieves high seismic waveform synthesis performance, there are several limitations that future models can aspire to overcome.

Stochastic nature of the generated waveforms. Fundamentally, the generated waveforms are stochastic representations of real seismograms. There is no underlying physical model for wave excitation and propagation. Although the GWM synthetics exhibit
clear energy packets that closely resemble P-, S- and surface waves, they do not repre-
sent any wavefield phases in a deterministic sense.

Limited training data. As is the case for all models of strong ground motion, the limited number of short-distance recordings of large-magnitude earthquakes is a bottleneck. This limitation affects the synthesis performance in this crucial data regime. GWMs
can in principle be used to augment such data sets, but it is currently an open question
how well the models extrapolate beyond the parameter ranges for which they have been
trained, and how well they perform at the data-scarce edges of the parameter ranges.

Point source assumption. Our model assumes that the earthquake source is a point
source and neglects finite fault source characteristics such as fault geometry and distance,
source roughness, directivity, and unilateral or bilateral rupture modes.

Uncorrelated stations. The current model does not explicitly take into account the correlation of observations across different records of the same quake. Each generated waveform is an independent realization of the reverse (denoising) diffusion process. This may
lead to an underestimation of the correlation of observed ground motions from the same
quake, and might limit the ability of the model to generalize to new stations.

P-wave onset times. The current model has been trained with a data set of waveforms that have been aligned with a simple STA/LTA onset detector, which can be inaccurate. As a consequence, the GWM synthetics also have some variability in the P-wave onset times that is not physically meaningful.

Signal length. The 40-second-long seismograms are sufficient to describe the ground motion from quakes with magnitudes of up to ∼7.5. For even larger quakes, the source duration alone may exceed this signal length. Producing longer sequences without compromising temporal resolution presents some challenges, even for our efficient model. Addressing this issue may necessitate an approach with more favorable asymptotic behavior, which is a subject for future research.

Lower spectral amplitude. Our model slightly underestimates the spectral amplitude of the ground motion compared to the real data (Figures 3b and 4a). This discrepancy is observed exclusively in the model operating on the spectrogram representation. We hypothesize that this is due to the model generating a slightly blurred version of the encoded spectrograms, similar to the effect of a Gaussian filter. While this blurring is inconsequential for image generation tasks, as it is imperceptible to the human eye, it may result in a lower spectral amplitude (e.g., averaging 0.04 m/s² Hz⁻¹ at frequencies < 30 Hz; supplementary Figure S41). The underestimation may also stem partly from the spectrogram autoencoder (Figure B1c). A potential solution could involve incorporating additional loss terms, such as an adversarial loss, to encourage the model to generate sharper spectrograms. These could, for instance, be included in the autoencoder stage. Alternatively, exploring different, potentially smoother spectral representations that are less sensitive to blurring may also be beneficial.
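One possible mechanism behind the blurring hypothesis can be illustrated numerically. The sketch below is not the authors' analysis: it assumes the blur acts directly on a log-magnitude spectrogram (rather than on the latent encoding), uses a synthetic random spectrogram, and replaces the Gaussian filter with a wrap-around boxcar. Under these assumptions, smoothing preserves the mean log-amplitude but lowers the mean linear amplitude, because the exponential is convex (Jensen's inequality).

```python
import numpy as np

def circular_blur(image, radius=1):
    """Average each pixel with its neighbours along both axes (wrap-around boxcar)."""
    out = np.zeros_like(image)
    offsets = range(-radius, radius + 1)
    for dy in offsets:
        for dx in offsets:
            out += np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out / (2 * radius + 1) ** 2

rng = np.random.default_rng(0)
# Synthetic log-magnitude spectrogram (illustrative values, not real data).
log_spec = rng.normal(loc=-4.0, scale=1.0, size=(128, 128))
blurred = circular_blur(log_spec)

# The mean of the log-magnitudes is unchanged by the averaging filter ...
mean_log_before, mean_log_after = log_spec.mean(), blurred.mean()
# ... but the mean linear amplitude drops after blurring.
mean_amp_before = np.exp(log_spec).mean()
mean_amp_after = np.exp(blurred).mean()
print(mean_amp_before, mean_amp_after)
```

The size of the drop depends on how rough the log-spectrogram is, so the numbers printed here carry no meaning for the real model.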

6 Conclusion
We present a data-driven, conditional generative model for synthesizing three-component
strong motion seismograms. Our generative waveform model (GWM) combines a con-
volutional auto-encoder with a state-of-the-art latent denoising diffusion model, which
generates encoded - rather than raw - spectrogram representations of the seismic signals.
We trained the openly available model on Japanese strong motion data with hypocen-
tral distances of 1–180 km, moment magnitudes ≥ 4.5, and VS30 values of 76–2100 m/s.
Using a variety of commonly used and novel evaluation metrics, we demonstrate that the
GWM synthetics accurately capture the statistical properties of the observed data in both
the time and frequency domains, across a wide range of conditioning parameters, and
up to the highest hazard-relevant frequencies.

Furthermore, we systematically compare the peak ground motion statistics of the GWM synthetics to predictions from commonly used GMMs. The GWM predictions are
largely unbiased and exhibit the same level of amplitude variability as the real data. As
a result, they may be useful for practical applications, such as probabilistic seismic haz-
ard assessment and structural dynamic analyses.

With GWMs, hazard models can potentially expand their scope to include appli-
cations that require full waveform representations, rather than just scalar amplitude statis-
tics. Future community efforts to benchmark and compare GWMs would provide guid-
ance for which models to best use in practical and scientific applications, and may ac-
celerate GWM innovation.

Appendix A Generative model details


This section provides additional details on the model architectures and represen-
tations used in the experiments.

A1 Neural network architectures


All our models are based on the widely used U-Net architecture presented in Song
et al. (2021). The U-Net consists of three components which we denote left (encoder),
middle, and right (decoder). All components use several residual blocks that use two convolutional layers (2D for the spectrogram and 1D for the moving average envelope), with pre-layer group normalization and SiLU activation functions. In addition, the downsampling component uses a convolutional layer to reduce the dimensionality of an input between each pair of residual blocks, while the upsampling component uses upsampling operations to double the dimensionality of an input between each pair of residual blocks.
An overview of the architectures is given in Table A1.
Denoising Diffusion: The neural network for the diffusion model uses four resid-
ual blocks on encoder and decoder components, with an additional residual block in the
middle. After the first three levels, we include a downsampling operation on the encoder side and an upsampling operation on the decoder side. In addition, the central blocks incorporate a self-attention module. As per convention, conditioning information is injected within
each residual block by concatenating projections of the conditioning vector c and the time
embedding t to the intermediate representations of an input zt (see Figure A1 and Song
et al. (2021); Karras et al. (2022)).

[Figure A1 diagram. (a) Preprocessing of conditioning information: t and c each pass through FourierProjection and MLP blocks to form embedding_t. (b) Residual layer: inputs_t → LayerNormalization → SiLU → Conv2D; embedding_t → Linear → Reshape; then LayerNormalization → SiLU → Dropout → Conv2D → outputs_t.]

Figure A1: Using low-dimensional features for conditioning the diffusion model. (a) For
both the scalar value t and the four-dimensional feature vector c, we first compute 256-
dimensional Fourier features and embed both via separate MLP neural networks. The
two embeddings are combined by simply adding them elementwise. (b) For each residual
block, we condition the synthesized spectrogram (inputs_t) on the combined time-feature embedding by adding the embedding to the hidden representation of the residual layer. To do so, we take the 256-dimensional embedding vector, transform it through a linear layer with K output neurons, and reshape it to match the size of the hidden representation. Concretely, if the hidden representation has dimensionality N × H × W × K, where N is the batch size and H × W is the spectrogram size, we repeat the embedding N × H × W times, reshape the resulting tensor to match the dimensionality of the hidden representation, and add the hidden representation and conditioning information elementwise (deep learning libraries such as PyTorch do this efficiently).
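The conditioning path described in this caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the tqdne implementation: the random matrices stand in for learned parameters, the two-layer MLP depth and the sine/cosine form of the Fourier features are assumptions, and all function names are hypothetical. It also assumes each sample carries its own embedding, so the repetition is expressed as broadcasting over H and W.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 256  # embedding width stated in the caption

def fourier_features(x, freqs):
    """Map (N, D) inputs to (N, EMBED_DIM) sine/cosine features."""
    proj = 2.0 * np.pi * (x @ freqs)          # (N, EMBED_DIM // 2)
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)

def silu(x):
    return x / (1.0 + np.exp(-x))             # x * sigmoid(x)

def mlp(x, w1, w2):
    """Two-layer perceptron; the depth is an assumption."""
    return silu(x @ w1) @ w2

def random_mlp_weights():
    """Random stand-ins for learned MLP weights."""
    scale = 1.0 / np.sqrt(EMBED_DIM)
    return (rng.normal(size=(EMBED_DIM, EMBED_DIM)) * scale,
            rng.normal(size=(EMBED_DIM, EMBED_DIM)) * scale)

N, H, W, K = 2, 32, 32, 64                    # batch, height, width, channels
t = rng.uniform(size=(N, 1))                  # diffusion time, one scalar per sample
c = rng.normal(size=(N, 4))                   # four conditioning features per sample

# (a) Separate Fourier features and MLPs for t and c, combined by addition.
emb_t = mlp(fourier_features(t, rng.normal(size=(1, EMBED_DIM // 2))), *random_mlp_weights())
emb_c = mlp(fourier_features(c, rng.normal(size=(4, EMBED_DIM // 2))), *random_mlp_weights())
embedding = emb_t + emb_c                     # (N, 256)

# (b) Linear layer to K channels, then broadcast over H and W and add.
w_out = rng.normal(size=(EMBED_DIM, K)) / np.sqrt(EMBED_DIM)
hidden = rng.normal(size=(N, H, W, K))
conditioned = hidden + (embedding @ w_out)[:, None, None, :]
print(conditioned.shape)
```

Broadcasting avoids materializing the repeated tensor, which is why the elementwise addition is cheap in practice.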

Autoencoder: The autoencoder architecture comprises the same blocks as the denoising model but lacks self-attention modules and uses only three residual blocks. It uses downsampling and upsampling operations between each residual block. As a consequence, the autoencoder compresses the input by a factor of four in each spatial dimension.
Classifier: The classifier is a convolutional neural network consisting of four resid-
ual blocks, each followed by a downsampling operation. It includes a self-attention layer
at the end, followed by a global average pooling operation, an output multi-layer per-
ceptron (MLP), a linear layer, and a softmax activation function. When extracting em-
beddings from the classifier, we utilize the output of the MLP prior to the linear layer.
The classifier is trained on the spectrogram representation of the data.

                         Moving Average       Moving Average       Moving Average       Spectrogram          Spectrogram          Spectrogram          Classifier
Hyperparameter           Diffusion            Latent Diffusion     Latent Diffusion     Diffusion            Latent Diffusion     Latent Diffusion
                                              (Autoencoder)        (Diffusion Model)                         (Autoencoder)        (Diffusion Model)
Convolution Kernel Size  5                    5                    5                    3×3                  3×3                  3×3                  3×3
Hidden Channels          [64, 128, 256, 256]  [64, 128, 256]       [64, 128, 256, 256]  [64, 128, 256, 256]  [64, 128, 256]       [64, 128, 256, 256]  [64, 128, 256, 256]
Attention Levels         [4]                  -                    [4]                  [4]                  -                    [4]                  [4]
Dropout Rate             0.1                  0.1                  0.1                  0.1                  0.1                  0.1                  0.1
KL Weight                -                    10⁻⁶                 -                    -                    10⁻⁶                 -                    -
Optimizer                Adam                 Adam                 Adam                 Adam                 Adam                 Adam                 Adam
Learning Rate            10⁻⁴                 10⁻⁴                 10⁻⁴                 10⁻⁴                 10⁻⁴                 10⁻⁴                 10⁻⁴
EMA Decay                0.999                0.999                0.999                0.999                0.999                0.999                0.999
Batch Size               320                  64                   1536                 320                  64                   2048                 128
Epochs                   300                  200                  300                  300                  200                  300                  100

Table A1: Hyperparameters for the various models used in the experiments.

A2 Representations
We experiment with two different representations of the seismic data: spectrogram
and moving average envelope.
Spectrogram Representation: To transform each of the three channels in the
original waveform into a spectrogram, we utilize a Short-Time Fourier Transform (STFT)
with 256 frequency bins and a hop length of 32 samples. Due to the symmetry of the
spectrogram, only half of the frequency bins are used. To prevent padding issues, the
original waveform is truncated to 4064 samples, resulting in a complex-valued matrix of
size 128×128. We then take the magnitude of this matrix and apply a logarithmic trans-
formation to obtain the spectrogram, discarding the phase information due to its high-
frequency nature, which is challenging to model accurately. To reconstruct the original
waveform, we employ the Griffin-Lim algorithm (Griffin & Lim, 1984; Perraudin et al.,
2013), which reliably estimates the phase from the magnitude spectrogram.
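The shape arithmetic of the spectrogram representation can be checked with a short NumPy sketch. It is an illustration under stated assumptions rather than the library's code: the Hann window, the centered reflect padding, the choice to drop the Nyquist bin, and the small `eps` inside the logarithm are not specified in the text, and `log_spectrogram` is a hypothetical name.

```python
import numpy as np

def log_spectrogram(waveform, n_fft=256, hop=32, eps=1e-10):
    """Log-magnitude STFT of a 1D signal, returned as (frequency, time)."""
    # Centered frames: pad half a window on each side (assumed convention).
    padded = np.pad(waveform, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    window = np.hanning(n_fft)                          # assumed window
    frames = np.stack([padded[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)              # (n_frames, n_fft // 2 + 1)
    magnitude = np.abs(spectrum[:, : n_fft // 2])       # keep 128 bins (drops Nyquist)
    return np.log(magnitude + eps).T                    # phase is discarded

rng = np.random.default_rng(0)
channel = rng.standard_normal(4064)                     # one truncated waveform channel
spec = log_spectrogram(channel)
print(spec.shape)
```

With 4064 samples and a hop of 32, centered framing yields 4064 / 32 + 1 = 128 frames, which matches the 128×128 size quoted above.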

[Figure A2 panels: (a) Waveform, amplitude versus time [s]; (b) Spectrogram, magnitude and phase, frequency bins versus time bins.]

Figure A2: Seismic waveform and its corresponding spectrogram representation.

Moving Average Envelope Representation: The moving average envelope is computed by convolving the absolute waveform signal with an averaging boxcar filter of length 128. The final representation is the concatenation of the original waveform divided by the envelope and the logarithm of the envelope.
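A minimal sketch of the envelope representation follows. The boundary handling (`mode="same"`), the small `eps` that guards against division by zero, and stacking the two parts along a channel axis are assumptions; the function names are hypothetical.

```python
import numpy as np

def envelope_representation(waveform, length=128, eps=1e-8):
    """Return a (2, T) array: normalized waveform and log-envelope."""
    kernel = np.ones(length) / length                   # averaging boxcar filter
    envelope = np.convolve(np.abs(waveform), kernel, mode="same") + eps
    return np.stack([waveform / envelope, np.log(envelope)])

def invert_envelope_representation(rep):
    """Recover the waveform by multiplying the two parts back together."""
    return rep[0] * np.exp(rep[1])

rng = np.random.default_rng(0)
x = rng.standard_normal(4064) * 0.05
rep = envelope_representation(x)
x_rec = invert_envelope_representation(rep)
print(rep.shape)
```

The representation is exactly invertible, so any reconstruction error in this setting comes from the model rather than from the transform.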

A3 Classifier training data binning


When training the classifier to categorize data based on earthquake magnitude and
distance, we divide the data into five magnitude and five distance bins. Figure A3 dis-
plays the sample count in each bin, ensuring a balanced sample distribution across classes.
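The mapping from magnitude and distance to one of the 25 classes can be sketched with `np.digitize`. The bin edges below are illustrative assumptions: the magnitude edges follow the tick labels of Figure A3, while the distance edges are hypothetical placeholders, and the row-major class numbering is also an assumption.

```python
import numpy as np

# Illustrative bin edges (assumptions, see the lead-in above).
MAG_EDGES = np.array([4.5, 4.75, 5.0, 5.25, 6.0, 9.1])
DIST_EDGES = np.array([0.0, 50.0, 75.0, 100.0, 150.0, 200.0])  # hypothetical, in km

def class_index(magnitude, distance):
    """Combine a magnitude bin (0-4) and a distance bin (0-4) into a class (0-24)."""
    mag_bin = np.digitize(magnitude, MAG_EDGES[1:-1])   # inner edges give bins 0..4
    dist_bin = np.digitize(distance, DIST_EDGES[1:-1])
    return mag_bin * 5 + dist_bin

labels = class_index(np.array([4.6, 5.1, 7.0]), np.array([10.0, 80.0, 180.0]))
print(labels)
```

Using only the inner edges means values outside the outermost edges fall into the first or last bin instead of raising an error.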

[Figure A3 panels: (a) Dataset distribution, distance [km] versus magnitude; (b) Number of samples in each class, distance bin [km] versus magnitude bin, with per-class counts between 1505 and 13618.]

Figure A3: Binning of the data into classes for the classifier, based on magnitude and
distance.

Appendix B Ablation Studies


This section evaluates the significance of the three components in our proposed model.
The time-domain representation of seismic data shows significant amplitude variation, making direct processing of raw waveforms ineffective. Figure B1a compares the log-amplitude Fourier spectra of an autoencoder’s input and output when trained on raw waveforms. Note that this is not a generative model but an autoencoder tasked only with reconstructing its input. Even so, the waveforms are poorly reconstructed, particularly in the high-frequency range.
To address this, we explore alternative time-domain representations. We decom-
pose the signal into its positive envelope and residual signal, with the envelope being a
smoothed version of the absolute signal, as detailed in Section A2. Figure B1b shows that
an autoencoder trained on this representation performs better than one trained on raw
waveforms but still struggles with high-frequency components.
Third, we experiment with a spectrogram representation, specifically the log-transformed
magnitude of the short-time Fourier transform. This representation is smooth and ro-
bust to amplitude variations. Figure B1c demonstrates that an autoencoder trained on
this representation nearly perfectly reconstructs the original signal, with the exception
of a minor underestimation of the spectral amplitude, as discussed in Section 5. This highlights the effectiveness of the spectrogram representation for the task of ground motion synthesis.
Finally, we assess the performance of the diffusion model trained on different representations and the impact of incorporating the autoencoder stage. Table B1 summarizes our findings: at the same latent-space setting, the spectrogram representation substantially outperforms the envelope representation on every metric. Additionally, the autoencoder stage improves the spectral fit, reducing the Fourier spectra FD by roughly an order of magnitude for both representations. Overall, the latent diffusion approach with the spectrogram representation is the most effective configuration.

[Figure B1 panels: (a) Raw waveform, (b) Envelope, (c) Spectrogram; each panel plots predicted and target log-amplitude [m/s² Hz⁻¹] against frequency [Hz] from 0 to 50 Hz.]

Figure B1: Fourier spectra log-amplitude comparison between the input and output of
an autoencoder trained on different representations. The East-West component of the
three-channel signal is used for visualization.


                                  Fourier spectra FD ↓                    Classifier
Representation    Latent space    East-West    North-South    Vertical    Accuracy (%) ↑    Embedding FD ↓
Envelope          ✗               2288.64      2246.33        2185.99     8.39              320.90
Envelope          ✓               187.95       192.35         190.28      8.33              491.98
Spectrogram       ✗               828.41       780.76         867.67      45.82             12.38
Spectrogram       ✓               60.98        47.84          56.73       44.48             5.51

Table B1: Ablation study comparing the performance of the diffusion model when trained
on the moving average envelope and the spectrogram representation. Results are shown
for both direct training on representations and training on the latent space of an autoen-
coder. The Fréchet Distance (FD) for the log-amplitude Fourier spectra and classifier
embeddings is reported between the full data distribution and the generated samples.
Classifier accuracy is reported for the generated samples.
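For readers unfamiliar with the metric, the Fréchet Distance in Table B1 can be sketched as follows. This is a generic implementation under the usual assumption that both sample sets are summarized by a Gaussian (mean and covariance), as in Heusel et al. (2017); whether the authors' evaluation code matches it in every detail is not confirmed here, and `frechet_distance` is a hypothetical name.

```python
import numpy as np

def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to samples x and y of shape (n, d)."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # Tr((cov_x cov_y)^(1/2)) equals the sum of square roots of the eigenvalues
    # of cov_x @ cov_y, which are real and non-negative up to round-off.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))      # stand-in for log-amplitude spectra or embeddings
fd_same = frechet_distance(real, real)
fd_shifted = frechet_distance(real, real + 2.0)
print(fd_same, fd_shifted)
```

A pure mean shift leaves the covariance terms cancelling, so the distance reduces to the squared norm of the shift (here 8 × 2² = 32).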


Open Research Section


The three-component strong-motion waveforms of the Kiban-Kyoshin network (KiK-net) are provided by the National Research Institute for Earth Science and Disaster Prevention of Japan and can be downloaded at https://www.bosai.go.jp/e/index.html. The preprocessed data used to train the model are available upon request, owing to the K-Net and KiK-net dataset policy.

The code for our generative waveform model is available on GitHub at https://github.com/highfem/tqdne/tags (Bergmeister et al., 2024). All online pages were last accessed on October 16, 2024. The supplementary material provides additional information and figures to complement the main text, offering a deeper understanding and further validation of the presented results.

Declaration of Competing Interest


The authors declare that there are no conflicts of interest.

Acknowledgments
This work was supported by grant number C22-10 (HighFEM) of the Swiss Data Sci-
ence Center (SDSC), Ecole Polytechnique Fédérale de Lausanne and ETH Zürich awarded
to M-A. Meier, L. Ermert and M. Koroni. L. Ermert is supported by Swiss National Sci-
ence Foundation grant 209941. M. Koroni is supported by the Swiss Federal Nuclear Safety
Inspectorate (ENSI) under contract number CTR00830. We thank Donat Fäh and Paolo
Bergamo for useful discussions on ground motion models and earthquake engineering.
We thank CSCS Swiss National Computing Center (Piz Daint under projects sd28 and
s1165) and Swiss Seismological Service “Bigstar” Cluster for providing computational
resources for this research.

References
Applied Technology Council. (2009). Quantification of building seismic performance
factors. US Department of Homeland Security, FEMA.
Aquib, A. T., & Mai, P. M. (2024, 09). Broadband Ground-Motion Simulations
with Machine-Learning-Based High-Frequency Waves from Fourier Neural
Operators. Bulletin of the Seismological Society of America.
Arias, A. (1970). A measure of earthquake intensity. Seismic design for nuclear
plants, 438–483.
Baker, J., & Allin Cornell, C. (2006). Spectral shape, epsilon and record selection.
Earthquake Engineering & Structural Dynamics, 35 (9), 1077–1095.
Baker, J., & Bradley, B. (2017). Intensity measure correlations observed in the
nga-west2 database, and dependence of correlations on rupture and site param-
eters. Earthquake Spectra, 33 (1), 145–156.
Baker, J., Bradley, B., & Stafford, P. (2021). Seismic hazard and risk analysis. Cam-
bridge University Press.
Bayless, J., & Abrahamson, N. A. (2019). An empirical model for the interfrequency
correlation of epsilon for Fourier amplitude spectra. Bulletin of the Seismologi-
cal Society of America, 109 (3), 1058–1070.
Bergamo, P., Hammer, C., & Fäh, D. (2019). SERA WP7/NA5 - Deliverable 7.4:
Towards improvement of site condition indicators (Report). Zurich: ETH
Zurich. doi: 10.3929/ethz-b-000467564
Bergmeister, A., Palgunadi, K. H., Bosisio, A., Ermert, L., Koroni, M., Perraudin,
N., . . . Meier, M.-A. (2024). Software package ”tqdne” for paper titled ”High
Resolution Seismic Waveform Generation using Denoising Diffusion”. Zenodo.
doi: 10.5281/zenodo.13952381


Bommer, J., & Acevedo, A. (2004). The use of real earthquake accelerograms as
input to dynamic analysis. Journal of Earthquake Engineering, 8 (spec01), 43–
91.
Boore, D. M. (2003). Simulation of ground motion using the stochastic method.
Pure and applied geophysics, 160 , 635–676.
Boore, D. M., & Joyner, W. B. (1997). Site amplifications for generic rock sites. Bul-
letin of the seismological society of America, 87 (2), 327–341.
Boore, D. M., Stewart, J. P., Seyhan, E., & Atkinson, G. M. (2014). Nga-west2
equations for predicting pga, pgv, and 5% damped psa for shallow crustal
earthquakes. Earthquake Spectra, 30 (3), 1057–1085.
Boore, D. M., Watson-Lamprey, J., & Abrahamson, N. A. (2006). Orientation-
independent measures of ground motion. Bulletin of the seismological Society
of America, 96 (4A), 1502–1511.
Chopra, A. K. (2007). Dynamics of structures. Pearson Education India.
Défossez, A., Copet, J., Synnaeve, G., & Adi, Y. (2023). High fidelity neural audio
compression. Transactions on Machine Learning Research.
Derras, B., Bard, P.-Y., Cotton, F., & Bekkouche, A. (2012). Adapting the neural
network approach to pga prediction: An example based on the kik-net data.
Bulletin of the Seismological Society of America, 102 (4), 1446–1461.
Dhariwal, P., & Nichol, A. (2021). Diffusion models beat gans on image synthesis.
Advances in neural information processing systems, 34 , 8780–8794.
Douglas, J. (2003). Earthquake ground motion estimation using strong-motion
records: a review of equations for the estimation of peak ground acceleration
and response spectral ordinates. Earth-Science Reviews, 61 (1-2), 43–104.
Douglas, J., & Aochi, H. (2008). A survey of techniques for predicting earthquake
ground motions for engineering purposes. Surveys in geophysics, 29 , 187–220.
Esfahani, R. D., Cotton, F., Ohrnberger, M., & Scherbaum, F. (2023). TFCGAN:
Nonstationary ground-motion simulation in the time-frequency domain using
conditional generative adversarial network (cgan) and phase retrieval methods.
Bulletin of the Seismological Society of America, 113 (1), 453–467.
Esfahani, R. D., Vogel, K., Cotton, F., Ohrnberger, M., Scherbaum, F., &
Kriegerowski, M. (2021). Exploring the dimensionality of ground-motion
data by applying autoencoder techniques. Bulletin of the Seismological Society
of America, 111 (3), 1563–1576.
Florez, M. A., Caporale, M., Buabthong, P., Ross, Z. E., Asimaki, D., & Meier, M.-
A. (2022). Data-driven synthesis of broadband earthquake ground motions
using artificial intelligence. Bulletin of the Seismological Society of America,
112 (4), 1979–1996.
Gatti, F., & Clouteau, D. (2020). Towards blending physics-based numerical sim-
ulations and seismic databases using generative adversarial network. Computer
Methods in Applied Mechanics and Engineering, 372 , 113421.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
. . . Bengio, Y. (2014). Generative adversarial nets. In Advances in neural
information processing systems.
Graves, R., & Pitarka, A. (2010). Broadband ground-motion simulation using a
hybrid approach. Bull Seismol Soc Am, 100 (5A), 2095–2123. doi: 10.1785/
0120100057
Griffin, D., & Lim, J. (1984). Signal estimation from modified short-time Fourier
transform. IEEE Transactions on Acoustics, Speech, and Signal Processing,
32 (2), 236–243.
Hartzell, S., Harmsen, S., Frankel, A., & Larsen, S. (1999). Calculation of broad-
band time histories of ground motion: Comparison of methods and validation
using strong-ground motion from the 1994 northridge earthquake. Bulletin of
the Seismological Society of America, 89 (6), 1484–1504.
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., & Hochreiter, S. (2017).
GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in neural information processing systems.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. In
Advances in neural information processing systems.
Ho, J., Salimans, T., Gritsenko, A., Chan, W., Norouzi, M., & Fleet, D. J. (2022).
Video diffusion models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave,
K. Cho, & A. Oh (Eds.), Advances in neural information processing systems
(Vol. 35, pp. 8633–8646). Curran Associates, Inc.
Jayalakshmi, S., Dhanya, J., Raghukanth, S., & Mai, P. M. (2021). Hybrid broad-
band ground motion simulations in the indo-gangetic basin for great himalayan
earthquake scenarios. Bulletin of Earthquake Engineering, 19 , 3319–3348.
Jozinović, D., Lomax, A., Štajduhar, I., & Michelini, A. (2022). Transfer learning:
Improving neural network based prediction of earthquake ground shaking for
an area with insufficient training data. Geophysical Journal International ,
229 (1), 704–718.
Kanno, T., Narita, A., Morikawa, N., Fujiwara, H., & Fukushima, Y. (2006). A new
attenuation relation for strong ground motion in japan based on recorded data.
Bulletin of the Seismological Society of America, 96 (3), 879–897.
Karras, T., Aittala, M., Aila, T., & Laine, S. (2022). Elucidating the design space of
diffusion-based generative models. In Advances in neural information process-
ing systems.
Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for
generative adversarial networks. In Proceedings of the ieee/cvf conference on
computer vision and pattern recognition (pp. 4401–4410).
Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020).
Analyzing and improving the image quality of stylegan. In Proceedings of
the ieee/cvf conference on computer vision and pattern recognition (pp. 8110–
8119).
Katsanos, E., Sextos, A., & Manolis, G. (2010). Selection of earthquake ground
motion records: A state-of-the-art review from a structural engineering per-
spective. Soil dynamics and earthquake engineering, 30 (4), 157–169.
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In Interna-
tional conference on learning representations.
Kong, Z., Ping, W., Huang, J., Zhao, K., & Catanzaro, B. (2021). DiffWave: A ver-
satile diffusion model for audio synthesis. In International conference on learn-
ing representations.
Li, Y., Ku, B., Zhang, S., Ahn, J.-K., & Ko, H. (2020). Seismic data augmentation
based on conditional generative adversarial networks. Sensors, 20 (23), 6850.
Li, Z., Meier, M.-A., Hauksson, E., Zhan, Z., & Andrews, J. (2018). Machine learn-
ing seismic wave discrimination: Application to earthquake early warning.
Geophysical Research Letters, 45 (10), 4773–4779.
Lilienkamp, H., von Specht, S., Weatherill, G., Caire, G., & Cotton, F. (2022).
Ground-motion modeling as an image processing task: Introducing a neural
network based, fully data-driven, and nonergodic approach. Bulletin of the
Seismological Society of America, 112 (3), 1565–1582.
Luco, N., & Bazzurro, P. (2007). Does amplitude scaling of ground motion records
result in biased nonlinear structural drift responses? Earthquake Engineering
& Structural Dynamics, 36 (13), 1813–1835.
Maechling, P. J., Silva, F., Callaghan, S., & Jordan, T. H. (2015). Scec broadband
platform: System architecture and software implementation. Seismological Re-
search Letters, 86 (1), 27–38.
Mai, P. M., & Beroza, G. (2002). A spatial random field model to characterize
complexity in earthquake slip. Journal of Geophysical Research: Solid Earth,
107 (B11), ESE–10.
Mai, P. M., & Beroza, G. (2003). A hybrid method for calculating near-source,
broadband seismograms: Application to strong motion prediction. Physics of
the Earth and Planetary Interiors, 137 (1-4), 183–199.
Mai, P. M., Imperatori, W., & Olsen, K. B. (2010). Hybrid broadband ground-
motion simulations: Combining long-period deterministic synthetics with
high-frequency multiple S-to-S backscattering. Bull Seismol Soc Am, 100 (5A),
2124–2142. doi: 10.1785/0120080194
Marano, G. C., Rosso, M. M., Aloisio, A., & Cirrincione, G. (2024). Generative ad-
versarial networks review in earthquake-related engineering fields. Bulletin of
Earthquake Engineering, 22 (7), 3511–3562.
Matsumoto, Y., Yaoyama, T., Lee, S., Hida, T., & Itoi, T. (2024). Generative
Adversarial Networks-Based Ground-Motion Model for Crustal Earthquakes
in Japan Considering Detailed Site Conditions. Bulletin of the Seismological
Society of America.
Nichol, A. Q., & Dhariwal, P. (2021). Improved denoising diffusion probabilistic
models. In International conference on machine learning (pp. 8162–8171).
NIED. (2019). K-net, kik-net, national research institute for earth science and disas-
ter resilience. doi: 10.17598/NIED.0004
Okazaki, T., Hachiya, H., Iwaki, A., Maeda, T., Fujiwara, H., & Ueda, N. (2021).
Simulation of broad-band ground motions with consistent long-period and
short-period components using the wasserstein interpolation of acceleration
envelopes. Geophysical Journal International , 227 (1), 333–349.
Olsen, K., & Takedatsu, R. (2015). The sdsu broadband ground-motion generation
module bbtoolbox version 1.5. Seismological Research Letters, 86 (1), 81–88.
Palgunadi, K. H., Gabriel, A.-A., Garagash, D. I., Ulrich, T., & Mai, P. M. (2024).
Rupture dynamics of cascading earthquakes in a multiscale fracture network.
Journal of Geophysical Research: Solid Earth, 129 (3), e2023JB027578.
Paolucci, R., Gatti, F., Infantino, M., Smerzini, C., Özcebe, A. G., & Stupazzini, M.
(2018). Broadband ground motions from 3d physics-based numerical simula-
tions using artificial neural networks. Bulletin of the Seismological Society of
America, 108 (3A), 1272–1286.
Paolucci, R., Mazzieri, I., Piunno, G., Smerzini, C., Vanini, M., & Özcebe,
A. (2021). Earthquake ground motion modeling of induced seismicity in the
Groningen gas field. Earthquake Engineering & Structural Dynamics, 50 (1),
135–154. Retrieved 2024-10-17, from
https://onlinelibrary.wiley.com/doi/abs/10.1002/eqe.3367 doi: 10.1002/eqe.3367
Paolucci, R., Smerzini, C., & Vanini, M. (2021, July). BB-SPEEDset: A Validated
Dataset of Broadband Near-Source Earthquake Ground Motions from 3D Physics-Based
Numerical Simulations. Bulletin of the Seismological Society of America,
111 (5), 2527–2545. Retrieved 2024-10-15, from
https://doi.org/10.1785/0120210089 doi: 10.1785/0120210089
Perraudin, N., Balazs, P., & Søndergaard, P. L. (2013). A fast Griffin-Lim algo-
rithm. In 2013 ieee workshop on applications of signal processing to audio and
acoustics (pp. 1–4).
Rezende, D., & Mohamed, S. (2015). Variational inference with normalizing flows.
In International conference on machine learning.
Rodgers, A. J., Pitarka, A., Pankajakshan, R., Sjögreen, B., & Petersson, N. A.
(2020). Regional-scale 3d ground-motion simulations of mw 7 earthquakes on
the hayward fault, northern california resolving frequencies 0–10 hz and includ-
ing site-response corrections. Bulletin of the Seismological Society of America,
110 (6), 2862–2881.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-
resolution image synthesis with latent diffusion models. In Proceedings of the
ieee/cvf conference on computer vision and pattern recognition.


Saikia, C. K., & Somerville, P. (1997). Simulated hard-rock motions in saint louis,
missouri, from large new madrid earthquakes (mw≥ 6.5). Bulletin of the Seis-
mological Society of America, 87 (1), 123–139.
Savran, W., & Olsen, K. (2019). Ground motion simulation and validation of the
2008 chino hills earthquake in scattering media. Geophysical Journal Interna-
tional , 219 (3), 1836–1850.
Shi, Y., Lavrentiadis, G., Asimaki, D., Ross, Z. E., & Azizzadenesheli, K. (2024).
Broadband ground-motion synthesis via generative adversarial neural oper-
ators: Development and validation. Bulletin of the Seismological Society of
America, 114 (4), 2151–2171.
Smerzini, C., Amendola, C., Paolucci, R., & Bazrafshan, A. (2024, February).
Engineering validation of BB-SPEEDset, a data set of near-source physics-based
simulated accelerograms. Earthquake Spectra, 40 (1), 420–445. Retrieved
2024-10-15, from http://journals.sagepub.com/doi/10.1177/87552930231206766
doi: 10.1177/87552930231206766
Song, Y., & Ermon, S. (2019). Generative modeling by estimating gradients of the
data distribution. In Advances in neural information processing systems.
Song, Y., Sohl-Dickstein, J., Kingma, D. P., Kumar, A., Ermon, S., & Poole, B.
(2021). Score-based generative modeling through stochastic differential equa-
tions. In International conference on learning representations.
Tang, Y., & Mai, P. M. (2023). Stochastic ground-motion simulation of the 2021
Mw 5.9 Woods Point earthquake: Facilitating local probabilistic seismic hazard
analysis in Australia. Bulletin of the Seismological Society of America, 113 (5),
2119–2143.
thispersondoesnotexist.com. (2023). This person does not exist.
https://thispersondoesnotexist.com. (Accessed: 2024-10-16)
Touhami, S., Gatti, F., Lopez-Caballero, F., Cottereau, R., de Abreu Corrêa, L.,
Aubry, L., & Clouteau, D. (2022). Sem3d: A 3d high-fidelity numerical
earthquake simulator for broadband (0–10 hz) seismic response prediction at a
regional scale. Geosciences, 12 (3), 112.
van Ede, M. C., Molinari, I., Imperatori, W., Kissling, E., Baron, J., & Morelli, A.
(2020). Hybrid broadband seismograms for seismic shaking scenarios: An ap-
plication to the po plain sedimentary basin (northern italy). Pure and Applied
Geophysics, 177 (5), 2181–2198.
Vincent, P. (2011). A connection between score matching and denoising autoen-
coders. Neural Computation, 23 (7), 1661–1674.
Wang, T., Trugman, D., & Lin, Y. (2021). Seismogen: Seismic waveform synthesis
using gan with application to seismic data augmentation. Journal of Geophysi-
cal Research: Solid Earth, 126 (4), e2020JB020077.
Woollam, J., Münchmeyer, J., Tilmann, F., Rietbrock, A., Lange, D., Bornstein, T.,
... others (2022). Seisbench—a toolbox for machine learning in seismology.
Seismological Research Letters, 93 (3), 1695–1709.


Supporting Information for "High Resolution Seismic Waveform Generation using
Denoising Diffusion"

The content of the supplementary figures is listed as follows:

1. Figures S1 to S42

Introduction
This supplementary material provides additional information and figures to com-
plement the main content of the primary text. The aim is to offer a deeper understand-
ing and further validate the presented results. This document includes visual represen-
tations that support and enhance the findings discussed in the main text.
Additional figures compare evaluation metrics of the generative waveform model (GWM)
with those of the real data for different bins of magnitude, hypocentral distance,
faulting type, and VS30:

1. Time-domain signal envelopes.
2. Fourier spectral amplitudes.
3. Shaking duration statistics for all magnitudes.
4. Pseudo-spectral acceleration for periods T = 0.1, 0.3, and 1.0 s versus distance.
5. Pseudo-spectral acceleration versus magnitude.
6. Pseudo-spectral acceleration versus VS30.
7. Average model probabilities across magnitude and distance bins.
8. Residual of the spectral mean amplitude.

Figures:
[Figure S1: heatmap of sample counts. x-axis: Magnitude bin (edges 4.5, 5, 5.5, 6,
6.5, 7, 7.5); y-axis: Distance bin [km] (ticks 40, 60, 80, 120, 150, 200); color
scale: 0 to 35000. Cell counts, row by row as extracted:
17108 13715 6524 2921 1263 291
25902 16449 6887 2475 1050 200
35490 19040 6977 2223 917 191
11935 5797 2122 633 297 48
6394 2902 1060 358 169 22
3737 1282 510 156 78 19]

Figure S1: Number of samples in each magnitude-distance bin for all of the following bin
plots. Predicted and target denote the generative waveform model (GWM) and real data.
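
The counts in Figure S1 amount to a two-dimensional histogram over magnitude and
distance. The following is a minimal sketch of such a tally, not the authors' code.
The bin edges are read off the figure axes, the lower distance edge of 0 km is assumed
from the "0-40 km" panel label of Figure S2, the half-open bin convention is an
assumption, and the function names and records are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Bin edges read off the axes of Figure S1. The 0 km lower distance edge and
# the half-open convention [low, high) are assumptions made for this sketch.
MAG_EDGES = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
DIST_EDGES_KM = [0.0, 40.0, 60.0, 80.0, 120.0, 150.0, 200.0]


def bin_index(value, edges):
    """Return i such that edges[i] <= value < edges[i + 1], or None if outside."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def count_per_bin(records):
    """Count (magnitude, distance_km) records per (distance bin, magnitude bin)."""
    counts = Counter()
    for magnitude, distance_km in records:
        i = bin_index(distance_km, DIST_EDGES_KM)
        j = bin_index(magnitude, MAG_EDGES)
        if i is not None and j is not None:
            counts[(i, j)] += 1
    return counts


# Hypothetical records for illustration only (not taken from the data set).
records = [(4.7, 12.0), (4.9, 35.0), (5.2, 55.0), (6.8, 190.0), (8.0, 10.0)]
counts = count_per_bin(records)
```

Records outside the binned range, such as the hypothetical M 8.0 event above, are
skipped rather than clipped into the outermost bin.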


[Figure S2: two columns of panels (Predicted, Target), one row per distance bin
(0-40, 40-60, 60-80, 80-120, 120-150, and 150-200 km). x-axis: Time [s] (0 to 40);
y-axis: Log-Amplitude [m/s²]; legend: Magnitude bins 4.5-5, 5-5.5, 5.5-6, 6-6.5,
6.5-7, 7-7.5.]

Figure S2: Distribution of time-domain envelopes for East-West-component seismograms
in different magnitude and distance bins. Predicted and target denote the generative
waveform model (GWM) and real data.


[Figure S3: two columns of panels (Predicted, Target), one row per distance bin
(0-40, 40-60, 60-80, 80-120, 120-150, and 150-200 km). x-axis: Time [s] (0 to 40);
y-axis: Log-Amplitude [m/s²]; legend: Magnitude bins 4.5-5, 5-5.5, 5.5-6, 6-6.5,
6.5-7, 7-7.5.]

Figure S3: Distribution of time-domain envelopes for North-South-component seismograms
in different magnitude and distance bins. Predicted and target denote the generative
waveform model (GWM) and real data.


[Figure S4: two columns of panels (Predicted, Target), one row per distance bin
(0-40, 40-60, 60-80, 80-120, 120-150, and 150-200 km). x-axis: Time [s] (0 to 40);
y-axis: Log-Amplitude [m/s²]; legend: Magnitude bins 4.5-5, 5-5.5, 5.5-6, 6-6.5,
6.5-7, 7-7.5.]

Figure S4: Distribution of time-domain envelopes for vertical-component seismograms in
different magnitude and distance bins. Predicted and target denote the generative
waveform model (GWM) and real data.


[Figure S5: two columns of panels (Predicted, Target), one row per distance bin
(0-40, 40-60, 60-80, 80-120, 120-150, and 150-200 km). x-axis: Frequency [Hz]
(0 to 50); y-axis: Log-Amplitude [m/s² Hz⁻¹]; legend: Magnitude bins 4.5-5, 5-5.5,
5.5-6, 6-6.5, 6.5-7, 7-7.5.]

Figure S5: Distribution of Fourier spectra log-amplitudes for East-West-component
seismograms in different magnitude and distance bins. Predicted and target denote the
generative waveform model (GWM) and real data.


[Figure S6: two columns of panels (Predicted, Target), one row per distance bin
(0-40, 40-60, 60-80, 80-120, 120-150, and 150-200 km). x-axis: Frequency [Hz]
(0 to 50); y-axis: Log-Amplitude [m/s² Hz⁻¹]; legend: Magnitude bins 4.5-5, 5-5.5,
5.5-6, 6-6.5, 6.5-7, 7-7.5.]

Figure S6: Distribution of Fourier spectra log-amplitudes for North-South-component
seismograms in different magnitude and distance bins. Predicted and target denote the
generative waveform model (GWM) and real data.


[Figure S7: two columns of panels (Predicted, Target), one row per distance bin
(0-40, 40-60, 60-80, 80-120, 120-150, and 150-200 km). x-axis: Frequency [Hz]
(0 to 50); y-axis: Log-Amplitude [m/s² Hz⁻¹]; legend: Magnitude bins 4.5-5, 5-5.5,
5.5-6, 6-6.5, 6.5-7, 7-7.5.]

Figure S7: Distribution of Fourier spectra log-amplitudes for vertical-component
seismograms in different magnitude and distance bins. Predicted and target denote the
generative waveform model (GWM) and real data.


Figure S8: Shaking duration for all GWM synthetics (red triangles), using one
realization, and real data (grey circles) with corresponding conditioning parameters.
For each magnitude bin (every 0.08) from M 4.5 to 9.0, grey dots and lines show the
mean and standard deviation of the real data, while blue triangles and lines show the
mean and standard deviation of the GWM synthetics.
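
The caption of Figure S8 does not restate how shaking duration is measured. One widely
used measure is the significant duration, the time between 5% and 95% of the cumulative
squared acceleration. The sketch below assumes that definition, which may differ from
the one used for the figure; the function name and the boxcar test record are
hypothetical.

```python
def significant_duration(acc, dt, lower=0.05, upper=0.95):
    """Time between the `lower` and `upper` fractions of cumulative squared amplitude."""
    cumulative = []
    total = 0.0
    for a in acc:
        total += a * a
        cumulative.append(total)
    if total == 0.0:
        return 0.0
    # First sample at which the cumulative energy reaches each threshold.
    i_lower = next(i for i, c in enumerate(cumulative) if c >= lower * total)
    i_upper = next(i for i, c in enumerate(cumulative) if c >= upper * total)
    return (i_upper - i_lower) * dt


# Hypothetical record: 10 s of silence, 20 s of constant shaking, 10 s of silence.
dt = 0.01
acc = [0.0] * 1000 + [1.0] * 2000 + [0.0] * 1000
duration = significant_duration(acc, dt)
```

For this boxcar record the energy accumulates linearly over the 20 s of shaking, so the
5%-95% window covers 90% of it, about 18 s.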

Figure S9: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% versus
hypocentral distance for periods (T) of 0.1 s and 1.0 s. Median prediction of the GWM
(black line) and standard deviation (yellow shaded area), along with median prediction
(solid lines) and standard deviation (dashed lines) of the Boore et al. (2014) GMM
(violet), and the Kanno et al. (2006) GMM (red), using M5 and VS30 = 150 m/s. The data
are sampled from narrow magnitude and VS30 bins, as written in the figure titles, and
shown by their median (green squares) and standard deviations (green lines).
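
RotD50 is the median, over all horizontal rotation angles, of the spectral
acceleration of the rotated horizontal component. The following is a minimal sketch of
that computation, not the authors' implementation. It assumes 1° rotation steps and a
unit-mass oscillator integrated with the Newmark average-acceleration scheme; the
function names and the input record are hypothetical.

```python
import math
from statistics import median


def sdof_displacement(acc, dt, period, damping=0.05):
    """Relative displacement of a unit-mass oscillator (Newmark average acceleration)."""
    omega = 2.0 * math.pi / period
    k, c = omega ** 2, 2.0 * damping * omega
    k_hat = k + 2.0 * c / dt + 4.0 / dt ** 2
    u, v = 0.0, 0.0
    a = -acc[0]  # initial relative acceleration for u = v = 0
    history = [u]
    for i in range(len(acc) - 1):
        dp = -(acc[i + 1] - acc[i])  # increment of the effective load -a_g
        dp_hat = dp + (4.0 / dt + 2.0 * c) * v + 2.0 * a
        du = dp_hat / k_hat
        dv = 2.0 * du / dt - 2.0 * v
        da = 4.0 * (du - dt * v) / dt ** 2 - 2.0 * a
        u, v, a = u + du, v + dv, a + da
        history.append(u)
    return history


def rotd50_sa(acc_ew, acc_ns, dt, period, damping=0.05):
    """Median over rotation angles of the pseudo-spectral acceleration."""
    omega = 2.0 * math.pi / period
    # The oscillator is linear, so the component responses can be rotated directly.
    u_ew = sdof_displacement(acc_ew, dt, period, damping)
    u_ns = sdof_displacement(acc_ns, dt, period, damping)
    peaks = []
    for degrees in range(180):
        theta = math.radians(degrees)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        peak = max(abs(x * cos_t + y * sin_t) for x, y in zip(u_ew, u_ns))
        peaks.append(omega ** 2 * peak)
    return median(peaks)


# Hypothetical record: decaying 2 Hz sinusoid on East-West, nothing on North-South.
dt, period = 0.01, 1.0
acc_ew = [math.sin(2.0 * math.pi * 2.0 * i * dt) * math.exp(-i * dt) for i in range(1000)]
acc_ns = [0.0] * len(acc_ew)
sa_ew = (2.0 * math.pi / period) ** 2 * max(abs(u) for u in sdof_displacement(acc_ew, dt, period))
ratio = rotd50_sa(acc_ew, acc_ns, dt, period) / sa_ew
```

For motion polarized along a single axis, the rotated peak scales with |cos θ|, so the
ratio above comes out close to cos 45° ≈ 0.707, which is a convenient sanity check.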


Figure S10: Same as Figure S9 but for the magnitude bin M5 and VS30 = 600 m/s.

Figure S11: Same as Figure S9 but for the magnitude bin M5.5 and VS30 = 150 m/s.


Figure S12: Same as Figure S9 but for the magnitude bin M5.5 and VS30 = 600 m/s.

Figure S13: Same as Figure S9 but for the magnitude bin M6.0 and VS30 = 150 m/s.


Figure S14: Same as Figure S9 but for the magnitude bin M6.0 and VS30 = 600 m/s.

Figure S15: Same as Figure S9 but for the magnitude bin M6.5 and VS30 = 150 m/s.


Figure S16: Same as Figure S9 but for the magnitude bin M6.5 and VS30 = 600 m/s.

Figure S17: Same as Figure S9 but for the magnitude bin M7.0 and VS30 = 150 m/s.


Figure S18: Same as Figure S9 but for the magnitude bin M7.0 and VS30 = 600 m/s.

Figure S19: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% ver-
sus magnitude. Median of the GWM prediction (black lines) and its standard deviation
(yellow shaded areas), using R = 15 km. Panels a) and c) show RotD50 pseudo-spectral
acceleration for VS30 = 150 m/s at periods of 0.1 s and 1.0 s, respectively. Panels b) and
d) show RotD50 pseudo-spectral acceleration for VS30 = 600 m/s at periods of 0.1 s and
1.0 s, respectively. The data (grey dots) are sampled from narrow magnitude, R, and VS30
bins, as written in the figure titles.


Figure S20: Same as Figure S19 but with a distance bin of 40 km.

Figure S21: Same as Figure S19 but with a distance bin of 60 km.


Figure S22: Same as Figure S19 but with a distance bin of 80 km.

Figure S23: Same as Figure S19 but with a distance bin of 100 km.


Figure S24: Same as Figure S19 but with a distance bin of 130 km.

Figure S25: RotD50 pseudo-spectral acceleration (SA) with a damping factor of 5% versus
VS30. Median of the GWM prediction (black lines) and its standard deviation (yellow
shaded areas), using R = 15 km. Panels a) and d) show RotD50 pseudo-spectral
acceleration for M 5 at periods of 0.1 s and 1.0 s, respectively. Panels b) and e)
show RotD50 pseudo-spectral acceleration for M 5.5 at periods of 0.1 s and 1.0 s,
respectively. Panels c) and f) show RotD50 pseudo-spectral acceleration for M 6.0
at periods of 0.1 s and 1.0 s, respectively. The data (grey dots) are sampled from
narrow magnitude and R bins, as written in the figure titles.


Figure S26: Same as Figure S25 but with a distance bin of 40 km.

Figure S27: Same as Figure S25 but with a distance bin of 60 km.

Figure S28: Same as Figure S25 but with a distance bin of 80 km.


Figure S29: Same as Figure S25 but with a distance bin of 100 km.

Figure S30: Same as Figure S25 but with a distance bin of 130 km.


Figure S31: Average model probabilities, given the SA data, of the ground motion model
(GMM) by Boore et al. (2014) and of the generative waveform model (GWM), and the ratio
between the two distributions given the data, as a function of magnitude and recording
distance for VS30 = 240 m/s. Panels a), b), and c) show the model likelihoods and their
ratios at T = 0.1 s, T = 0.3 s, and T = 1.0 s, respectively.

Figure S32: Same as Figure S31 but for the GMM of Kanno et al. (2006) for VS30 =
240 m/s. Panels a), b), and c) show the model likelihoods and their ratios at T = 0.1 s,
T = 0.3 s, and T = 1.0 s, respectively.


Figure S33: Same as Figure S31 but for VS30 = 520 m/s.

Figure S34: Same as Figure S32 but for VS30 = 520 m/s.


Figure S35: Same as Figure S31 but for VS30 = 800 m/s.

Figure S36: Same as Figure S32 but for VS30 = 800 m/s.


Figure S37: Same as Figure S31 but for VS30 = 1080 m/s.

Figure S38: Same as Figure S32 but for VS30 = 1080 m/s.


Figure S39: Same as Figure S31 but for VS30 = 1360 m/s.

Figure S40: Same as Figure S32 but for VS30 = 1360 m/s.


Figure S41: Residual of the spectral mean amplitude between real data and GWM for all
records.

[Figure S42: three heatmaps, (a) East-West, (b) North-South, (c) Vertical. x-axis:
Magnitude bin (ticks 4.5, 4.75, 5, 5.25, 6, 9.1); y-axis: Distance bin [km] (ticks 75,
100, 125, 150, 200). Cell values, row by row as extracted:
(a) East-West: 38 26 32 42 121; 45 24 54 60 164; 30 34 48 78 238; 63 74 86 105 203;
77 117 202 162 379
(b) North-South: 28 25 29 30 90; 35 26 45 45 130; 24 32 42 59 187; 46 63 71 82 167;
62 98 164 129 326
(c) Vertical: 36 29 34 34 99; 43 31 49 54 166; 31 39 50 76 244; 57 76 78 90 195;
76 119 169 155 372]

Figure S42: Log-amplitude Fourier spectra Fréchet Distance heatmaps for all three
components in different magnitude and distance bins.
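
A Fréchet distance between two sets of spectra is commonly computed by fitting a
Gaussian to each set and comparing means and covariances. The sketch below is a
simplified illustration under a stated assumption: it treats frequency bins as
independent (diagonal covariance), in which case the squared distance reduces to the
sum over bins of (μ1 − μ2)² + (σ1 − σ2)². The metric used for Figure S42 may use full
covariances; the function name and sample values are hypothetical.

```python
from statistics import fmean, pstdev


def frechet_distance_diag(real, synthetic):
    """Squared Fréchet distance between diagonal Gaussians fitted to two sample sets.

    `real` and `synthetic` are lists of equally long feature vectors
    (here: log-amplitude spectra, one value per frequency bin).
    """
    total = 0.0
    # zip(*samples) iterates over frequency bins instead of over records.
    for real_bin, synth_bin in zip(zip(*real), zip(*synthetic)):
        mean_diff = fmean(real_bin) - fmean(synth_bin)
        std_diff = pstdev(real_bin) - pstdev(synth_bin)
        total += mean_diff ** 2 + std_diff ** 2
    return total


# Hypothetical log-amplitude spectra with two frequency bins each.
real = [[0.0, 1.0], [2.0, 3.0]]
synthetic = [[1.0, 1.0], [3.0, 3.0]]
distance = frechet_distance_diag(real, synthetic)
```

In this toy case only the mean of the first frequency bin differs (by 1), so the
squared distance is 1; identical sample sets give 0.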
