
Journal of Computational Physics 395 (2019) 620–635


Data driven governing equations approximation using deep neural networks ✩
Tong Qin, Kailiang Wu, Dongbin Xiu ∗
Department of Mathematics, The Ohio State University, Columbus, OH 43210, USA

Article history: Received 13 November 2018; Received in revised form 22 February 2019; Accepted 15 June 2019; Available online 19 June 2019.

Keywords: Deep neural network; Residual network; Recurrent neural network; Governing equation discovery

Abstract

We present a numerical framework for approximating unknown governing equations using observation data and deep
neural networks (DNN). In particular, we propose to use residual network (ResNet) as the basic building block for equation
approximation. We demonstrate that the ResNet block can be considered as a one-step method that is exact in temporal
integration. We then present two multi-step methods: the recurrent ResNet (RT-ResNet) method and the recursive ResNet
(RS-ResNet) method. The RT-ResNet is a multi-step method on uniform time steps, whereas the RS-ResNet is an adaptive
multi-step method using variable time steps. All three methods presented here are based on the integral form of the
underlying dynamical system. As a result, they do not require time derivative data for equation recovery and can cope with
relatively coarsely distributed trajectory data. Several numerical examples are presented to demonstrate the performance of
the methods.

© 2019 Elsevier Inc. All rights reserved.

1. Introduction

Recently there has been a growing interest in discovering governing equations numerically using observational data.
Earlier efforts include methods using symbolic regression ([5,43]), equation-free modeling [24], heterogeneous multi-scale
method (HMM) ([15]), artificial neural networks ([19]), nonlinear regression ([50]), empirical dynamic modeling ([46,53]),
nonlinear Laplacian spectral analysis ([18]), automated inference of dynamics ([44,12,13]), etc. More recent efforts start to
cast the problem into a function approximation problem, where the unknown governing equations are treated as target
functions relating the data for the state variables and their time derivatives. The majority of the methods employ certain
sparsity-promoting algorithms to create parsimonious models from a large dictionary of all possible models, so that
the true dynamics could be recovered exactly ([47]). Many studies have been conducted to deal effectively with noise in
data ([7,41]), corruption in data ([48]), partial differential equations [38,40], etc. Methods have also been developed in
conjunction with model selection approach ([28]), Koopman theory ([6]), and Gaussian process regression ([35]), to name a
few. A more recent work resorts to the more traditional means of approximation by using orthogonal polynomials ([52]). The
approach seeks accurate numerical approximation to the underlying governing equations, instead of their exact recovery. By
doing so, many existing results in polynomial approximation theory can be applied, particularly those on sampling strategies.
It was shown in [52] that data from a large number of short bursts of trajectories are more effective for equation recovery
than those from a single long trajectory.


✩ This work was partially supported by AFOSR FA9550-18-1-0102.
* Corresponding author.
E-mail addresses: [email protected] (T. Qin), [email protected] (K. Wu), [email protected] (D. Xiu).

https://doi.org/10.1016/j.jcp.2019.06.042
0021-9991/© 2019 Elsevier Inc. All rights reserved.

On the other hand, artificial neural network (ANN), and particularly deep neural network (DNN), has seen tremendous
successes in many different disciplines. The number of publications is too large to mention. Here we cite only a few rela-
tively more recent review/summary type publications ([30,4,16,32,14,20,42]). Efforts have been devoted to the use of ANN
for various aspects of scientific computing, including construction of reduced order model ([22]), aiding solution of con-
servation laws ([37]), multiscale problems ([8,51]), solving and learning systems involving ODEs and PDEs ([29,11,27,25]),
uncertainty quantification ([49,54]), etc.
The focus of this paper is on the approximation/learning of dynamical systems using deep neural networks (DNN). The
topic has been explored in a series of recent articles, in the context of ODEs ([36,39]) and PDEs ([34,33,27]). The new contri-
butions of this paper include the following. First, we introduce new constructions of deep neural network (DNN), specifically
suited for learning dynamical systems. In particular, our new network structures employ residual network (ResNet), which
was first proposed in [21] for image analysis and has become very popular due to its effectiveness. In our construction,
we employ a ResNet block, which consists of multiple fully connected hidden layers, as the fundamental building block of
our DNN structures. We show that the ResNet block can be considered as a one-step numerical integrator in time. This
integrator is “exact” in time, i.e., no temporal error, in the sense that the only error stems from the neural network ap-
proximation of the evolution operators defining the governing equation. This is different from some existing works, where
ResNet is viewed as the Euler forward scheme ([9]). Secondly, we introduce two variations of the ResNet structure to serve
as multi-step learning of the underlying governing equations. The first one employs recurrent use of the ResNet block.
This is inspired by the well known recurrent neural network (RNN), whose connection with dynamical systems has long
been recognized, cf. [20]. Our recurrent network, termed RT-ResNet hereafter, is different in the sense that the recurrence
is enforced blockwise on the ResNet block, which by itself is a DNN. (Note that in the traditional RNN, the recurrence is
enforced on the hidden layers.) We show that the RT-ResNet is a multi-step integrator that is exact in time, with the only
error stemming from the ResNet approximation of the evolution operator of the underlying equation. The other variation
of the ResNet approximator employs recursive use of the ResNet block, termed RS-ResNet. Again, the recursion is enforced
blockwise on the ResNet block (which is a DNN). We show that the RS-ResNet is also an exact multi-step integrator. The dif-
ference between RT-ResNet and RS-ResNet is that the former is equivalent to a multi-step integrator using a uniform time
step, whereas the latter is an “adaptive” method with variable time steps depending on the particular problem and data.
Thirdly, the derivations in this paper utilize the integral form of the underlying dynamical system. By doing so, the proposed
methods do not require knowledge or data of the time derivatives of the equation states. This is different from most of the
existing studies (cf. [5,7,41,52]), which deal with the equations directly and thus require time derivative data. Acquiring time
derivatives introduces an additional source of noise and error, particularly when one has to conduct numerical differen-
tiation of noisy trajectory data. Consequently, the proposed three new DNN structures, the one-step ResNet and multi-step
RT-ResNet and RS-ResNet, are capable of approximating unknown dynamical systems using only state variable data, which
could be relatively coarsely distributed in time. In this case, most of the existing methods become less effective, as accurate
extraction of time derivatives is difficult.
This paper is organized as follows. After the basic problem setup in Section 2, we present the main methods in Section 3
and some theoretical properties in Section 4. We then present, in Section 5, a set of numerical examples, covering both
linear and nonlinear differential equations, to demonstrate the effectiveness of the proposed algorithms.

2. Setup

Let us consider an autonomous system

    dx/dt = f(x),    x(t_0) = x_0,    (2.1)

where x ∈ R^n are the state variables. Let Φ_s : R^n → R^n be the flow map, which maps the state at t = 0 to the state at
t = s. Note that for autonomous systems the time variable t can be arbitrarily shifted and only the time difference, or time
lag, t − t_0 is relevant. The solution can be written as

    x(t; x_0, t_0) = Φ_{t−t_0}(x_0).    (2.2)


Hereafter we will omit t_0 in the exposition, unless confusion arises.
In this paper, we assume the form of the governing equations f : R^n → R^n is unknown. Our goal is to create an accurate
model for the governing equation using data of the solution trajectories. In particular, we assume data are collected in the
form of pairs, each of which corresponds to the solution states along one trajectory at two different time instances. That is,
we consider the set

    S = {(z_j^{(1)}, z_j^{(2)}) : j = 1, . . . , J},    (2.3)

where J is the total number of data pairs, and for each pair j = 1, . . . , J,

    z_j^{(1)} = x_j + ε_j^{(1)},    z_j^{(2)} = Φ_{Δ_j}(x_j) + ε_j^{(2)}.    (2.4)

Here the terms ε_j^{(1)} and ε_j^{(2)} stand for the potential noise in the data, and Δ_j is the time lag between the two states.
For notational convenience, we assume Δ_j = Δ to be a constant for all j throughout this paper. Consequently, the data set
becomes input-output measurements of the Δ-lag flow map,

    x → Φ_Δ(x).    (2.5)

3. Deep neural network approximation

The core building block of our methods is a standard fully connected feedforward neural network (FNN) with M ≥ 3
layers, of which (M − 2) are hidden layers. It has been established that fully connected FNNs can approximate arbitrarily
well a large class of input-output maps, i.e., they are universal approximators, cf. [31,2,23]. Since the right-hand side f of
(2.1) is our approximation goal, we consider maps from R^n to R^n. Let n_j, j = 1, . . . , M, be the number of neurons in each
layer; we then have n_1 = n_M = n.
Let N : R^n → R^n be the operator of this network. For any input y^{in} ∈ R^n, the output of the network is

    y^{out} = N(y^{in}; Θ),    (3.1)

where Θ is the parameter set including all the parameters in the network. The operator N is a composition of the following
operators,

    N(·; Θ) = (σ_M ◦ W_{M−1}) ◦ · · · ◦ (σ_2 ◦ W_1),    (3.2)

where ◦ stands for operator composition, W_j is the weight matrix containing the weight parameters connecting the neurons
from the j-th layer to the (j + 1)-th layer, after using the standard approach of augmenting the biases into the weights, and
σ_j : R → R is the activation function, which is applied component-wise to the j-th layer. There exist many choices for
the activation functions, e.g., sigmoid functions, ReLU (rectified linear unit), etc. In this paper we use a sigmoid function,
in particular σ_i(x) = tanh(x), in all layers except at the output layer, where σ_M(x) = x. This is one of the common
choices for DNNs.
Using the data set (2.3), we can directly train (3.1) to approximate the Δ-lag flow map (2.5). This can be done by applying
(3.1) with y_j^{in} = z_j^{(1)} to obtain y_j^{out} for each j = 1, . . . , J, and then minimizing the following mean squared loss function

    L(Θ) = (1/J) Σ_{j=1}^{J} ‖y_j^{out} − z_j^{(2)}‖^2,    (3.3)

where ‖ · ‖ denotes the vector 2-norm hereafter. With a slight abuse of notation, we will hereafter write y^{in} = z^{(1)} to stand for
y_j^{in} = z_j^{(1)} for all sample data j = 1, . . . , J, unless confusion arises otherwise.
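For concreteness, a minimal TensorFlow/Keras sketch of the network operator (3.1)–(3.2) with tanh hidden layers, a linear output layer, and the mean squared loss (3.3) could look as follows. The layer widths, the optimizer, and the helper name build_fnn are assumptions made for illustration only, not values prescribed by the paper.

    import tensorflow as tf

    def build_fnn(n, hidden=(30, 30, 30)):
        """Fully connected operator N : R^n -> R^n with tanh hidden layers
        and an identity (linear) output layer, trained with the MSE loss (3.3)."""
        model = tf.keras.Sequential(
            [tf.keras.layers.Dense(w, activation="tanh") for w in hidden]
            + [tf.keras.layers.Dense(n)]            # sigma_M(x) = x at the output
        )
        model.compile(optimizer="adam", loss="mse")  # mean squared loss over data pairs
        return model

    # Direct training on the Delta-lag flow map data (2.3), e.g.:
    # fnn = build_fnn(n=2); fnn.fit(z1, z2, batch_size=10, epochs=500)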

3.1. One-step ResNet approximation

We now present the idea of using residual neural network (ResNet) as a one-step approximation method. The idea of
ResNet is to explicitly introduce the identity operator in the network and force the network to effectively approximate the
“residue” of the input-output map. Although mathematically equivalent, this simple transformation has been shown to be
highly advantageous in practice and has become increasingly popular since its formal introduction in [21].
The structure of the ResNet is illustrated in Fig. 3.1. The ResNet block consists of N fully connected hidden layers and an
identity operator to re-introduce the input yin back into the output of the hidden layers. The introduction of the identity
operator effectively produces the following mapping

    y^{out} = y^{in} + N(y^{in}; Θ),    y^{in} = z^{(1)},    (3.4)

where Θ are the weight and bias parameters in the network. The parameters are determined by minimizing the same loss
function (3.3). This effectively accomplishes the training of the operator N(·; Θ).
The connection between dynamical systems and ResNet has been recognized. In fact, ResNet has been viewed as the
Euler forward time integrator ([9]). To further examine its property, let us consider the exact Δ-lag flow map,

    x(Δ) = Φ_Δ(x(0))
         = x(0) + ∫_0^Δ f(x(t)) dt
         = x(0) + Δ · f(x(τ))                (3.5)
         = x(0) + Δ · f(Φ_τ(x(0))),    0 ≤ τ ≤ Δ.
This is a trivial derivation using the mean value theorem. For notational convenience, we now define “effective increment”.

Fig. 3.1. Schematic of the ResNet structure for one-step approximation.

Definition 3.1. For the autonomous system (2.1), given an initial state x and an increment Δ ≥ 0, its effective
increment of size Δ is defined as

    φ_Δ(x; f) = Δ · f(Φ_τ(x)),    (3.6)

for some 0 ≤ τ ≤ Δ such that

    x(Δ) = x + φ_Δ(x; f).    (3.7)

Note that the effective increment φ_Δ depends only on its initial state x, once the governing equation f and the increment
Δ are fixed.
Upon comparing the exact state (3.7) and the one-step ResNet method (3.4), it is thus easy to see that a successfully
trained network operator N is an approximation to the effective increment φ_Δ, i.e.,

    N(x; Θ) ≈ φ_Δ(x; f).    (3.8)

Since the effective increment completely determines the true solution states over a Δ interval, we can use the ResNet
operator to approximate the solution trajectory. That is, starting with a given initial state y^{(0)} = x, we can time march the
state

    y^{(k+1)} = y^{(k)} + N(y^{(k)}; Θ),    k = 0, 1, . . . .    (3.9)

This discrete dynamical system serves as our approximation to the true dynamical system (2.1). It gives us an approximation
to the true states on a uniform time grid with stepsize Δ.
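A minimal sketch of how the one-step map (3.4) and the marching (3.9) could be realized, reusing the hypothetical build_fnn above: training N on the residual z^{(2)} − z^{(1)} is equivalent to minimizing (3.3) for the map (3.4), and prediction simply iterates y ← y + N(y). This is an illustrative reading of the scheme, not the authors' released code.

    import numpy as np

    class OneStepResNet:
        """One-step ResNet: y_out = y_in + N(y_in; Theta), cf. (3.4) and Fig. 3.1."""

        def __init__(self, n, hidden=(30, 30, 30)):
            self.N = build_fnn(n, hidden)      # N approximates the effective increment (3.8)

        def fit(self, z1, z2, **kw):
            # Fitting N to z2 - z1 realizes the explicit skip connection in (3.4).
            self.N.fit(z1, z2 - z1, **kw)

        def march(self, x0, num_steps):
            """Iterate (3.9): y^(k+1) = y^(k) + N(y^(k)), states on a uniform Delta grid."""
            ys = [np.asarray(x0, dtype=float)]
            for _ in range(num_steps):
                y = ys[-1]
                ys.append(y + self.N.predict(y[None, :], verbose=0)[0])
            return np.stack(ys)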

Remark 3.1. Even though the approximate system (3.9) resembles the well known Euler forward time stepping scheme, it is
not a first-order method in time. In fact, upon comparing (3.9) and the true state (3.7), it is easy to see that (3.9) is “exact”
in terms of temporal integration. The only source of error in the system (3.9) is the approximation error of the effective
increment in (3.8). The size of this error is determined by the quality of the data and the network training algorithm.

Remark 3.2. The derivation here is based on (3.7), which is from the integral form of the governing equation. As a result,
training of the ResNet method does not require data on the time derivatives of the true states. Moreover, Δ does not need
to be exceedingly small (to enable accurate numerical differentiation in time). This makes the ResNet method suitable for
problems with relatively coarsely distributed data.

3.2. Multi-step recurrent ResNet (RT-ResNet) approximation

We now combine the idea of recurrent neural network (RNN) and the ResNet method from the previous section. The
distinct feature of our construction is that the recurrence is applied to the entire ResNet block, rather than to the individual
hidden layers, as is done for the standard RNNs. (For an overview of RNN, interested readers are referred to [20], Ch. 10.)
The structure of the resulting Recurrent ResNet (RT-ResNet) is shown in Fig. 3.2. The ResNet block, as presented in
Fig. 3.1, is “repeated” (K − 1) times, for an integer K ≥ 1, before producing the output y^{out}. This makes the occurrence of
the ResNet block a total of K times. The unfolded structure is shown on the right of Fig. 3.2.

Fig. 3.2. Schematic of the recurrent ResNet (RT-ResNet) structure for multi-step approximation (K ≥ 1).

The RT-ResNet then produces the following scheme, for K ≥ 1,

    y_0 = z^{(1)},
    y_{k+1} = y_k + N(y_k; Θ),    k = 0, . . . , K − 1,    (3.10)
    y^{out} = y_K.

The network is then trained by using the data set (2.3) and minimizing the same loss function (3.3). For K = 1, this reduces
to the one-step ResNet method (3.4).
To examine the properties of the RT-ResNet, let us consider a uniform discretization of the time lag Δ. That is, let
δ = Δ/K, and consider t_k = kδ, k = 0, . . . , K. The exact solution state x satisfies the following relation

    x(t_0) = x(0),
    x(t_{k+1}) = x(t_k) + φ_δ(x(t_k); f),    k = 0, . . . , K − 1,    (3.11)
    x(Δ) = x(t_K),

where φ_δ(x; f) is the effective increment of size δ, as defined in Definition 3.1.
Upon comparing this with the RT-ResNet scheme (3.10), it is easy to see that training the RT-ResNet is equivalent to
finding the operator N to approximate the δ-effective increment,

    N(x; Θ) ≈ φ_δ(x; f).    (3.12)


Similar to the one-step ResNet method, the multi-step RT-ResNet is also exact in time, as it contains no temporal discretiza-
tion error. The only error stems from the approximation of the δ -effective increment.
Once the RT-ResNet is successfully trained, it gives us a discrete dynamical system (3.10) that can be further marched in
time using any initial state. This is an approximation to the true dynamical system on uniformly distributed time instances
with an interval δ = Δ/K. Therefore, even though the training data are given over a Δ time interval, the RT-ResNet system
can produce solution states on a finer time grid with step size δ ≤ Δ (K ≥ 1).
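In an implementation, the RT-ResNet (3.10) amounts to applying one shared ResNet block K times inside the model before the output is compared with z^{(2)}; the intermediate states then live on the finer grid of size δ = Δ/K. The subclassed Keras model below is a minimal sketch of this shared-parameter recurrence, with assumed layer widths; the training call in the comment mirrors (3.3).

    import tensorflow as tf

    class RTResNet(tf.keras.Model):
        """Recurrent ResNet: the same block (one parameter set Theta) applied K times, (3.10)."""

        def __init__(self, n, K=3, hidden=(20, 20, 20)):
            super().__init__()
            self.K = K
            self.block = tf.keras.Sequential(
                [tf.keras.layers.Dense(w, activation="tanh") for w in hidden]
                + [tf.keras.layers.Dense(n)]
            )                                   # one N(.; Theta), reused at every step

        def call(self, y):
            for _ in range(self.K):             # y_{k+1} = y_k + N(y_k; Theta)
                y = y + self.block(y)
            return y                            # y_out = y_K, compared with z^(2) in the loss

    # rt = RTResNet(n=2, K=3); rt.compile(optimizer="adam", loss="mse")
    # rt.fit(z1, z2, batch_size=10, epochs=500)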

3.3. Multi-step recursive ResNet approximation

We now present another multi-step approximation method based on the ResNet block in Fig. 3.1. The structure of the
network is shown in Fig. 3.3. From the input y^{in}, ResNet blocks are recursively used a total of K ≥ 1 times, before producing
the output y^{out}. The network, referred to as recursive ResNet (RS-ResNet) hereafter, thus produces the following scheme, for
any K ≥ 1,


    y_0 = z^{(1)},
    y_{k+1} = y_k + N(y_k; Θ_k),    k = 0, . . . , K − 1,    (3.13)
    y^{out} = y_K.

Compared to the recurrent RT-ResNet method (3.10) from the previous section, the major difference in RS-ResNet is that
each ResNet block inside the network has its own parameter set Θ_k, and the blocks are thus different from each other. Since each
ResNet block is a DNN by itself, the RS-ResNet can be a very deep network when K > 1. When K = 1, it also reduces back to the
one-step ResNet network.
Fig. 3.3. Schematic of the recursive ResNet (RS-ResNet) structure for multi-step approximation (K ≥ 1).

Let 0 = t_0 < t_1 < · · · < t_K = Δ be arbitrarily distributed time instances in [0, Δ] and let δ_k = t_{k+1} − t_k, k = 0, . . . , K − 1,
be the (non-uniform) increments. It is then straightforward to see that the exact state satisfies

    x(t_0) = x(0),
    x(t_{k+1}) = x(t_k) + φ_{δ_k}(x(t_k); f),    k = 0, . . . , K − 1,    (3.14)
    x(Δ) = x(t_K),

where φ_{δ_k}(x; f) is the effective increment of size δ_k, as defined in Definition 3.1.

Upon comparing with the RS-ResNet scheme (3.13), one can see that the training of the RS-ResNet produces the following
approximation

    N(x; Θ_k) ≈ φ_{δ_k}(x; f),    k = 0, . . . , K − 1.    (3.15)

That is, each ResNet operator N(x; Θ_k) is an approximation of an effective increment of size δ_k, for k = 0, . . . , K − 1, under
the condition Σ_{k=0}^{K−1} δ_k = Δ. Training the network using the data (2.3) and loss function (3.3) will determine the parameter
sets Θ_k, and subsequently the effective increments of size δ_k, for k = 0, . . . , K − 1. From this perspective, one may view
RS-ResNet as an “adaptive” method, as it adjusts its parameter sets to approximate K smaller effective increments whose
sizes are determined by the data. Since RS-ResNet is a very deep network with a large number of parameters, it is, in
principle, capable of producing more accurate results than ResNet and RT-ResNet, assuming caution has been exercised to
prevent overfitting.
A successfully trained RS-ResNet also gives us a discrete dynamical system that approximates the true governing equation
(2.1). Due to its “adaptive” nature, the intermediate time intervals δ_k are variable and not known explicitly. Therefore, the
discrete RS-ResNet needs to be applied K times to produce the solution states over the time interval Δ, which is the same
interval given by the training data. This is different from the RT-ResNet, which can produce solutions over a smaller and
uniform time interval δ = Δ/K.
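A sketch of the RS-ResNet (3.13) differs from the above only in giving each of the K blocks its own parameter set Θ_k; the layer widths below are again illustrative assumptions. After training, each block plays the role of one effective increment whose size δ_k is determined implicitly by the data.

    import tensorflow as tf

    class RSResNet(tf.keras.Model):
        """Recursive ResNet: K stacked blocks with distinct parameter sets Theta_k, (3.13)."""

        def __init__(self, n, K=3, hidden=(20, 20, 20)):
            super().__init__()
            self.blocks = [
                tf.keras.Sequential(
                    [tf.keras.layers.Dense(w, activation="tanh") for w in hidden]
                    + [tf.keras.layers.Dense(n)]
                )
                for _ in range(K)               # one N(.; Theta_k) per step
            ]

        def call(self, y):
            for block in self.blocks:           # y_{k+1} = y_k + N(y_k; Theta_k)
                y = y + block(y)
            return y                            # y_out = y_K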

4. Theoretical properties

In this section we present some straightforward analysis to demonstrate certain theoretical aspects of the proposed DNNs
for equation approximation.

4.1. Continuity of flow map

Under certain conditions on f, one can show that the flow map of the dynamical system (2.1) is locally Lipschitz contin-
uous.

Lemma 4.1. Assume f is Lipschitz continuous with Lipschitz constant L on a set D ⊆ R^n. For any τ > 0, define

    D_τ := {x_0 ∈ D : Φ_t(x_0) ∈ D, ∀t ∈ [0, τ]}.

Then, for any t ∈ [0, τ], the flow map Φ_t is Lipschitz continuous on D_τ. Specifically, for any x_0, x̃_0 ∈ D_τ,

    ‖Φ_t(x_0) − Φ_t(x̃_0)‖ ≤ e^{Lt} ‖x_0 − x̃_0‖,    ∀t ∈ [0, τ].    (4.1)

Proof. The proof directly follows from the classical result on the continuity of the dynamical system (2.1) with respect to
initial data; see [45, p. 109]. □

The above continuity ensures that the flow map can be approximated by neural networks to any desired degree of
accuracy by increasing the number of hidden layers and neurons; see, for example, [26,31]. The Lipschitz continuity will
also play an important role in the error analysis in Theorem 4.3.

4.2. Compositions of flow maps

It was shown in [3] that any smooth bi-Lipschitz function can be represented as compositions of functions, each of which
is near-identity in Lipschitz semi-norm. For the flow map of the autonomous system (2.1), we can prove a stronger result
by using the following property

    Φ_{t_1} ◦ Φ_{t_2} = Φ_{t_1+t_2},    ∀t_1, t_2.    (4.2)



Theorem 4.2. For any positive integer K ≥ 1, the flow map Φ_Δ can be expressed as a K-fold composition of Φ_δ, namely,

    Φ_Δ = Φ_δ ◦ · · · ◦ Φ_δ    (K-fold),    (4.3)

where δ = Δ/K, and Φ_δ satisfies

    ‖Φ_δ(x_0) − x_0‖ ≤ (Δ/K) sup_{t∈[0,δ]} ‖f(Φ_t(x_0))‖,    ∀x_0.    (4.4)

Suppose that f is bounded on D ⊆ R^n; then

    ‖Φ_δ − I‖_{L^∞(D_δ)} ≤ (Δ/K) ‖f‖_{L^∞(D)} = O(Δ/K),    (4.5)

where I : R^n → R^n is the identity map, and ‖ · ‖_{L^∞} := ess sup ‖ · ‖.

Proof. The representation (4.3) is a direct consequence of the property (4.2). For any x_0, we have

    ‖Φ_δ(x_0) − x_0‖ = ‖ ∫_0^δ f(Φ_t(x_0)) dt ‖ = δ · η( (1/δ) ∫_0^δ f(Φ_t(x_0)) dt ),

where η(x) = ‖x‖. Since η is a convex function, it satisfies Jensen's inequality,

    η( (1/δ) ∫_0^δ f(Φ_t(x_0)) dt ) ≤ (1/δ) ∫_0^δ η( f(Φ_t(x_0)) ) dt.

Thus we obtain

    ‖Φ_δ(x_0) − x_0‖ ≤ ∫_0^δ ‖f(Φ_t(x_0))‖ dt ≤ δ sup_{t∈[0,δ]} ‖f(Φ_t(x_0))‖,

which implies (4.4). For any x_0 ∈ D_δ, we have Φ_t(x_0) ∈ D for 0 ≤ t ≤ δ. Hence

    ‖Φ_δ(x_0) − x_0‖ ≤ (Δ/K) ‖f‖_{L^∞(D)},    ∀x_0 ∈ D_δ.

This yields (4.5), and the proof is complete. □

This estimate can serve as a theoretical justification of the ResNet method (K = 1) and the RT-ResNet method (K ≥ 1). As
long as Δ is reasonably small, the flow map of the underlying dynamical system is close to the identity. Therefore, it is natural
to use ResNet, which explicitly introduces the identity operator, to approximate the “residue” of the flow map. The norm of
the DNN operator N, which approximates the residual flow map Φ_δ − I, becomes small, at O(Δ). For RT-ResNet with K > 1,
its norm becomes even smaller, at O(Δ/K). We remark that it was pointed out empirically in [9] that using multiple ResNet
blocks can result in networks with smaller norms.

4.3. Error bound

Let N_Δ denote the neural network approximation operator to the Δ-lag flow map Φ_Δ. For the proposed ResNet (3.4),
RT-ResNet (3.10), and RS-ResNet (3.13), the operators can be written as

    N_Δ = I + N(·; Θ),                                         ResNet;
    N_Δ = (I + N(·; Θ)) ◦ · · · ◦ (I + N(·; Θ))   (K-fold),    RT-ResNet;    (4.6)
    N_Δ = (I + N(·; Θ_{K−1})) ◦ · · · ◦ (I + N(·; Θ_0)),       RS-ResNet.
We now derive a general error bound for the solution approximation using the DNN operator N_Δ. This bound serves as a
general guideline for the error growth. More specific error bounds for each different network structure are more involved
and will be pursued in a future work.
Let y^{(m)} denote the solution of the approximate model at time t^{(m)} := t_0 + mΔ, and let E^{(m)} := ‖y^{(m)} − x(t^{(m)})‖ denote
the error, m = 0, 1, . . . .

Theorem 4.3. Assume that the assumptions of Lemma 4.1 hold, and further assume that

1. ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)} < +∞,
2. y^{(i)}, x(t^{(i)}) ∈ D_Δ for 0 ≤ i ≤ m − 1.

Then we have

    E^{(m)} ≤ (1 + e^{LΔ})^m E^{(0)} + ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)} · [(1 + e^{LΔ})^m − 1] / e^{LΔ}.    (4.7)

Proof. By the triangle inequality and the Lipschitz continuity (4.1) of the flow map,

    E^{(m)} = ‖y^{(m)} − x(t^{(m)})‖ = ‖N_Δ(y^{(m−1)}) − Φ_Δ(x(t^{(m−1)}))‖
           ≤ ‖N_Δ(y^{(m−1)}) − Φ_Δ(y^{(m−1)})‖ + ‖Φ_Δ(y^{(m−1)}) − Φ_Δ(x(t^{(m−1)}))‖
           ≤ ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)} + e^{LΔ} ‖y^{(m−1)} − x(t^{(m−1)})‖
           ≤ (1 + e^{LΔ}) E^{(m−1)} + ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)}.

Recursively using the above estimate gives

    E^{(m)} ≤ (1 + e^{LΔ}) E^{(m−1)} + ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)}
           ≤ (1 + e^{LΔ})^2 E^{(m−2)} + ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)} (1 + (1 + e^{LΔ}))
           ≤ · · ·
           ≤ (1 + e^{LΔ})^m E^{(0)} + ‖N_Δ − Φ_Δ‖_{L^∞(D_Δ)} Σ_{i=0}^{m−1} (1 + e^{LΔ})^i.

Summing the geometric series yields (4.7), and the proof is complete. □

5. Numerical examples

In this section we present numerical examples to verify the properties of the proposed methods. In all the examples, we
generate the training data pairs {(z_j^{(1)}, z_j^{(2)})}_{j=1}^J in the following way:

• Generate J points {z_j^{(1)}}_{j=1}^J from a uniform distribution over a computational domain D. The domain D is a region in
  which we are interested in the solution behavior. It is typically chosen to be a hypercube prior to the computation.
• For each j, starting from z_j^{(1)}, we march the underlying governing equation forward for a time lag Δ, using a highly
  accurate standard ODE solver, to generate z_j^{(2)}. In our examples we set Δ = 0.1.

In each example, we take 20 times as many data pairs as the number of model parameters. We remark that the time
lag Δ = 0.1 is relatively coarse and prevents accurate estimation of time derivatives via numerical differentiation. Since our
proposed methods employ the integral form of the underlying equation, this difficulty is circumvented. The random sampling
of the solution trajectories of length Δ follows the work of [52], where it was established that such dense sampling of short
trajectories is highly effective for equation recovery.
All of our network models, ResNet, RT-ResNet, and RS-ResNet, are trained via the loss function (3.3) using the
open-source TensorFlow library [1]. The training data set is divided into mini-batches of size 10, and we typically train the
model for 500 epochs, reshuffling the training data in each epoch. All the weights are initialized randomly from Gaussian
distributions and all the biases are initialized to zero.
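A minimal sketch of this training protocol, written against the hypothetical Keras models sketched in Section 3; the initializer standard deviation and the choice of optimizer are assumptions not specified in the text.

    import tensorflow as tf

    def dense(width, activation=None):
        """Dense layer with Gaussian weight initialization and zero biases, as described above.
        (The standard deviation 0.1 is an assumption; the paper does not state it.)"""
        return tf.keras.layers.Dense(
            width, activation=activation,
            kernel_initializer=tf.keras.initializers.RandomNormal(stddev=0.1),
            bias_initializer="zeros")

    def train(model, z1, z2):
        model.compile(optimizer="adam", loss="mse")   # loss (3.3); optimizer is an assumption
        # Mini-batches of size 10, 500 epochs, training data reshuffled every epoch.
        model.fit(z1, z2, batch_size=10, epochs=500, shuffle=True, verbose=0)
        return model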
After training the network models satisfactorily, using the data with Δ = 0.1 time lag, we march the trained network
models further forward in time and compare the results against the reference states, which are produced by high-order
numerical solvers of the true underlying governing equations. We march the trained network systems up to t ≫ Δ to examine
their (relatively) long-term behaviors. For the two linear examples, we set t = 2; for the two nonlinear examples, we
set t = 20.

Fig. 5.1. Trajectory and phase plots for Example 1 with x_0 = (1.5, 0) for t ∈ [0, 2]. Top row: one-step ResNet model; Middle row: multi-step RT-ResNet
model; Bottom row: multi-step RS-ResNet model.

5.1. Linear ODEs

We first study two linear ODE systems, as textbook examples. In both examples, our one-step ResNet method has 3
hidden layers, each of which has 30 neurons. For the multi-step RT-ResNet and RS-ResNet methods, they both have 3
ResNet blocks (K = 3), each of which contains 3 hidden layers with 20 neurons in each layer.

Example 1
We first consider the following two-dimensional linear ODE with x = (x_1, x_2):

    ẋ_1 = x_1 + x_2 − 2,
    ẋ_2 = x_1 − x_2,    (5.1)

where ẋ represents the time derivative dx/dt. The computational domain is taken to be D = [0, 2]^2.
Upon training the three network models satisfactorily, using the Δ = 0.1 data pairs, we march the trained models further
in time up to t = 2. In Fig. 5.1, we show the plots for the trajectories of both x_1 and x_2 as well as the portrait on the (x_1, x_2)
phase plane. We observe that all three network models produce accurate prediction results for time up to t = 2.
As discussed in Section 3.2, the multi-step RT-ResNet method is able to produce an approximation over a smaller time
step δ = Δ/K, which in this case is δ = 1/30 (with K = 3). The trained RT-ResNet model then allows us to produce predictions
over the finer time step δ. In Fig. 5.2, we show the time marching of the trained RT-ResNet model for up to t = 2 using
the smaller time step δ. The results again agree very well with the reference solution. This demonstrates the capability of
RT-ResNet – it allows us to produce accurate predictions with a resolution higher than that of the given data, i.e., δ < Δ.

Fig. 5.2. Trajectory and phase plots for Example 1 with x_0 = (1.5, 0) using the RT-ResNet model with K = 3. The solutions are marched over time step
δ = Δ/K = 1/30.

Fig. 5.3. Phase plots for Example 1 with x_0 = (1.5, 0). One-step ResNets. The training data contains 2% (left) and 5% (right) noise, respectively.
On the other hand, our numerical tests also reveal that the training of RT-ResNet with K > 1 becomes more involved –
more training data are typically required and convergence can be slower, compared to the training of the one-step ResNet
method. Similar behavior is also observed in the multi-step RS-ResNet method with K > 1. The development of efficient training
procedures for multi-step RT-ResNet and RS-ResNet methods is necessary and will be pursued in a future work.
We then consider the case of noisy data. The data pairs are set as {z_j^{(1)}(1 + ε_j^{(1)}), z_j^{(2)}(1 + ε_j^{(2)})}_{j=1}^J, where the relative
noise levels ε_j^{(1)} and ε_j^{(2)} are drawn from a uniform distribution over [0, η]. In the following experiments we set η = 0.02
and 0.05 for demonstration purposes. In Fig. 5.3, we show the phase plots generated by the one-step ResNet. Since our NN
models are based on the integral form of the ODE, they tolerate the training noise quite well. As the noise level increases,
the NN prediction deviates more from the exact dynamics, while the main structure of the solution is still well captured.
On the other hand, since the time lag is Δ = 0.1, this noisy data case is certainly not amenable to the standard equation
recovery approaches requiring time derivative computations.

Example 2
We now consider another linear ODE system:

    ẋ_1 = x_1 − 4x_2,
    ẋ_2 = 4x_1 − 7x_2.    (5.2)

The numerical results for the three trained network models are presented in Fig. 5.4. Again, we show the prediction results
of the trained models for up to t = 2. While all predictions agree well with the reference solution, one can visually see that
the RS-ResNet model is more accurate than the RT-ResNet model, which in turn is more accurate than the one-step ResNet
model. This is expected, as the multi-step methods should be more accurate than the one-step method (ResNet), and the
RS-ResNet should be even more accurate due to its adaptive nature. On the other hand, RS-ResNet introduces a larger number
of parameters and incurs a higher training cost. For a given problem, the balance between accuracy and training cost should
be considered by the user.

Fig. 5.4. Trajectory and phase plots for Example 2 with x_0 = (0, −1). Top row: one-step ResNet model; Middle row: multi-step RT-ResNet model; Bottom
row: multi-step RS-ResNet model.
In Fig. 5.5, we present the results for the noisy data case. The noisy data are generated in the same manner as in Exam-
ple 1, and the simulation results are obtained by the one-step ResNet method. Again, we observe that the proposed model
performs robustly in the presence of data noise.

5.2. Nonlinear ODEs

We now consider two nonlinear problems. The first one is the well known damped pendulum problem, and the second
one is a nonlinear differential-algebraic equation (DAE) modelling a genetic toggle switch ([17]). In both examples, our one-
step ResNet model has 2 hidden layers, each of which has 40 neurons. Our multi-step RT-ResNet and RS-ResNet models
both have 3 of the same ResNet blocks (K = 3). Again, our training data are collected over a Δ = 0.1 time lag. We produce
predictions of the trained models over time for up to t = 20 and compare the results against the reference solutions.

Fig. 5.5. Phase plots for Example 2 with x0 = (0, −1). One-step ResNets. The training data contains 2% (left) and 5% (right) relative noise, respectively.

Example 3: Damped pendulum

The first nonlinear example we consider is the following damped pendulum problem,

    ẋ_1 = x_2,
    ẋ_2 = −α x_2 − β sin x_1,

where α = 8.91 and β = 0.2. The computational domain is D = [−π, π] × [−2π, 2π]. In Fig. 5.6, we present the prediction
results by the three network models, starting from the initial condition x_0 = (−1.193, −3.876), for time up to t = 20.
We observe excellent agreement between the network models and the reference solution.
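For reference, the exact trajectory in this example can be reproduced by integrating the stated pendulum system with a standard high-accuracy solver; the sketch below uses the parameter values and initial condition given above, while the solver choice and tolerances are assumptions.

    import numpy as np
    from scipy.integrate import solve_ivp

    alpha, beta = 8.91, 0.2                     # parameter values as stated in the text

    def pendulum(t, x):
        # x = (x1, x2): dx1/dt = x2, dx2/dt = -alpha*x2 - beta*sin(x1)
        return [x[1], -alpha * x[1] - beta * np.sin(x[0])]

    ref = solve_ivp(pendulum, (0.0, 20.0), [-1.193, -3.876],
                    rtol=1e-10, atol=1e-12, dense_output=True)
    t_grid = np.linspace(0.0, 20.0, 201)        # Delta = 0.1 grid for comparison with the networks
    x_ref = ref.sol(t_grid)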

Example 4: Genetic toggle switch


We now consider a system of nonlinear differential-algebraic equations (DAE), which is used to model a genetic toggle
switch in Escherichia coli ([17]). It is composed of two repressors and two constitutive promoters, where each promoter is
inhibited by the repressor that is transcribed by the opposing promoter. Details of the experimental measurements can be found
in [10]. The system of equations is as follows:


    ẋ_1 = α_1 / (1 + x_2^β) − x_1,
    ẋ_2 = α_2 / (1 + z^γ) − x_2,
    z = x_1 / (1 + [IPTG]/K)^η.

In this system, the components x_1 and x_2 denote the concentrations of the two repressors. The parameters α_1 and α_2 are
the effective rates of synthesis of the repressors; β and γ represent the cooperativity of repression of the two promoters,
respectively; [IPTG] is the concentration of IPTG, the chemical compound that induces the switch; and K is the dissociation
constant of IPTG.
In the following numerical experiment, we take α_1 = 156.25, α_2 = 15.6, γ = 1, β = 2.5, K = 2.9618 × 10^{−5} and [IPTG] =
10^{−5}. We consider the computational domain D = [0, 20]^2.
In Fig. 5.7 we present the prediction results generated by the ResNet, the RT-ResNet and the RS-ResNet, for time up to
t = 20. The initial condition is x_0 = (19, 17). Again, all three models produce accurate approximations, even for such
a long-time simulation.
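Substituting the algebraic relation for z into the two rate equations turns the DAE into an explicit two-dimensional ODE that a standard solver can integrate to produce the reference solution. The sketch below uses the parameter values listed above; the cooperativity exponent η is not given in the text, so the value used here (taken from Gardner et al. [17]) is an assumption, as is the solver choice.

    import numpy as np
    from scipy.integrate import solve_ivp

    a1, a2, beta, gamma = 156.25, 15.6, 2.5, 1.0
    K_diss, IPTG = 2.9618e-5, 1e-5
    eta = 2.0015                                # assumed, following Gardner et al. [17]

    def toggle(t, x):
        x1, x2 = x
        z = x1 / (1.0 + IPTG / K_diss) ** eta   # algebraic constraint substituted in
        return [a1 / (1.0 + x2 ** beta) - x1,
                a2 / (1.0 + z ** gamma) - x2]

    ref = solve_ivp(toggle, (0.0, 20.0), [19.0, 17.0], rtol=1e-10, atol=1e-12)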

6. Conclusion

We presented several deep neural network (DNN) structures for approximating unknown dynamical systems using tra-
jectory data. The DNN structures are based on the residual network (ResNet), which serves as a one-step exact time integrator.
Two multi-step variations were presented: one is the recurrent ResNet (RT-ResNet) and the other is the recursive ResNet
(RS-ResNet). Upon successful training, the methods produce discrete dynamical systems that approximate the underlying
unknown governing equations. All methods are based on the integral form of the underlying system. Consequently, their con-
structions do not require time derivatives of the trajectory data and can work with coarsely distributed data as well. We
presented the construction details of the methods, their theoretical justifications, and used several examples to demonstrate
the effectiveness of the methods.

Fig. 5.6. Trajectory and phase plots for Example 3 with x_0 = (−1.193, −3.876). Top row: one-step ResNet model; Middle row: multi-step RT-ResNet
model; Bottom row: multi-step RS-ResNet model. (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)

Fig. 5.7. Trajectory and phase plots for Example 4 with x_0 = (19, 17). Top row: one-step ResNet model; Middle row: multi-step RT-ResNet model;
Bottom row: multi-step RS-ResNet model.

References

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving,
M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner,
I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow:
large-scale machine learning on heterogeneous systems, https://www.tensorflow.org/, 2015. Software available from tensorflow.org.
[2] A.R. Barron, Universal approximation bounds for superpositions of a sigmoidal function, IEEE Trans. Inf. Theory 39 (1993) 930–945.
[3] P.L. Bartlett, S.N. Evans, P.M. Long, Representing smooth functions as compositions of near-identity functions with implications for deep network
optimization, preprint, arXiv:1804.05012, 2018.
[4] M. Bianchini, F. Scarselli, On the complexity of neural network classifiers: a comparison between shallow and deep architectures, IEEE Trans. Neural
Netw. Learn. Syst. 25 (2014) 1553–1565.
[5] J. Bongard, H. Lipson, Automated reverse engineering of nonlinear dynamical systems, Proc. Natl. Acad. Sci. USA 104 (2007) 9943–9948.
[6] S.L. Brunton, B.W. Brunton, J.L. Proctor, E. Kaiser, J.N. Kutz, Chaos as an intermittently forced linear system, Nat. Commun. 8 (2017).
[7] S.L. Brunton, J.L. Proctor, J.N. Kutz, Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proc. Natl. Acad.
Sci. USA 113 (2016) 3932–3937.
[8] S. Chan, A. Elsheikh, A machine learning approach for efficient uncertainty quantification using multiscale methods, J. Comput. Phys. 354 (2018)
494–511.
[9] B. Chang, L. Meng, E. Haber, F. Tung, D. Begert, Multi-level residual networks from dynamical systems view, in: International Conference on Learning
Representations, 2018.
[10] R. Chartrand, Numerical differentiation of noisy, nonsmooth data, ISRN Appl. Math. 2011 (2011).
[11] R.T.Q. Chen, Y. Rubanova, J. Bettencourt, D. Duvenaud, Neural ordinary differential equations, preprint, arXiv:1806.07366, 2018.
[12] B.C. Daniels, I. Nemenman, Automated adaptive inference of phenomenological dynamical models, Nat. Commun. 6 (2015).
[13] B.C. Daniels, I. Nemenman, Efficient inference of parsimonious phenomenological models of cellular dynamics using S-systems and alternating regres-
sion, PLoS ONE 10 (2015) e0119821.
[14] K.-L. Du, M. Swamy, Neural Networks and Statistical Learning, Springer-Verlag, 2014.
[15] W. E, B. Engquist, Z. Huang, Heterogeneous multiscale method: a general methodology for multiscale modeling, Phys. Rev. B 67 (2003) 092101.
[16] R. Eldan, O. Shamir, The power of depth for feedforward neural networks, in: Conference on Learning Theory, 2016, pp. 907–940.
[17] T.S. Gardner, C.R. Cantor, J.J. Collins, Construction of a genetic toggle switch in Escherichia coli, Nature 403 (2000) 339.
[18] D. Giannakis, A.J. Majda, Nonlinear Laplacian spectral analysis for time series with intermittency and low-frequency variability, Proc. Natl. Acad. Sci.
USA 109 (2012) 2222–2227.
[19] R. Gonzalez-Garcia, R. Rico-Martinez, I.G. Kevrekidis, Identification of distributed parameter systems: a neural net based approach, Comput. Chem. Eng.
22 (1998) S965–S968.
[20] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016.
[21] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, 2016, pp. 770–778.
[22] J. Hesthaven, S. Ubbiali, Non-intrusive reduced order modeling of nonlinear problems using neural networks, J. Comput. Phys. 363 (2018) 55–78.
[23] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Netw. 4 (1991) 251–257.
[24] I.G. Kevrekidis, C.W. Gear, J.M. Hyman, P.G. Kevrekidis, O. Runborg, C. Theodoropoulos, et al., Equation-free, coarse-grained multiscale computation:
enabling microscopic simulators to perform system-level analysis, Commun. Math. Sci. 1 (2003) 715–762.
[25] Y. Khoo, J. Lu, L. Ying, Solving parametric pde problems with artificial neural networks, preprint, arXiv:1707.03351, 2018.
[26] M. Leshno, V.Y. Lin, A. Pinkus, S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function,
Neural Netw. 6 (1993) 861–867.
[27] Z. Long, Y. Lu, X. Ma, B. Dong, PDE-Net: learning PDEs from data, preprint, arXiv:1710.09668, 2017.
[28] N.M. Mangan, J.N. Kutz, S.L. Brunton, J.L. Proctor, Model selection for dynamical systems via sparse regression and information criteria, Proc. R. Soc.
Lond., Ser. A, Math. Phys. Eng. Sci. 473 (2017).
[29] A. Mardt, L. Pasquali, H. Wu, F. Noe, VAMPnets for deep learning of molecular kinetics, Nat. Commun. 9 (2018) 5.
[30] G.F. Montufar, R. Pascanu, K. Cho, Y. Bengio, On the number of linear regions of deep neural networks, in: Advances in Neural Information Processing
Systems, 2014, pp. 2924–2932.
[31] A. Pinkus, Approximation theory of the MLP model in neural networks, Acta Numer. 8 (1999) 143–195.
[32] T. Poggio, H. Mhaskar, L. Rosasco, B. Miranda, Q. Liao, Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review,
Int. J. Autom. Comput. 14 (2017) 503–519.
[33] M. Raissi, Deep hidden physics models: deep learning of nonlinear partial differential equations, preprint, arXiv:1801.06637, 2018.
[34] M. Raissi, G.E. Karniadakis, Hidden physics models: machine learning of nonlinear partial differential equations, J. Comput. Phys. 357 (2018) 125–141.
[35] M. Raissi, P. Perdikaris, G.E. Karniadakis, Machine learning of linear differential equations using gaussian processes, J. Comput. Phys. 348 (2017)
683–693.
[36] M. Raissi, P. Perdikaris, G.E. Karniadakis, Multistep neural networks for data-driven discovery of nonlinear dynamical systems, preprint, arXiv:1801.
01236, 2018.
[37] D. Ray, J. Hesthaven, An artificial neural network as a troubled-cell indicator, J. Comput. Phys. 367 (2018) 166–191.
[38] S.H. Rudy, S.L. Brunton, J.L. Proctor, J.N. Kutz, Data-driven discovery of partial differential equations, Sci. Adv. 3 (2017) e1602614.
[39] S.H. Rudy, J.N. Kutz, S.L. Brunton, Deep learning of dynamics and signal-noise decomposition with time-stepping constraints, preprint, arXiv:1808.
02578, 2018.
[40] H. Schaeffer, Learning partial differential equations via data discovery and sparse optimization, Proc. R. Soc. Lond., Ser. A, Math. Phys. Eng. Sci. 473
(2017).
[41] H. Schaeffer, S.G. McCalla, Sparse model selection via integral terms, Phys. Rev. E 96 (2017) 023302.
[42] J. Schmidhuber, Deep learning in neural networks: an overview, Neural Netw. 61 (2015) 85–117.
[43] M. Schmidt, H. Lipson, Distilling free-form natural laws from experimental data, Science 324 (2009) 81–85.
[44] M.D. Schmidt, R.R. Vallabhajosyula, J.W. Jenkins, J.E. Hood, A.S. Soni, J.P. Wikswo, H. Lipson, Automated refinement and inference of analytical models
for metabolic networks, Phys. Biol. 8 (2011) 055011.
[45] A. Stuart, A.R. Humphries, Dynamical Systems and Numerical Analysis, vol. 2, Cambridge University Press, 1998.
[46] G. Sugihara, R. May, H. Ye, C. Hsieh, E. Deyle, M. Fogarty, S. Munch, Detecting causality in complex ecosystems, Science 338 (2012) 496–500.
[47] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc., Ser. B, Methodol. (1996) 267–288.
[48] G. Tran, R. Ward, Exact recovery of chaotic systems from highly corrupted data, Multiscale Model. Simul. 15 (2017) 1108–1129.
[49] R. Tripathy, I. Bilionis, Deep UQ: learning deep neural network surrogate model for high dimensional uncertainty quantification, J. Comput. Phys. 375
(2018) 565–588.

[50] H.U. Voss, P. Kolodner, M. Abel, J. Kurths, Amplitude equations from spatiotemporal binary-fluid convection data, Phys. Rev. Lett. 83 (1999) 3422.
[51] Y. Wang, S.W. Cheung, E.T. Chung, Y. Efendiev, M. Wang, Deep multiscale model learning, preprint, arXiv:1806.04830, 2018.
[52] K. Wu, D. Xiu, Numerical aspects for approximating governing equations using data, J. Comput. Phys. (2019), in press.
[53] H. Ye, R.J. Beamish, S.M. Glaser, S.C.H. Grant, C. Hsieh, L.J. Richards, J.T. Schnute, G. Sugihara, Equation-free mechanistic ecosystem forecasting using
empirical dynamic modeling, Proc. Natl. Acad. Sci. USA 112 (2015) E1569–E1576.
[54] Y. Zhu, N. Zabaras, Bayesian deep convolutional encoder-decoder networks for surrogate modeling and uncertainty quantification, J. Comput. Phys. 366
(2018) 415–447.
