Physics-Informed Deep Learning
Abstract
We introduce physics informed neural networks – neural networks that are
trained to solve supervised learning tasks while respecting any given law of
physics described by general nonlinear partial differential equations. In this
second part of our two-part treatise, we focus on the problem of data-driven
discovery of partial differential equations. Depending on whether the avail-
able data is scattered in space-time or arranged in fixed temporal snapshots,
we introduce two main classes of algorithms, namely continuous time and
discrete time models. The effectiveness of our approach is demonstrated us-
ing a wide range of benchmark problems in mathematical physics, including
conservation laws, incompressible fluid flow, and the propagation of nonlinear
shallow-water waves.
Keywords:
Data-driven scientific computing, Machine learning, Predictive modeling,
Runge-Kutta methods, Nonlinear dynamics
1. Introduction
Deep learning has gained unprecedented attention over the last few years,
and deservedly so, as it has introduced transformative solutions across diverse
scientific disciplines [1, 2, 3, 4]. Despite this ongoing success, there exist many scientific applications that have yet to benefit from this emerging technology, primarily due to the high cost of data acquisition. It is well known that state-of-the-art machine learning tools lack robustness and offer no guarantees of convergence when operating in the small data regime, i.e., when only a few training examples are available.
In the first part of this study, we introduced physics informed neural net-
works as a viable solution for training deep neural networks with few training
examples, for cases where the available data is known to respect a given phys-
ical law described by a system of partial differential equations. Such cases are
abundant in the study of physical, biological, and engineering systems, where
longstanding developments of mathematical physics have shed tremendous
insight on how such systems are structured, interact, and dynamically evolve
in time. We saw how the knowledge of an underlying physical law can in-
troduce structure that effectively regularizes the training of neural networks,
and enables them to generalize well even when only a few training examples
are available. Through the lens of different benchmark problems, we high-
lighted the key features of physics informed neural networks in the context
of data-driven solutions of partial differential equations [5, 6].
In this second part of our study, we shift our attention to the problem of
data-driven discovery of partial differential equations [7, 8, 9]. To this end,
let us consider parametrized and nonlinear partial differential equations of
the general form
$$u_t + \mathcal{N}[u; \lambda] = 0, \quad x \in \Omega, \; t \in [0, T], \qquad (1)$$
where u(t, x) denotes the latent (hidden) solution, $\mathcal{N}[\cdot; \lambda]$ is a nonlinear operator parametrized by λ, and Ω is a subset of $\mathbb{R}^D$. This setup encapsulates a
wide range of problems in mathematical physics including conservation laws,
diffusion processes, advection-diffusion-reaction systems, and kinetic equa-
tions. As a motivating example, the one dimensional Burgers’ equation [10]
corresponds to the case where $\mathcal{N}[u; \lambda] = \lambda_1 u u_x - \lambda_2 u_{xx}$ and $\lambda = (\lambda_1, \lambda_2)$.
Here, the subscripts denote partial differentiation in either time or space.
Now, the problem of data-driven discovery of partial differential equations
poses the following question: given a small set of scattered and potentially
noisy observations of the hidden state u(t, x) of a system, what are the pa-
rameters λ that best describe the observed data?
In what follows, we will provide an overview of our two main approaches
to tackle this problem, namely continuous time and discrete time models, as
well as a series of results and systematic studies for a diverse collection of
benchmarks. In the first approach, we will assume the availability of scattered and potentially noisy measurements across the entire spatio-temporal domain.
In the latter, we will try to infer the unknown parameters λ from only two
data snapshots taken at distinct time instants. All data and codes used in
this manuscript are publicly available on GitHub at https://github.com/maziarraissi/PINNs.
As a first example, let us revisit the Burgers' equation and define the residual $f(t, x) := u_t + \lambda_1 u u_x - \lambda_2 u_{xx}$. We proceed by approximating u(t, x) by a deep neural network, which results in the physics informed neural network f(t, x). The shared parameters of the neural networks u(t, x) and f(t, x), along with the parameters λ = (λ1, λ2) of the differential operator, can be learned by minimizing the mean squared error loss
$$MSE = MSE_u + MSE_f, \qquad (5)$$
where
$$MSE_u = \frac{1}{N} \sum_{i=1}^{N} \left| u(t_u^i, x_u^i) - u^i \right|^2,$$
and
$$MSE_f = \frac{1}{N} \sum_{i=1}^{N} \left| f(t_u^i, x_u^i) \right|^2.$$
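To make this concrete, here is a minimal sketch of the continuous time setup for the Burgers' equation, written in PyTorch with illustrative names (BurgersPINN, mse_loss); it is not the authors' released implementation. A small fully connected network approximates u(t, x), the unknown coefficients λ1 and λ2 are trainable scalars, and the residual f and the loss (5) are built with automatic differentiation.

```python
import torch

class BurgersPINN(torch.nn.Module):
    """Network u(t, x) with the unknown PDE parameters lambda1, lambda2 as trainable scalars."""
    def __init__(self, width=20, depth=8):
        super().__init__()
        layers = [torch.nn.Linear(2, width), torch.nn.Tanh()]
        for _ in range(depth - 1):
            layers += [torch.nn.Linear(width, width), torch.nn.Tanh()]
        layers += [torch.nn.Linear(width, 1)]
        self.net = torch.nn.Sequential(*layers)
        self.lambda1 = torch.nn.Parameter(torch.tensor(0.0))  # initial guesses for the unknown parameters
        self.lambda2 = torch.nn.Parameter(torch.tensor(0.0))

    def forward(self, t, x):
        return self.net(torch.cat([t, x], dim=1))

def grad(y, z):
    # Derivative of y with respect to z; for networks applied row-wise this is the pointwise derivative.
    return torch.autograd.grad(y, z, torch.ones_like(y), create_graph=True)[0]

def mse_loss(model, t, x, u_obs):
    """MSE_u + MSE_f of equation (5) on scattered observations (t^i, x^i, u^i)."""
    t = t.clone().requires_grad_(True)
    x = x.clone().requires_grad_(True)
    u = model(t, x)
    u_t, u_x = grad(u, t), grad(u, x)
    u_xx = grad(u_x, x)
    f = u_t + model.lambda1 * u * u_x - model.lambda2 * u_xx  # Burgers' residual f(t, x)
    return torch.mean((u - u_obs) ** 2) + torch.mean(f ** 2)
```

Minimizing this loss with a full-batch optimizer such as L-BFGS [12] (or Adam) drives both the data misfit and the residual towards zero, and in doing so recovers λ1 and λ2 together with the network weights.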
Figure 1: Burgers' equation: Top: Predicted solution u(t, x) along with the training data.
Middle: Comparison of the predicted and exact solutions corresponding to the three tem-
poral snapshots depicted by the dashed vertical lines in the top panel. Bottom: Correct
partial differential equation along with the identified one obtained by learning λ1 and λ2 .
              % error in λ1                          % error in λ2
Nu \ noise    0%       1%       5%       10%        0%       1%       5%       10%
500          0.131    0.518    0.118    1.319     13.885    0.483    1.708    4.058
1000         0.186    0.533    0.157    1.869      3.719    8.262    3.481   14.544
1500         0.432    0.033    0.706    0.725      3.093    1.423    0.502    3.156
2000         0.096    0.039    0.190    0.101      0.469    0.008    6.216    6.391

Table 1: Burgers' equation: Percentage error in the identified parameters λ1 and λ2 for different numbers of training data Nu corrupted by different noise levels. Here, the neural network architecture is kept fixed to 9 layers and 20 neurons per layer.
                    % error in λ1                % error in λ2
Layers \ Neurons    10       20       40          10        20        40
2                 11.696    2.837    1.679     103.919    67.055    49.186
4                  0.332    0.109    0.428       4.721     1.234     6.170
6                  0.668    0.629    0.118       3.144     3.123     1.158
8                  0.414    0.141    0.266       8.459     1.902     1.552

Table 2: Burgers' equation: Percentage error in the identified parameters λ1 and λ2 for different numbers of hidden layers and neurons per layer. Here, the training data is considered to be noise-free and fixed to N = 2,000.
Our next example involves the Navier-Stokes equations, which describe the physics of many phenomena of scientific and engineering interest. They may be used to model the weather, ocean currents, water flow in a pipe, and air flow around a wing. The Navier-Stokes equations in their full and simplified forms help with the design of aircraft and cars, the study of blood flow, the design of power stations, the analysis of the dispersion of pollutants, and many other applications. Let us consider the Navier-Stokes equations in two dimensions (2D)1 given explicitly by
$$u_t + \lambda_1 (u u_x + v u_y) = -p_x + \lambda_2 (u_{xx} + u_{yy}),$$
$$v_t + \lambda_1 (u v_x + v v_y) = -p_y + \lambda_2 (v_{xx} + v_{yy}), \qquad (6)$$
where u(t, x, y) denotes the x-component of the velocity field, v(t, x, y) the y-component, and p(t, x, y) the pressure. Here, λ = (λ1, λ2) are the unknown

1 It is straightforward to generalize the proposed framework to the Navier-Stokes equations in three dimensions (3D).
parameters. Solutions to the Navier-Stokes equations are searched in the set
of divergence-free functions; i.e.,
$$u_x + v_y = 0. \qquad (7)$$
This extra equation is the continuity equation for incompressible fluids that
describes the conservation of mass of the fluid. We make the assumption
that
$$u = \psi_y, \quad v = -\psi_x, \qquad (8)$$
for some latent function ψ(t, x, y).2 Under this assumption, the continuity equation (7) will be automatically satisfied. Given noisy measurements $\{t^i, x^i, y^i, u^i, v^i\}_{i=1}^{N}$ of the velocity field, we are interested in learning the unknown parameters λ1 and λ2, as well as reconstructing the pressure field p(t, x, y).
2 This construction can be generalized to three dimensional problems by employing the notion of vector potentials.
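A practical appeal of the stream function construction is that the divergence-free constraint (7) never has to be imposed as a penalty: it holds by construction. The sketch below (PyTorch again; psi_p_net and the function names are illustrative assumptions, not the authors' released code) shows how a single network with outputs (ψ, p) yields u, v, and the two momentum residuals, denoted here f and g, of equations (6) via automatic differentiation.

```python
import torch

def grad(y, z):
    # Derivative of y with respect to z, keeping the graph so higher derivatives can follow.
    return torch.autograd.grad(y, z, torch.ones_like(y), create_graph=True)[0]

def velocity_and_pressure(psi_p_net, t, x, y):
    """psi_p_net maps (t, x, y) to (psi, p); u and v are derivatives of psi."""
    out = psi_p_net(torch.cat([t, x, y], dim=1))
    psi, p = out[:, 0:1], out[:, 1:2]
    u = grad(psi, y)    # u = psi_y
    v = -grad(psi, x)   # v = -psi_x, so u_x + v_y = psi_yx - psi_xy = 0 automatically
    return u, v, p

def ns_residuals(psi_p_net, lambda1, lambda2, t, x, y):
    """Momentum residuals f, g of the 2D Navier-Stokes equations (6)."""
    t, x, y = (z.clone().requires_grad_(True) for z in (t, x, y))
    u, v, p = velocity_and_pressure(psi_p_net, t, x, y)
    u_t, u_x, u_y = grad(u, t), grad(u, x), grad(u, y)
    v_t, v_x, v_y = grad(v, t), grad(v, x), grad(v, y)
    p_x, p_y = grad(p, x), grad(p, y)
    u_xx, u_yy = grad(u_x, x), grad(u_y, y)
    v_xx, v_yy = grad(v_x, x), grad(v_y, y)
    f = u_t + lambda1 * (u * u_x + v * u_y) + p_x - lambda2 * (u_xx + u_yy)
    g = v_t + lambda1 * (u * v_x + v * v_y) + p_y - lambda2 * (v_xx + v_yy)
    return f, g
```

The training loss then combines the data misfit on the measured velocities u^i, v^i with the mean squared residuals of f and g, while λ1 and λ2 are again treated as trainable parameters.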
Our specific setup concerns the prototypical problem of incompressible flow past a circular cylinder, which is known to exhibit rich dynamic behavior and transitions for different regimes of the Reynolds number Re = u∞ D/ν. Assuming a non-dimensional free stream velocity u∞ = 1, cylinder diameter D = 1, and kinematic viscosity ν = 0.01, the system exhibits a periodic steady state behavior characterized by an asymmetrical vortex shedding pattern in the cylinder wake, known as the Kármán vortex street [14].
Figure 2: Navier-Stokes equation: Top: Incompressible flow and dynamic vortex shedding past a circular cylinder at Re = 100. The spatio-temporal training data correspond to the depicted rectangular region in the cylinder wake. Bottom: Locations of training data-points for the stream-wise and transverse velocity components, u(t, x, y) and v(t, x, y), respectively.
Our results indicate that the proposed method is able to identify the unknown parameters λ1 and λ2 with very high accuracy, even when the training data are corrupted with noise. Specifically, for the case of noise-free training data, the errors in estimating λ1 and λ2 are 0.078% and 4.67%, respectively. The predictions remain robust even when the training data are corrupted with 1% uncorrelated Gaussian noise, returning errors of 0.17% and 5.70% for λ1 and λ2, respectively.
[Figure 3: side-by-side comparison of the predicted and exact pressure fields p(t, x, y) in the cylinder wake.]
In the discrete time approach, we employ implicit Runge-Kutta time-stepping schemes in which the number of stages q can be taken to be very large. Applying the general form of Runge-Kutta methods with q stages to equation (1) yields
$$u^{n+c_i} = u^n - \Delta t \sum_{j=1}^{q} a_{ij}\, \mathcal{N}[u^{n+c_j}; \lambda], \quad i = 1, \ldots, q,$$
$$u^{n+1} = u^n - \Delta t \sum_{j=1}^{q} b_j\, \mathcal{N}[u^{n+c_j}; \lambda]. \qquad (11)$$
Here, $u^{n+c_j}(x) = u(t^n + c_j \Delta t, x)$ for j = 1, . . . , q. This general form encapsulates both implicit and explicit time-stepping schemes, depending on the choice of the parameters $\{a_{ij}, b_j, c_j\}$. Equations (11) can be equivalently expressed as
$$u^n = u^n_i, \quad i = 1, \ldots, q,$$
$$u^{n+1} = u^{n+1}_i, \quad i = 1, \ldots, q, \qquad (12)$$
where
$$u^n_i := u^{n+c_i} + \Delta t \sum_{j=1}^{q} a_{ij}\, \mathcal{N}[u^{n+c_j}; \lambda], \quad i = 1, \ldots, q,$$
$$u^{n+1}_i := u^{n+c_i} + \Delta t \sum_{j=1}^{q} (a_{ij} - b_j)\, \mathcal{N}[u^{n+c_j}; \lambda], \quad i = 1, \ldots, q. \qquad (13)$$
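To see that nothing is lost in this reformulation, one can substitute the definitions (13) back into (12): the first set of conditions rearranges to the stage equations in (11), while the second recovers the update equation, since
$$u^{n+1} = u^{n+1}_i = u^{n+c_i} + \Delta t \sum_{j=1}^{q} (a_{ij} - b_j)\, \mathcal{N}[u^{n+c_j}; \lambda] = u^n - \Delta t \sum_{j=1}^{q} b_j\, \mathcal{N}[u^{n+c_j}; \lambda],$$
where the last equality uses $u^{n+c_i} = u^n - \Delta t \sum_{j=1}^{q} a_{ij}\, \mathcal{N}[u^{n+c_j}; \lambda]$ from the first set of conditions.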
We proceed by placing a multi-output neural network prior on
$$[u^{n+c_1}(x), \ldots, u^{n+c_q}(x)]. \qquad (14)$$
This prior assumption, along with equations (13), results in two physics informed neural networks,
$$[u^n_1(x), \ldots, u^n_q(x), u^n_{q+1}(x)], \qquad (15)$$
and
$$[u^{n+1}_1(x), \ldots, u^{n+1}_q(x), u^{n+1}_{q+1}(x)]. \qquad (16)$$
Given noisy measurements at two distinct temporal snapshots, $\{x^n, u^n\}$ and $\{x^{n+1}, u^{n+1}\}$, of the system at times $t^n$ and $t^{n+1}$, respectively, the shared parameters of the neural networks (14), (15), and (16), along with the parameters λ of the differential operator, can be trained by minimizing the sum of squared errors
$$SSE = SSE_n + SSE_{n+1}, \qquad (17)$$
where
$$SSE_n := \sum_{j=1}^{q} \sum_{i=1}^{N_n} \left| u^n_j(x^{n,i}) - u^{n,i} \right|^2,$$
and
$$SSE_{n+1} := \sum_{j=1}^{q} \sum_{i=1}^{N_{n+1}} \left| u^{n+1}_j(x^{n+1,i}) - u^{n+1,i} \right|^2.$$
Here, $x^n = \{x^{n,i}\}_{i=1}^{N_n}$, $u^n = \{u^{n,i}\}_{i=1}^{N_n}$, $x^{n+1} = \{x^{n+1,i}\}_{i=1}^{N_{n+1}}$, and $u^{n+1} = \{u^{n+1,i}\}_{i=1}^{N_{n+1}}$.
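As a concrete illustration of how this loss can be assembled for the Burgers' equation, the sketch below (PyTorch; stage_net, the Butcher tableau arrays A and b, and the array shapes are illustrative assumptions rather than the authors' released implementation) takes a network whose q outputs approximate the stage values $u^{n+c_j}(x)$, forms $u^n_i$ and $u^{n+1}_i$ through equations (13), and evaluates the loss (17).

```python
import torch

def burgers_operator(U, x, lambda1, lambda2):
    """N[u; lambda] = lambda1*u*u_x - lambda2*u_xx, applied column-wise to the q stage values."""
    d = lambda y: torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]
    cols = []
    for j in range(U.shape[1]):
        u = U[:, j:j + 1]
        u_x = d(u)
        u_xx = d(u_x)
        cols.append(lambda1 * u * u_x - lambda2 * u_xx)
    return torch.cat(cols, dim=1)                      # shape (N, q)

def discrete_sse(stage_net, A, b, dt, lambda1, lambda2, x_n, u_n, x_np1, u_np1):
    """SSE_n + SSE_{n+1} of equation (17); A is a (q, q) tensor, b a length-q tensor."""
    def stage_predictions(x):
        x = x.clone().requires_grad_(True)
        U = stage_net(x)                               # (N, q) stage values u^{n+c_j}(x)
        NU = burgers_operator(U, x, lambda1, lambda2)  # (N, q)
        U_n = U + dt * NU @ A.T                        # u^n_i of equations (13)
        U_np1 = U + dt * NU @ (A - b).T                # u^{n+1}_i of equations (13)
        return U_n, U_np1

    U_n_pred, _ = stage_predictions(x_n)
    _, U_np1_pred = stage_predictions(x_np1)
    # u_n and u_np1 have shape (N, 1): broadcasting compares every stage prediction against
    # the same measured value, which realizes the double sums in SSE_n and SSE_{n+1}.
    return torch.sum((U_n_pred - u_n) ** 2) + torch.sum((U_np1_pred - u_np1) ** 2)
```

Here A and b would hold the coefficients of an implicit Runge-Kutta scheme (e.g., Gauss-Legendre), and λ1 and λ2 would be wrapped as trainable parameters and optimized jointly with the weights of stage_net.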
Given merely two training data snapshots, the shared parameters of the neural networks (14), (15), and (16), along with the parameters λ = (λ1, λ2) of the Burgers' equation, can be learned by minimizing the sum of squared errors in equation (17). Here, we have created a training data-set comprising Nn = 199 and Nn+1 = 201 spatial points by randomly sampling the exact solution at time instants tn = 0.1 and tn+1 = 0.9, respectively. The training data are shown in the top and middle panels of figure 4. The neural network architecture used here consists of 4 hidden layers with 50 neurons each, while the number of Runge-Kutta stages q is empirically chosen to yield a temporal error accumulation of the order of machine precision ε by setting3
$$q = \frac{1}{2} \frac{\log \epsilon}{\log \Delta t}, \qquad (19)$$
3 This is motivated by the theoretical error estimates for implicit Runge-Kutta schemes suggesting a truncation error of $O(\Delta t^{2q})$ [16].
              % error in λ1                          % error in λ2
∆t \ noise    0%       1%       5%       10%        0%       1%       5%       10%
0.2          0.002    0.435    6.073    3.273      0.151    4.982   59.314   83.969
0.4          0.001    0.119    1.679    2.985      0.088    2.816    8.396    8.377
0.6          0.002    0.064    2.096    1.383      0.090    0.068    3.493   24.321
0.8          0.010    0.221    0.097    1.233      1.918    3.215   13.479    1.621

Table 3: Burgers' equation: Percentage error in the identified parameters λ1 and λ2 for different gap sizes ∆t between the two data snapshots and for different noise levels.
where the time-step for this example is ∆t = 0.8. The bottom panel of figure 4 summarizes the identified parameters λ = (λ1, λ2) for the cases of noise-free data, as well as data corrupted with 1% uncorrelated Gaussian noise. In both cases, the proposed algorithm is able to learn the correct parameter values λ1 = 1.0 and λ2 = 0.01/π with remarkable accuracy, despite the fact that the two data snapshots used for training are very far apart in time, and potentially describe different regimes of the underlying dynamics.
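For a rough sense of scale, assuming ε in equation (19) denotes double-precision machine epsilon, ε ≈ 2.22 × 10^{-16} (the text does not quote the resulting value of q), the gap ∆t = 0.8 gives
$$q = \frac{1}{2}\,\frac{\log(2.22 \times 10^{-16})}{\log 0.8} \approx \frac{1}{2} \cdot 161 \approx 81,$$
i.e., on the order of eighty Runge-Kutta stages. Since all stages are outputs of a single multi-output network, such large values of q remain computationally benign.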
Figure 4: Burgers' equation: Top: Solution u(t, x) along with the temporal locations of the
two training snapshots. Middle: Training data and exact solution corresponding to the
two temporal snapshots depicted by the dashed vertical lines in the top panel. Bottom:
Correct partial differential equation along with the identified one obtained by learning
λ1 , λ2 .
Our final example involves a mathematical model of waves on shallow water surfaces: the Korteweg-de Vries (KdV) equation. This equation can also be viewed as Burgers' equation with an added dispersive term, and its purpose here is to highlight the ability of the proposed framework to handle problems involving higher order derivatives. The KdV equation has several connections to physical problems. It describes the evolution of long one-dimensional waves in many physical settings, including shallow-water waves.
                    % error in λ1                % error in λ2
Layers \ Neurons    10       25       50          10        25        50
1                  1.868    4.868    1.960     180.373   237.463   123.539
2                  0.443    0.037    0.015      29.474     2.676     1.561
3                  0.123    0.012    0.004       7.991     1.906     0.586
4                  0.012    0.020    0.011       1.125     4.448     2.014

Table 4: Burgers' equation: Percentage error in the identified parameters λ1 and λ2 for different numbers of hidden layers and neurons in each layer.
The KdV equation reads
$$u_t + \lambda_1 u u_x + \lambda_2 u_{xxx} = 0, \qquad (20)$$
with λ = (λ1, λ2) being the unknown parameters. For the KdV equation, the nonlinear operator in equations (13) is given by
$$\mathcal{N}[u^{n+c_j}] = \lambda_1 u^{n+c_j} u^{n+c_j}_x + \lambda_2 u^{n+c_j}_{xxx},$$
and the shared parameters of the neural networks (14), (15), and (16) along
with the parameters λ = (λ1 , λ2 ) of the KdV equation can be learned by
minimizing the sum of squared errors (17).
To obtain a set of training and test data we simulated (20) using con-
ventional spectral methods. Specifically, starting from an initial condition
u(0, x) = cos(πx) and assuming periodic boundary conditions, we have inte-
grated equation (20) up to a final time t = 1.0 using the Chebfun package
[18] with a spectral Fourier discretization with 512 modes and a fourth-order
explicit Runge-Kutta temporal integrator with time-step $\Delta t = 10^{-6}$. Using this data-set, we then extract two solution snapshots at times tn = 0.2 and tn+1 = 0.8, and randomly sub-sample them using Nn = 199 and Nn+1 = 201 points
to generate a training data-set. We then use these data to train a discrete
time physics informed neural network by minimizing the sum of squared error
loss of equation (17) using L-BFGS [12]. The network architecture used here comprises 4 hidden layers, 50 neurons per layer, and an output layer predicting the solution at the q Runge-Kutta stages, i.e., $u^{n+c_j}(x)$, j = 1, . . . , q, where q is computed using equation (19) by setting ∆t = 0.6.
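The distinguishing feature of the KdV example is the third-order spatial derivative in the operator above. In a physics informed neural network this poses no special difficulty: the derivative is obtained by nesting automatic differentiation, as in the short sketch below (PyTorch, with illustrative names; not the authors' released code).

```python
import torch

def kdv_operator(u, x, lambda1, lambda2):
    """N[u; lambda] = lambda1*u*u_x + lambda2*u_xxx for one Runge-Kutta stage u(x)."""
    d = lambda y: torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]
    u_x = d(u)           # first spatial derivative
    u_xxx = d(d(u_x))    # third spatial derivative via two further nested autodiff calls
    return lambda1 * u * u_x + lambda2 * u_xxx
```

Here u would be one column of the stage network's output evaluated at inputs x carrying requires_grad=True, exactly as in the discrete time Burgers' sketch above; the rest of the loss (17) is assembled unchanged.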
Although a series of promising results was presented, the reader may per-
haps agree that this two-part treatise creates more questions than it answers.
In a broader context, and along the way of seeking further understanding of
such tools, we believe that this work advocates a fruitful synergy between
machine learning and classical computational physics that has the potential
to enrich both fields and lead to high-impact developments.
Figure 5: KdV equation: Top: Solution u(t, x) along with the temporal locations of the
two training snapshots. Middle: Training data and exact solution corresponding to the
two temporal snapshots depicted by the dashed vertical lines in the top panel. Bottom:
Correct partial differential equation along with the identified one obtained by learning
λ1 , λ2 .
Acknowledgements
This work received support from the DARPA EQUiPS grant N66001-15-2-4055, the MURI/ARO grant W911NF-15-1-0562, and the AFOSR grant FA9550-17-1-0013. All data and codes used in this manuscript are publicly available on GitHub at https://github.com/maziarraissi/PINNs.
References
[1] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[11] A. G. Baydin, B. A. Pearlmutter, A. A. Radul, J. M. Siskind, Automatic differentiation in machine learning: a survey, arXiv preprint arXiv:1502.05767 (2015).
[12] D. C. Liu, J. Nocedal, On the limited memory BFGS method for large
scale optimization, Mathematical programming 45 (1989) 503–528.
[17] T. Dauxois, Fermi, Pasta, Ulam and a mysterious lady, arXiv preprint
arXiv:0801.1590 (2008).