Energy-Based PDE Solutions with DNNs
b Institute of Structural Mechanics, Bauhaus-Universität Weimar, Marienstraße 15, 99423 Weimar
c Institute of Continuum Mechanics, Leibniz Universität Hannover, Appelstraße 11, 30167 Hannover, Germany
Abstract
Partial differential equations (PDEs) are fundamental to the mathematical modelling of many phenomena in science and engineering. Solving them is a crucial step towards a precise knowledge of the behaviour of natural and engineered systems. In general, analytical methods are not enough to solve PDEs that represent real systems to an acceptable degree, and one has to resort to discretization methods. For engineering problems, probably the best known option is the finite element method (FEM), although powerful alternatives such as mesh-free methods and isogeometric analysis (IGA) are also available. The fundamental idea is to approximate the solution of the PDE by means of functions specifically built to have some desirable properties. In this contribution, we explore deep neural networks (DNNs) as an option for approximation. They have shown impressive results in areas such as visual recognition. DNNs are regarded here as function approximation machines. There is great flexibility in defining their structure, and important advances in their architecture and in the efficiency of the algorithms that implement them make DNNs a very interesting alternative for approximating the solution of a PDE. We concentrate on applications of interest to computational mechanics. Most contributions exploring this possibility have adopted a collocation strategy. Here, instead, we concentrate on mechanical problems and analyze the energetic format of the PDE: the energy of a mechanical system seems to be the natural loss function for a machine learning method approaching a mechanical problem. As proofs of concept, we deal with several problems and explore the capabilities of the method for applications in engineering.
Keywords: Physics informed, Deep neural networks, Energy approach
∗
Corresponding author.
Email address: [Link]@[Link] (T. Rabczuk)
1. Introduction
Computational mechanics aims at solving mechanical problems using computer methods.
These mechanical problems can originate from the study of either natural or engineered
systems. In order to describe their behaviour in a precise manner, mathematical models
have to be devised. In engineering applications, these mathematical models are often based
on partial differential equations (PDEs). When realistic models are considered, one has to
resort to numerical methods to solve them. The idea is to look for an approximate solution
for the problem, in a finite-dimensional space. Then, the problem reduces to find a finite set
of parameters that define this approximate solution. Conventional ways to tackle the solution
of PDEs are the finite element method (FEM) [1], mesh-free methods [2], and isogeometric
analysis [3].
In recent times, the use of deep neural networks (DNNs) has led to outstanding achievements in several areas, such as visual recognition [4]. Feed-forward neural networks are devised to approximate target functions. The approximating functions depend on certain parameters (the weights and the biases) that have to be "learned" by means of a "training" process. It is therefore conceivable that neural networks may be used to approximate the solution of a PDE. This is the perspective that we adopt with respect to DNNs in this article: they are function approximation machines.
The success of DNNs in important learning tasks can be related to the development of
very powerful computational tools. Libraries such as Tensorflow [5] and PyTorch [6] provide
the building blocks to devise learning machines for very different problems. Interfaces that allow coding in readable languages such as Python, together with the availability of optimized numerical algorithms, lead to a near-mathematical notation, reminiscent of computing platforms like FEniCS [7].
The idea of using DNNs to solve PDEs has been pursued in several works [8]. In general, however, they have dealt with the strong form of the PDE, leading to an approach based on collocation [9]. The training process then relies on devising an objective function, the empirical loss function, whose minimization leads to the fulfilment of the governing equations. Moreover, the problems dealt with are generally not directly related to engineering applications.
A way to approach the solution of a PDE is to write it as a variational problem [10]. From
the mechanical point of view, the corresponding functional has the meaning of an energy.
Given the fact that the training process in machine learning can be regarded as a process of
minimizing the loss function, it seems natural to regard the energy of the system as a very
good candidate for this loss function. In addition, the near-mathematical syntax achieved by the platforms associated with TensorFlow and PyTorch implies a high degree of readability and ease of implementation of the PDE solver. Moreover, once the variational problem is approximated in a finite-dimensional space, it becomes an optimization problem, which is very convenient given that machine learning libraries are especially oriented towards optimization techniques.
It is worth mentioning that our approach aims at solving PDEs by means of DNNs as an approximation strategy. In that respect, what we propose is different from approaches such as [11], which use labeled data from numerical simulations (although it could be obtained from experiments, in principle) to help the solution of a boundary value problem in some specific aspect, where detailed knowledge of the phenomenon being modelled is lacking. For instance, [12] replaces the constitutive model with a data-driven model. In summary, they use machine learning to build a surrogate model, while we use it to build the approximation space.
In this work, we explore the possibility of a DNN-based solver for PDEs. The paper is organized as follows. In Section 2, we introduce the reader to the generalized problem setup. Section 3 provides a brief introduction to DNNs. The strategies for solving PDEs with DNNs based on the collocation method and the deep energy method are explained in Section 4. The implementation is explained in Section 5, using snippets from the code. Some representative applications in computational mechanics are tackled in Section 6 to explore the possibilities of this approach. Finally, Section 7 concludes the study by summarizing the key results of the present work.
which for sufficiently smooth u can be solved by a collocation-type method as detailed in
Section 4.
where u is constrained by Dirichlet boundary conditions. Here, we will assume that the total variational energy of the system, E, is such that there exists a unique solution of the problem defined in Eq. (6), and that it coincides with the solution of Eq. (1). In that case, Eq. (1) is called the Euler-Lagrange equation of the variational problem defined in Eq. (6).
The first step to go from the variational energy formulation to the corresponding strong
form is to find a stationary state of E by equating its first variation to zero:
δE[u, δu] = (d/dτ) E[u + τ δu] |_{τ=0} = 0,    (7)
which has to hold for any admissible δu (i.e., for any smooth enough function having homoge-
neous boundary conditions on the Dirichlet part of the boundary). The resulting expression
is the so-called weak form of the problem defined by Eq. (1) plus appropriate boundary
conditions. In mechanical problems, this corresponds to the principle of virtual work. The
weak form is the point of departure of important discretization methods such as the FEM.
A characteristic feature of the approach used in this contribution is that it deals directly with the minimization of the variational energy E, circumventing the need to explicitly derive the weak form from the energy of the system.
In order to illustrate this approach, let us again consider a linear elastic body. One can define the stored elastic strain energy, Ψ(ε), as:

Ψ(ε) = (1/2) ε : C : ε.    (8)
Notice that the Cauchy stress tensor, σ(ε), can be computed as follows:

σ(ε) = ∂Ψ/∂ε.    (9)
Then, for the case of zero body forces and homogeneous Neumann boundary conditions,
we can define the energy of the solid as
E[u] = ∫_Ω Ψ(ε(u)) dΩ.    (10)
Given an input x, an ANN would map it to an output up (x). The functional structure
of the ANN is defined by a set of parameters (which can be collected in a vector, p). Deep
Neural Networks are ANNs obtained by means of the composition of simple functions:
up (x) = AL (σ(AL−1 (σ(. . . σ(A1 (x)) . . .)))), (11)
where A_l (with l = 1, 2, . . . , L) are affine mappings and σ is the activation function, applied element-wise. It is important to note that the non-linearity of DNNs comes from the activation function. Due to the graphical representation of the composition, each of these functions is called a layer, where L denotes the number of layers of the network. So, loosely speaking, we say that a DNN is an ANN with more than one layer, barring the input and the output layers.
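To make the composition in Eq. (11) concrete, here is a minimal NumPy sketch of the forward pass; the tiny 1-2-1 network and its weights are illustrative choices, not values from the paper.

```python
import numpy as np

def relu(z):
    # element-wise activation function sigma
    return np.maximum(z, 0.0)

def dnn_forward(x, weights, biases):
    # Evaluate u_p(x) = A_L(sigma(A_{L-1}(... sigma(A_1(x)) ...))):
    # every hidden layer is an affine map followed by the activation,
    # the output layer is an affine map only.
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    return a @ weights[-1] + biases[-1]

# Hypothetical 1-2-1 network; these weights happen to realize u_p(x) = |x|.
weights = [np.array([[1.0, -1.0]]), np.array([[1.0], [1.0]])]
biases = [np.zeros(2), np.zeros(1)]
u = dnn_forward(np.array([[2.0]]), weights, biases)   # u[0, 0] = 2.0
```

Note that even this two-layer ReLU network produces a non-smooth, piecewise-linear function, which is why the choice of activation matters for approximating PDE solutions.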
Each affine mapping Al can be defined by a (in general not square) matrix and a vector.
The elements of the matrix are called weights and the elements of the vector are called biases.
They together parametrize the function up (x), and can be grouped in a vector p. In the
following, we consider feed-forward fully-connected DNNs, where each neuron is connected
to all the neurons in the adjacent layers. Other types of neural networks can be considered
as well, such as convolutional neural networks [13], where the layers and the connections between them are organized in a hierarchical structure.
Once the architecture of the network has been decided, that is, once the number of layers,
the number of neurons per layer, and the activation functions have been frozen, defining
up (x) boils down to determining p. The process of determining these parameters (the
weights and the biases) is called training the network. Usually, when solving problems
with machine learning, this entails having a large amount of input and output data. In the
approach used here, data for training are generated from evaluations of the PDE at given points in the domain, as well as at certain points on the boundary. This issue is explained in more detail below.
4. Solution strategies
4.1. Collocation Methods
Collocation methods are based on the solution of the BVP in strong form. Given a set of
points {x1 , x2 , . . . , xn } in the domain Ω, the idea is to evaluate N [up ] at each point xi and
construct a loss function whose optimization tends to impose the condition N [up ](xi ) = 0.
For instance, loss functions based on the mean square error of N [up ](xi ) can be used.
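For illustration (a sketch, not the paper's implementation), such a mean-squared collocation loss can be written as follows; the operator N[u] = u'' + u and the trial u_p(x) = sin(x) are hypothetical choices.

```python
import numpy as np

def collocation_loss(residual, points):
    # Mean squared error of the strong-form residual N[u_p] at the
    # collocation points x_1, ..., x_n.
    r = np.array([residual(x) for x in points])
    return np.mean(r ** 2)

# Hypothetical operator N[u] = u'' + u with the trial u_p(x) = sin(x):
# the residual -sin(x) + sin(x) vanishes, so the loss is zero at the
# exact solution and positive otherwise.
residual = lambda x: -np.sin(x) + np.sin(x)
points = np.linspace(0.0, np.pi, 20)
loss = collocation_loss(residual, points)   # 0.0
```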
where up(xi) is the displacement approximation function evaluated at the integration point xi, and wi its corresponding weight. A key issue is the prescription of the boundary conditions. In a variational energy formulation, Neumann boundary conditions can in principle be imposed in a natural way. However, Dirichlet boundary conditions have to be imposed in a somewhat more involved way, since DNN approximation functions do not fulfill the Kronecker delta property. We will address these issues in Section 6.
In machine learning, first order methods are almost universal. In fact, one may say that most
optimization strategies in machine learning are variants of the so-called gradient descent
method. A typical iteration of this method is the following:
p_{k+1} = p_k − γ_k ∇_p ( Σ_i f_i(p_k) ),    (15)
where u_p^k is the DNN approximation of the unknown u parametrized by the vector p_k, which groups all the weights and biases at iteration k. In addition, γ_k stands for a parameter called the learning rate. Several variations of this algorithm can be considered. First of all, not all the
points are always considered. In fact, stochasticity in the selection of the points considered
in the finite sum is a general practice, resulting in the so-called stochastic gradient descent
(SGD) algorithm. Acceleration strategies are also very common. Another very common
variant implies the addition of a penalty term to avoid unbounded values of the parameters.
Sometimes, even some approximation of second order information is included, for example,
by means of the L-BFGS method, a quasi-Newton method.
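A toy sketch of one descent iteration follows; the quadratic per-point losses and their targets are hypothetical, chosen so that the minimizer of the summed loss is known in advance.

```python
import numpy as np

def gradient_descent_step(p, grad_f, batch, lr):
    # One descent iteration: p is updated against the summed gradient of the
    # per-point loss contributions f_i. Sampling `batch` randomly from all
    # points would give stochastic gradient descent (SGD).
    g = sum(grad_f(p, i) for i in batch)
    return p - lr * g

# Hypothetical per-point losses f_i(p) = (p - t_i)^2 / 2 with known targets,
# so the minimizer of the summed loss is the mean of the targets.
targets = np.array([1.0, 3.0])
grad_f = lambda p, i: p - targets[i]
p = 0.0
for _ in range(100):
    p = gradient_descent_step(p, grad_f, batch=[0, 1], lr=0.1)
# p has converged to the mean of the targets, 2.0
```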
One important issue related to the optimization problem is the structure of the search
space. Recent mathematical results describe the presence of highly unfavorable properties of
the topological structure of DNN spaces. This means that, although the space of functions
generated by a DNN can be very expressive, the structure of the search space may cause
difficulties for optimization algorithms [16]. However, some unexpected properties of gradient-descent-based algorithms seem to be able to overcome these difficulties [17], at least for some applications. This is an issue that is currently the object of intense research.
5. Implementation
In this section, we illustrate how the proposed DEM approach can be used for solving
problems in the field of continuum mechanics. The primary goal of the approach is to
obtain the field variables by solving the governing partial differential equation, which in turn requires computing the integral in Eq. (10). We do this by using machine learning
tools available in open-source libraries like TensorFlow and PyTorch. This allows us to use
collective knowledge accumulated in a very vibrant community.
We describe the typical steps that one would have to follow for a TensorFlow implemen-
tation. The idea is to illustrate the general procedure. Specific details for each application
are given below. The full source code will be published on GitHub or can be obtained by
contacting the authors.
First, the geometry is generated using uniformly spaced points in the whole domain to obtain the training points, Xf, and the prediction points, xpred. Before we start to train the
network, we initialize the weights of the network randomly from a Gaussian distribution
using the Xavier initialization technique [18]. The network is initialized in the following
manner:
def initialize_NN(self, layers):
    # 'layers' lists the number of neurons per layer, e.g. [2, 30, 30, 30, 2]
    weights = []
    biases = []
    num_layers = len(layers)
    for l in range(0, num_layers - 1):
        # Xavier initialization for the weights, zeros for the biases
        W = self.xavier_init(size=[layers[l], layers[l + 1]])
        b = tf.Variable(tf.zeros([1, layers[l + 1]]), dtype=tf.float32)
        weights.append(W)
        biases.append(b)
    return weights, biases
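The helper self.xavier_init used above is not shown in the snippet; a possible NumPy sketch of the Xavier (Glorot) initialization it refers to [18] is given below (in the actual TensorFlow code it would return a tf.Variable rather than a plain array).

```python
import numpy as np

def xavier_init(size):
    # Xavier (Glorot) initialization: zero-mean Gaussian with standard
    # deviation sqrt(2 / (fan_in + fan_out)), as proposed in [18].
    fan_in, fan_out = size
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return np.random.normal(0.0, std, size=(fan_in, fan_out))

W = xavier_init([30, 30])   # one 30x30 hidden-layer weight matrix
```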
6. Applications
In this section, we explore the application of DEM to solve PDEs in various domains of continuum mechanics. The first subsection considers a one-dimensional problem, and the results obtained using DEM are compared to the available analytical solution. Later, we discuss the application of DEM to linear elasticity and hyperelasticity. In the latter part of the section, we solve PDEs for coupled problems: we address phase-field modeling of fracture, piezoelectricity and, lastly, the bending of a Kirchhoff plate.
The implementation has been carried out using the TensorFlow framework [19]. Details on the network architecture, such as the number of layers, the number of neurons in each layer, and the activation function used, have been provided with each example.
These results have been extended to more general Finite Element spaces using Sobolev norms
in [21]. Here we illustrate the relationship between FE approximations and DNNs by means
of a one-dimensional example, as done in [20].
Let us consider a discretization of the interval [0, l] in the real line: x0 < x1 < . . . < xn ,
where x0 = 0 and xn = l. Then, consider a piecewise linear FE approximation of a function
u(x) taking values on [0, l]:
u_h(x) = Σ_{i=0}^{n} u_i N_i(x),    (16)
where Ni (x) is a linear hat function with compact support in the subinterval [xi−1 , xi+1 ], such
that Ni (xi ) = 1. For i = 0 and i = n, Ni (x) has compact support in [x0 , x1 ] and [xn−1 , xn ],
respectively. Function N_i(x), for intermediate nodes, can be expressed in terms of rectified linear units (ReLU) as follows:
with
w_i = ∆_{i+1}/h_{i+1} − ∆_i/h_i,    (21)
Notice that uh is parametrized by the weights wi , which can be grouped in a vector w,
and by the position of the nodes xi (the biases of the hidden layer), which can be grouped
in the vector x_p. For fixed x_p, and prescribing u_0, the values of the elements of w are defined by the parameters u_i, i.e., the values of u_h evaluated at the nodes x_i.
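To illustrate this correspondence, the following NumPy sketch (an illustration, not code from the paper) evaluates a piecewise-linear FE interpolant as a one-hidden-layer ReLU network with the weights of Eq. (21); the nodes and nodal values are hypothetical.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def fe_interpolant_as_relu_net(x, nodes, values):
    # Piecewise-linear FE interpolant written as a one-hidden-layer ReLU
    # network: u_h(x) = u_0 + sum_i w_i ReLU(x - x_i), where the weights
    # w_i of Eq. (21) are differences of slopes of adjacent elements.
    slopes = np.diff(values) / np.diff(nodes)   # element slopes Delta_i / h_i
    w = np.diff(slopes, prepend=0.0)            # w_i = slope jump at node x_i
    return values[0] + sum(wi * relu(x - xi) for wi, xi in zip(w, nodes[:-1]))

# Hypothetical nodes and nodal values; the ReLU form interpolates them exactly.
nodes = np.array([0.0, 1.0, 2.5, 4.0])
values = np.array([0.0, 2.0, 1.0, 3.0])
u_mid = fe_interpolant_as_relu_net(2.5, nodes, values)   # 1.0
```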
Consider the approximation in Eq. (20). Then, if Ψ(ε) is the one-dimensional linear elastic energy density, we can define an energy

E(w, x_p) = ∫_0^l Ψ(ε(u_h(x; w, x_p))) dx,    (22)
in terms of the weights vector w and a vector containing the positions of the nodes, x_p. For fixed x_p, the problem would be equivalent to solving a standard FE problem (remember the relationship between w and the nodal FE unknowns). When x_p is not fixed, from the mathematical point of view, we would be solving an adaptive FE problem. However, since the problem is now non-linear (and probably non-convex), there may be issues related to the use of typical machine learning algorithms to optimize the loss function defined in Eq. (23).
Although this 1D problem is illustrative of the meaning of the parameters of a DNN approximation, it is also clear that, in general situations, assigning physical meaning to these parameters would be very difficult. This may be a consequence of the mesh-free character of DNN approximations for the solution of PDEs. This has several implications.
Collocation points, for instance, are not to be confused with nodes in a Finite Element mesh.
Points in a FE mesh are tightly related to the parameters of the approximation space, while
collocation points are the inputs for the training process in the DNN approximation.
6.2.1. Pressurized thick-cylinder
The first example considers the benchmark problem of a thick cylinder subjected to internal pressure under plane stress conditions [22]. Due to symmetry in the geometry and
boundary conditions, only a quarter section represented by an annulus is analyzed. The
cylinder is subjected to internal pressure applied on the inner circular edge. The geometrical
setup and the boundary conditions are shown in Fig. 1(a). In this example, we have considered E = 1 × 10⁵ and ν = 0.3. The analytical solution, in terms of the stress components,
is [23]:
σ_rr = (R_a² P)/(R_b² − R_a²) (1 − R_b²/r²),
σ_θθ = (R_a² P)/(R_b² − R_a²) (1 + R_b²/r²),
σ_rθ = 0,    (25)
u_rad = (R_a² P r)/(E (R_b² − R_a²)) [1 − ν + (R_b²/r²)(1 + ν)],
u_exact = u_rad cos θ,
v_exact = u_rad sin θ,

where R_a and R_b represent the inner and the outer radius of the cylinder, respectively, with R_a = 1 and R_b = 4. P is the pressure exerted along the inner circular edge. In Eq. (25), u_exact and v_exact are the analytical solutions of the displacement field along the x and y-axis, respectively.
The minimization problem reads as stated in Eq. (65). The Dirichlet and Neumann boundary
conditions for the thick cylinder under internal pressure are:
u(0, y) = v(x, 0) = 0,
tN,x (Ra , θ) = P u, (26)
tN,y (Ra , θ) = P v,
where u and v are the solutions of the elastic field in x and y-axis, respectively and tN (Ra , θ)
denotes the traction force which is expressed in terms of the applied internal pressure and
the displacement field. To find the solution of the displacement field, the trial solution is
defined as:
u = x û,
v = y v̂,    (27)
where û and v̂ are obtained from the neural network. The solutions u and v are chosen in such a way that they satisfy the Dirichlet boundary conditions by construction, as in [24, 25]. As a consequence, no component corresponding to the boundary loss is needed in the loss function. This significantly simplifies the objective function to be minimized. We have used a network with
3 hidden layers of 30 neurons each and Nx × Ny uniformly spaced points in the interior of the
annulus, with Nx = Ny = 80. The inner circular edge is discretized using NBound uniformly
spaced points to apply the internal pressure in terms of traction force, with NBound = 80.
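The effect of the trial solution in Eq. (27) can be sketched as follows; the stand-in "network outputs" below are arbitrary illustrative functions, not outputs of the trained model.

```python
import numpy as np

def trial_displacement(x, y, u_hat, v_hat):
    # Trial solution of Eq. (27): multiplying the raw network outputs by x
    # and y makes u(0, y) = v(x, 0) = 0 hold by construction.
    return x * u_hat(x, y), y * v_hat(x, y)

# Arbitrary stand-ins for the network outputs; the symmetry conditions
# on the axes hold exactly regardless of their values.
u_hat = lambda x, y: np.cos(x) + y
v_hat = lambda x, y: x * y + 1.0
u0, v0 = trial_displacement(0.0, 2.0, u_hat, v_hat)   # u0 = 0.0
u1, v1 = trial_displacement(3.0, 0.0, u_hat, v_hat)   # v1 = 0.0
```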
For the input layer and the first two hidden layers, we have considered the ReLU2 activation function, whereas for the last layer, connected to the output layer, a linear activation function has been used.

Figure 1: Setup and numerical results for the pressurized thick cylinder example: (a) geometrical setup and boundary conditions; (b) displacement in x-axis.
Figure 2: (a) Error in the displacement field in x-axis; (b) error in the displacement field in y-axis.
where V_exact corresponds to the results obtained using the analytical solution of the field variable V, and V_err = V_exact − V_pred, where V_pred is obtained from the neural network. Relative prediction errors of 0.5% and 3.7% have been observed in the displacement and the energy norms, respectively. Fig. 2 shows the errors in the displacement field.
Figure 3: Problem setup for the plate with a circular hole example
Figure 4: (a) Displacement in x-axis; (b) displacement in y-axis; (c) error in the displacement field in x-axis; (d) error in the displacement field in y-axis.
Figure 5: Geometrical setup and boundary conditions for the application of linear elasticity to three-dimensional structures: (a) hollow sphere under internal pressure; (b) cube with a spherical hole.
where û, v̂ and ŵ are obtained from the neural network. To obtain the displacement field,
we have used a network with 3 hidden layers of 50 neurons each and Nx × Ny × Nz uniformly
spaced points in the interior of the domain, Nx = Ny = 80 and Nz = 10. The boundary
face on which the internal pressure, in terms of traction force, is applied is discretized into
NBound points, with NBound = 1600. For the first two layers, we have considered the ReLU2 activation function, whereas for the last layer, a linear activation function has been considered. For a three-dimensional problem, the loss function is computed as
L_elastic = L_int − L_neu,

L_int = (V_Ω / (N_x N_y N_z)) Σ_{i=1}^{N_x×N_y×N_z} f(x_i),

L_neu = (V_∂Ω / N_Bound) Σ_{i=1}^{N_Bound} f_neu(x_i),    (37)

f(x) = (1/2) ε : C : ε,

f_neu(x) = t_{N,x}(x) u + t_{N,y}(x) v + t_{N,z}(x) w,
where L_int and L_neu denote the elastic strain energy loss and the Neumann boundary loss, respectively, and V_Ω is the volume of the analyzed domain. The predicted displacement field and the strain energy obtained using the DEM approach are shown in Fig. 6.
Similar to the previous examples, the L2 error is computed to check the accuracy of the solution. Prediction errors of 1.05% and 4.44% have been observed in u and in the strain energy, respectively.
Figure 6: Computed solution for the hollow sphere subjected to internal pressure: (a) displacement; (b) strain energy.
hole is analyzed as shown in Fig. 5(b). The analytical stresses in spherical coordinates for
this are [28]:
σ_rr = S cos²θ + S/(7 − 5ν) [ (a³/r³)(6 − 5(5 − ν) cos²θ) + (6a⁵/r⁵)(3 cos²θ − 1) ],

σ_φφ = 3S/(2(7 − 5ν)) [ (a³/r³)(5ν − 2 + 5(1 − 2ν) cos²θ) + (a⁵/r⁵)(1 − 5 cos²θ) ],

σ_θθ = S sin²θ + S/(2(7 − 5ν)) [ (a³/r³)(4 − 5ν + 5(1 − 2ν) cos²θ) + (3a⁵/r⁵)(3 − 7 cos²θ) ],    (38)

σ_rθ = S [ −1 + 1/(7 − 5ν) ( 12a⁵/r⁵ − 5a³(1 + ν)/r³ ) ] sin θ cos θ,
where S denotes the applied uniaxial tension and a is the radius of the spherical hole. The material properties considered for this example are E = 1 × 10³ and ν = 0.3. The Dirichlet
boundary conditions are:
u(x, y, 0) = v(x, y, 0) = w(x, y, 0) = 0, (39)
where u, v and w are the solutions of the elastic field in the x, y, and z-axis, respectively. To ensure that the boundary conditions are exactly satisfied, we use the same trial function for the displacement field as stated in Eq. (36). For obtaining the displacement, we have
used a fully connected neural network with 3 hidden layers of 50 neurons each. Similar to the previous example, the ReLU2 activation function has been used for the first two layers and a linear activation function for the last layer. The domain is discretized into NInt points in the
interior of the domain, with NInt = 32000. The boundary face on which the internal pressure
in terms of traction force is applied is discretized into NBound points, with NBound = 1600.
Figure 7: Computed solution for the cube with a spherical hole: (a) displacement; (b) strain energy.
6.3. Elastodynamics
To investigate the approximation properties of neural networks for time-dependent prob-
lems, we consider a wave propagation example. In one dimension, the governing (equilibrium)
equation is of the form:

∂²u/∂t² = c² ∂²u/∂x²,  for x ∈ Ω := (a, b) and t ∈ (0, T),    (41)
together with the initial conditions
u(x, 0) = u_0(x) and u_t(x, 0) = v_0(x)  for x ∈ Ω.    (42)
Figure 8: Comparison between the computed and exact solutions for the 1D wave propagation example: (a) computed and exact displacement; (b) computed and exact velocity.
corresponding to the equilibrium equation together with the initial and boundary conditions
as:
L(p) = Lresid (p) + Linit (p) + Lbnd (p), (47)
where
L_resid(p) = (1/N_int) Σ_{i=1}^{N_int} ( û_tt(x_i^int, t_i^int; p) − c² û_xx(x_i^int, t_i^int; p) )²,    (48)

L_init(p) = (1/N_init) Σ_{j=1}^{N_init} [ ( û(x_j^init, 0; p) − u_0(x_j^init) )² + ( û_t(x_j^init, 0; p) − v_0(x_j^init) )² ],    (49)

L_bnd(p) = (1/N_bnd) Σ_{k=1}^{N_bnd} [ ( û_x(0, t_k^bnd; p) − ū(t_k^bnd) )² + ( û(L, t_k^bnd; p) − g(t_k^bnd) )² ].    (50)
Here û(x, t; p) is the neural network approximation to u(x, t), determined by the parameters p obtained by minimizing the loss function. Moreover, (x_i^int, t_i^int) for i = 1, . . . , N_int are the interior collocation points, x_j^init with j = 1, . . . , N_init are the points in the space domain, and t_k^bnd, k = 1, . . . , N_bnd are the points in time, corresponding to the initial and boundary conditions, respectively.
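As a consistency check of the residual term in Eq. (48), the following sketch uses analytic second derivatives of an exact standing wave in place of the automatic differentiation of the network; the wave speed c = 2 and the particular solution are illustrative choices.

```python
import numpy as np

c = 2.0   # illustrative wave speed

# The standing wave u(x, t) = sin(pi x) cos(pi c t) solves u_tt = c^2 u_xx.
# Its analytic second derivatives stand in here for the automatic
# differentiation of the network approximation û.
u_tt = lambda x, t: -(np.pi * c) ** 2 * np.sin(np.pi * x) * np.cos(np.pi * c * t)
u_xx = lambda x, t: -np.pi ** 2 * np.sin(np.pi * x) * np.cos(np.pi * c * t)

def residual_loss(x_int, t_int):
    # L_resid of Eq. (48): mean squared PDE residual at the interior points.
    r = u_tt(x_int, t_int) - c ** 2 * u_xx(x_int, t_int)
    return np.mean(r ** 2)

x_int, t_int = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
loss = residual_loss(x_int.ravel(), t_int.ravel())   # zero up to round-off
```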
The approximation obtained with N_int = 1992, N_init = 199, and N_bnd = 200 collocation points, as well as the errors in the computed displacement and velocity, are shown in Figure 8. The relative errors in the L2 norm for the displacement and velocity are 1.422382 · 10⁻³ and 5.954601 · 10⁻³, respectively.
6.4. Hyperelasticity
In the context of elastostatics at finite deformation (shown in Figure 9), the equilibrium equation, along with the boundary conditions in the initial configuration, is given as:
Equilibrium: ∇ · P + f_b = 0,    (51)
Dirichlet boundary: u = ū on ∂Ω_D,    (52)
Neumann boundary: P · n = t̄ on ∂Ω_N,    (53)

in which ū and t̄ are the prescribed values on the Dirichlet and the Neumann boundary, respectively. Therein, the boundaries have to fulfill ∂Ω_D ∪ ∂Ω_N = ∂Ω and ∂Ω_D ∩ ∂Ω_N = ∅.
The outward unit normal vector is denoted by n. The 1st Piola-Kirchhoff stress tensor P is related to its power conjugate F, the so-called deformation gradient tensor, by the constitutive equation P = ∂Ψ/∂F. The deformation gradient is defined as follows:
F = Grad ϕ(X), (54)
where ϕ denotes the mapping of material points in the initial configuration to the current
configuration. It is defined as:
ϕ(X) := x = X + u.
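A minimal sketch of Eq. (54) with ϕ(X) = X + u follows; the simple-shear displacement field is an illustrative choice, not an example from the paper.

```python
import numpy as np

def deformation_gradient(grad_u):
    # F = Grad(phi) = I + Grad(u), since phi(X) = X + u(X). In DEM, grad_u
    # would come from automatic differentiation of the network displacement.
    return np.eye(3) + grad_u

# Simple shear u = (gamma * X_2, 0, 0): an illustrative displacement field.
gamma = 0.3
grad_u = np.zeros((3, 3))
grad_u[0, 1] = gamma
F = deformation_gradient(grad_u)
J = np.linalg.det(F)   # J = 1: simple shear preserves volume
```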
Since the strain energy density Ψ describing the elastic energy stored in the body is explicitly
known, it is more convenient to find the possible deformations in a form of potential energy
Figure 9: Motion of body B.
where H is the set of admissible functions (trial functions). The method is named the
principle of minimum total potential energy.
Since our objective is to minimize the potential energy with the aid of neural networks, we need to cast the minimization as an optimization problem in the context of machine learning. Therefore, the definition of a loss function is essential. In this case, we exploit the potential energy as the loss function. It reads:
L(p) = ∫_Ω Ψ(ϕ(X; p)) dV − ∫_Ω f_b · ϕ(X; p) dV − ∫_{∂Ω_N} t̄ · ϕ(X; p) dA,    (57)
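In practice the loss in Eq. (57) is approximated by averaging over integration points, as in the earlier examples; the sketch below uses made-up sample values only to illustrate the bookkeeping of the three terms.

```python
import numpy as np

def potential_energy_loss(psi_vals, body_vals, trac_vals, vol, area):
    # Riemann-sum approximation of the three integrals in the loss:
    # stored energy minus the work of body forces and boundary tractions.
    internal = vol * np.mean(psi_vals)     # approximates the Psi integral
    body_work = vol * np.mean(body_vals)   # approximates the body-force work
    trac_work = area * np.mean(trac_vals)  # approximates the traction work
    return internal - body_work - trac_work

# Made-up sample values: internal = 3, body work = 1, traction work = 1.
loss = potential_energy_loss(np.array([2.0, 4.0]), np.array([1.0, 1.0]),
                             np.array([0.5]), vol=1.0, area=2.0)   # 1.0
```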
Figure 10: Geometry of the cuboid, with L = 1.25, W = 1, H = 1, and prescribed displacements u|Γ0 and u|Γ1.
With the prescribed boundary conditions, the displacement has the form

u_p(X) = (X_1/1.25) u|_{Γ1} + (X_1 − 1.25) X_1 û(X; p).    (63)
Setup. We generate 64000 equidistant grid points (N_1 = 40, N_2 = 40, N_3 = 40) over the entire domain as the training data for our network (Figure 11). A 5-layer network (3 − 30 − 30 − 30 − 3) is constructed, with 3 neurons in the input layer for the nodal coordinates and 3 neurons in the output layer for the unconstrained solution. Moreover, 30 neurons are used in each hidden layer. We use the tanh activation function to evaluate the neuron values in the hidden layers. The learning rate r = 0.5 is used in the L-BFGS optimizer. The neural network is trained for 50 steps.
Figure 11: The training point distribution of a 3D hyperelastic cuboid in DEM. The red points correspond to the first Dirichlet boundary condition, and a second set of points is used to enforce the second Dirichlet boundary condition. The black points are used to train the surface force. The blue points are used for the interior domain integration and the body force.
Result. Since analytical solutions are not available in this example, we use a finite element program to simulate the cuboid with a fine mesh to obtain the reference solution. We
first measure the displacements and stresses along the AB line depicted in Figure 12a. The results in Figure 13 generally reveal good agreement between the DEM solution and the reference solution. It is found that the deep energy method yields good solutions in terms of displacements. However, there is a small difference in terms of stress, because the outputs of our neural network are the displacements, and the gradient fields are obtained in a post-processing step based on the displacement field.
Figure 12: Measurement locations on the cuboid: (a) the AB line, with A(0.625, 1, 0.5) and B(0.625, 0, 0.5); (b) the CDEF plane.
Figure 13: The displacement and stress results of DEM measured at the AB line compared to the reference
solution.
We observe the surface displacement and surface stress at the CDEF plane as sketched
in Figure 12b. Figure 14 shows the predicted displacement in terms of magnitude and the
predicted stress in terms of the von Mises stress. Then we compare the DEM solution and the reference solution in terms of the L2 norm and the H1 seminorm in two setups. Setup 1: we use 8000 equidistant data points (N_1 = 20, N_2 = 20, N_3 = 20) to train our network, and the network is trained for 50 steps. Setup 2: we use 64000 equidistant data points (N_1 = 40, N_2 = 40, N_3 = 40), and the network is trained for 25 steps. The L2 norm and H1 seminorm of the reference solution are given by a FEniCS program with a fine mesh as ||u||_{L2}^{FE} = 0.13275 and ||u||_{H1}^{FE} = 0.51407. As shown in Table 2, DEM obtains good results in terms of both the L2 norm and the H1 seminorm.
Figure 14: Displacement and von Mises stress at the CDEF plane in DEM of a twisted Neo-Hookean 3D cuboid.
Data                  ||u||_L2    ||u||_H1
20×20×20 (50 steps)   0.12921     0.49929
40×40×40 (25 steps)   0.13210     0.51001

Table 2: The L2 norm and H1 seminorm of the DEM solution in the two setups. The corresponding norms of the reference solution are given by a FEniCS program with a fine mesh as ||u||_{L2}^{FE} = 0.13275 and ||u||_{H1}^{FE} = 0.51407.
For the next three sections, we apply DEM to problems involving more than one field. We start with the modeling of fracture using the phase-field approach.
associated with crack formation such as stress release are incorporated into the constitutive
model. A continuous scalar parameter (φ) is used to track the fracture pattern. The cracked
region is represented by φ = 1, while the undamaged portion is given by φ = 0. The phase-field approach aims to simultaneously solve for the displacement field and the fracture region by minimizing the total potential energy of the system, as postulated by the Griffith theory for brittle fracture [31].
In this section, physics-informed neural networks are developed to solve the governing partial
differential equations for fracture analysis using DEM. The neural networks are trained
to approximate the displacement and damage fields that satisfy the governing equations
describing the physical phenomena. Modeling fracture with the phase field method involves
solving for the vector-valued elastic field, u, and the scalar-valued phase field, φ. In DEM,
the solution is obtained by minimizing the total energy of the system, E [32]. The problem
statement can be written as:
Minimize: E = Ee + Ec,
subject to: u = ū on ∂ΩD,     (64)
where Ee is the stored elastic strain energy, Ec is the fracture energy and ū is the prescribed
displacement on the Dirichlet boundary, ∂ΩD. In Eq. (64), Ee and Ec are defined as:
Ee = ∫_Ω g(φ) Ψ0(ε) dΩ,
Ec = (Gc/(2l0)) ∫_Ω ( φ² + l0² |∇φ|² ) dΩ + ∫_Ω g(φ) H(x, t) dΩ.     (65)
[35]. In particular, we set

H(x, 0) = (B Gc/(2l0)) (1 − 2 d(x, l)/l0)   if d(x, l) ≤ l0/2,
H(x, 0) = 0                                 if d(x, l) > l0/2,     (69)

where B is a scalar parameter that controls the magnitude of the scalar history field and is
calculated as:

B = 1/(1 − φ)   for φ < 1.     (70)
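The initialization in Eq. (69) is straightforward to evaluate pointwise. The sketch below is a minimal numpy version for a horizontal crack segment, where d(x, l) is taken as the distance from a point to the segment; the crack geometry and the parameter values are illustrative assumptions, not those of a specific example in the text:

```python
import numpy as np

def history_init(points, y_crack, x_max, l0, B, Gc):
    """Strain-history field H(x, 0) of Eq. (69) for a horizontal crack
    segment running from (0, y_crack) to (x_max, y_crack)."""
    x, y = points[:, 0], points[:, 1]
    # distance d(x, l) from each point to the crack segment
    dx = np.maximum(x - x_max, 0.0)  # 0 if the point projects onto the segment
    d = np.sqrt(dx**2 + (y - y_crack)**2)
    H = B * Gc / (2.0 * l0) * (1.0 - 2.0 * d / l0)
    return np.where(d <= l0 / 2.0, H, 0.0)

# a point on the crack line gets the peak value; a distant point gets 0
pts = np.array([[0.2, 0.5], [0.2, 0.9]])
H = history_init(pts, y_crack=0.5, x_max=0.5, l0=0.0125, B=1e3, Gc=2.7e-3)
print(H)
```

The field decays linearly from its peak B Gc/(2l0) on the crack line to zero at distance l0/2, which localizes the initial damage in a band of width l0.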
We shall now present the results obtained using the physics-informed neural networks to
study phase field modeling of fracture. We have used a network with 3 hidden layers of
50 neurons each and NCol = 8000 uniformly spaced points within the domain. To find the
solution of φ in the domain [0, 50], the trial solution is defined as

φ(x) = − (x(x − 50)/625) [ (x − 25) φ̂ + 1 ],     (74)

where φ̂ is given by the neural network. The trial solution for φ is chosen in such a way that it
satisfies all the boundary conditions, as in [24, 25]. This ensures that the boundary conditions
are satisfied during the training of the network. In the collocation method, the mean squared
error of the function f at the collocation points is minimized. The loss function, LCol, for
the collocation method is defined as:

LCol = (1/NCol) ∑_{i=1}^{NCol} |f(x_i)|².     (75)
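The trial solution of Eq. (74) can be checked directly: for any network output φ̂ it returns 0 at x = 0 and x = 50 and 1 at x = 25. A minimal sketch, where the stand-in for φ̂ is an arbitrary function rather than a trained network:

```python
import numpy as np

def phi_trial(x, phi_hat):
    """Trial solution of Eq. (74): phi(0) = phi(50) = 0 and phi(25) = 1
    hold by construction, for any network output phi_hat."""
    return -x * (x - 50.0) / 625.0 * ((x - 25.0) * phi_hat(x) + 1.0)

phi_hat = lambda x: np.tanh(0.1 * x)   # arbitrary stand-in for the DNN output

x = np.array([0.0, 25.0, 50.0])
print(phi_trial(x, phi_hat))           # -> [0. 1. 0.]
```

Because the boundary values are enforced by the ansatz itself, no boundary term is needed in the loss of Eq. (75).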
For optimization, we use the Adam (adaptive moment estimation) optimizer followed by a quasi-
Newton method (L-BFGS). The boundary conditions are satisfied exactly, as can be seen
Figure 15: One-dimensional phase field model using the collocation method: (a) comparison of φexact and
φcomp; (b) convergence of the loss function with the Adam and L-BFGS optimizers.
in Fig. 15(a), where φcomp represents the value of φ obtained using the neural network and
φexact is obtained using Eq. (71). Fig. 15(b) shows the convergence of the loss function, first
with the Adam optimizer and later with the L-BFGS optimizer. To measure the accuracy
of the method, the relative L2 error, L2rel, is computed using the formula:

L2rel = sqrt( ∑_{i=1}^{Npred} (φcomp − φex)² dx ) / sqrt( ∑_{i=1}^{Npred} φex² dx ).     (76)
subject to the same boundary conditions as in Eq. (73). The form of the solution used for
DEM is as stated in Eq. (74). The loss function, LEner, for DEM is defined as

LEner = (L/NInt) ∑_{i=1}^{NInt} ( φ(x_i)² + l0² |∇φ(x_i)|² ).     (78)
Fig. 16(a) presents the solution of φ obtained using DEM, while Fig. 16(b) shows the con-
vergence of the loss function. For DEM, L2rel = 2.88%.
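The loss of Eq. (78) is a pointwise estimate of the one-dimensional energy integral ∫₀ᴸ (φ² + l0²|φ′|²) dx; the Gc/(2l0) prefactor of Eq. (65) does not affect the minimizer. The sketch below evaluates it for the standard exponential crack profile φ(x) = e^(−|x−25|/l0), for which the integral is approximately 2 l0; the value l0 = 1 is illustrative and the gradient is taken analytically rather than by automatic differentiation:

```python
import numpy as np

l0, L, N = 1.0, 50.0, 8000
x = np.linspace(0.0, L, N)

# standard 1D crack profile centred at x = 25 (illustrative l0 = 1)
phi  = np.exp(-np.abs(x - 25.0) / l0)
dphi = -np.sign(x - 25.0) / l0 * phi          # analytic derivative

# pointwise estimate of the energy integral, as in Eq. (78)
L_ener = L / N * np.sum(phi**2 + l0**2 * dphi**2)
print(L_ener)   # close to the analytic value 2 * l0 = 2.0
```

In the actual method the profile is produced by the network and dphi by automatic differentiation; this check only confirms that the discrete sum reproduces the known energy of the exact solution.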
Figure 16: One-dimensional phase field model using the energy method: (a) comparison of φexact and
φcomp; (b) convergence of the loss function with the Adam and L-BFGS optimizers.
of the problem are shown in Fig. 17(a). In this example, we consider l0 = 0.0125 and
Gc = 2.7 × 10−3 kN/mm. The material properties of the plate are expressed in terms of
the Lamé constants, λ = 121.15 kN/mm² and µ = 80.77 kN/mm². The domain is subdivided
into 3 sections along the y-axis: the crack zone, and the sections below and above the crack.
The crack zone lies between [(0.5 − 2l0), (0.5 + 2l0)] on the y-axis. The training grid is
shown in Fig. 17(b). The strain-history function is used to initialize the crack in the domain
as in Eq. (67). We have used a network with 3 hidden layers of 20 neurons each and Nx × Ny
uniformly spaced points for each section in the interior of the plate, with Nx = 300 and
Ny = 81. The problem statement is defined as:
Minimize: I(φ) = ∫_Ω f(φ(x)) dx,
where f(x) = (Gc/(2l0)) ( φ(x)² + l0² |∇φ(x)|² ) + g(φ(x)) H(x, 0).     (79)
The loss function, L0, is defined as:

Atc = Abc = (0.5 − 2l0),   Ac = 4l0,

L0 = (1/(Nx × Ny)) ( Atc ∑_{i=1}^{Nx×Ny} f(x_i^tc) + Ac ∑_{i=1}^{Nx×Ny} f(x_i^c) + Abc ∑_{i=1}^{Nx×Ny} f(x_i^bc) ),     (80)

where Atc, Abc and Ac denote the areas of the sections on top and bottom of the crack and the crack
zone, respectively. x_i^tc, x_i^c and x_i^bc in Eq. (80) represent points in the three subdomains in
the interior of the plate. To compare the performance of DEM, we also initialize the crack
using the collocation method. The collocation formulation minimizes the function f, defined by

f(x) = Gc l0 Δφ(x) − (Gc/l0) φ(x) − g′(φ(x)) H(x, 0)   in Ω,
with g′(φ(x)) = −2(1 − φ(x)),     (81)
subject to homogeneous Neumann-type boundary conditions on the entire boundary, defined
as:
∇φ · n = 0 on ∂Ω. (82)
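Both the energy loss of Eq. (80) and the collocation loss below weight each strip's sample mean by its area. A quick way to sanity check this weighting is to evaluate Eq. (80) with f ≡ 1, in which case L0 must reduce to the total area Atc + Ac + Abc = 1 of the unit plate; the sketch below does exactly that, using the grid sizes quoted in the text:

```python
import numpy as np

l0 = 0.0125
Atc = Abc = 0.5 - 2.0 * l0      # areas of the strips above and below the crack zone
Ac = 4.0 * l0                   # area of the crack zone
Nx, Ny = 300, 81

def weighted_loss(f, pts_tc, pts_c, pts_bc):
    """Area-weighted loss of Eq. (80)."""
    n = Nx * Ny
    return (Atc * np.sum(f(pts_tc)) +
            Ac  * np.sum(f(pts_c)) +
            Abc * np.sum(f(pts_bc))) / n

def grid(y0, y1):
    # uniform Nx-by-Ny grid in one horizontal strip of the unit plate
    x, y = np.meshgrid(np.linspace(0, 1, Nx), np.linspace(y0, y1, Ny))
    return np.column_stack([x.ravel(), y.ravel()])

pts_tc = grid(0.5 + 2 * l0, 1.0)
pts_c  = grid(0.5 - 2 * l0, 0.5 + 2 * l0)
pts_bc = grid(0.0, 0.5 - 2 * l0)

# with f = 1 the loss reduces to the total area of the plate
print(weighted_loss(lambda p: np.ones(len(p)), pts_tc, pts_c, pts_bc))
```

The same weights reappear unchanged when f is replaced by the energy density of Eq. (79) or the squared residual of Eq. (81).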
We have used a network with 5 hidden layers of 20 neurons each, Nx × Ny uniformly
spaced points for each section in the interior of the plate, and NBound uniformly spaced points
on each edge of the plate, with Nx = 300, Ny = 81 and NBound = 200. The loss function,
LCol, for the collocation method is defined as:
LCol = (1/(Nx × Ny)) ( Atc ∑_{i=1}^{Nx×Ny} |f(x_i^tc)|² + Ac ∑_{i=1}^{Nx×Ny} |f(x_i^c)|² + Abc ∑_{i=1}^{Nx×Ny} |f(x_i^bc)|² )
     + (1/NBound) ∑_{i=1}^{NBound} ( (∂φ(x_i^L)/∂x)² + (∂φ(x_i^R)/∂x)² + (∂φ(x_i^T)/∂y)² + (∂φ(x_i^B)/∂y)² ),     (83)
where x_i^L, x_i^R, x_i^T and x_i^B are the points on the left, right, top and bottom edges of the plate,
respectively. Fig. 18(a) and (b) show the initial crack in the plate using DEM and DCM, re-
spectively. Fig. 18(c) and (d) present the one-dimensional approximation plots, and Fig. 18(e)
and (f) show the convergence of the loss function for both methods.
In subsubsection 6.5.1, we observed that DEM performs better than DCM for the same
network parameters. From the results of the initialization of the crack, we can conclude that
DEM requires a smaller network to predict results accurately. Moreover, DCM needs an
explicit treatment of the Neumann boundary conditions even for traction-free boundaries,
which requires the use of more collocation points. The convergence of its loss function is also
slow, requiring 10 times more iterations. Hence, we use DEM to study the growth of the
crack in a plate under tensile loading.
To observe the propagation of the crack in the plate, we simultaneously solve the elastic field
and the phase field using a monolithic solver. The minimization problem reads as stated in
Eq. (65). The Dirichlet boundary conditions are:
u(0, y) = v(x, 0) = 0, v(x, 1) = ∆v, (84)
where u and v are the elastic field solutions in the x and y directions. The computation is
performed by applying constant displacement increments of ∆v = 1 × 10−3 mm. The model
for the Dirichlet problem is:

u = [x(1 − x)] û,
v = [y(y − 1)] v̂ + y ∆v,     (85)

where û and v̂ are obtained from the neural network. We have used a network with 5 hidden
layers of 50 neurons each. The loss function, LE, is defined as:
LE = LElas + LPF,

LElas = (1/(Nx × Ny)) ( Atc ∑_{i=1}^{Nx×Ny} fe(x_i^tc) + Ac ∑_{i=1}^{Nx×Ny} fe(x_i^c) + Abc ∑_{i=1}^{Nx×Ny} fe(x_i^bc) ),

LPF = (1/(Nx × Ny)) ( Atc ∑_{i=1}^{Nx×Ny} fc(x_i^tc) + Ac ∑_{i=1}^{Nx×Ny} fc(x_i^c) + Abc ∑_{i=1}^{Nx×Ny} fc(x_i^bc) ),     (86)

fe(x) = g(φ(x)) Ψ0(ε(x)),
fc(x) = (Gc/(2l0)) ( φ(x)² + l0² |∇φ(x)|² ) + g(φ(x)) H(x, t),
where LElas and LPF define the losses in the elastic strain energy and the fracture energy,
respectively. The scatter plots of the deformed configuration and the corresponding phase
field plots for the simulation are shown in Fig. 19 and Fig. 20, respectively.
In the next section, we apply DEM to an electro-mechanically coupled system.
6.6. Piezoelectricity
Electro-mechanical coupling arises in some crystal materials, where an electrical
polarization, P, is induced by a mechanical strain, or where an electric field can cause
deformation of the geometry. The linear dependence between the electric polarization and
the associated mechanical strain is referred to as the piezoelectric effect,
(a) Initial crack using DEM. (b) Initial crack using DCM.
(c) Comparison of φexact and φcomp using DEM. (d) Comparison of φexact and φcomp using DCM.
Figure 18: Initialization of crack in the plate using the strain-history function.
Figure 19: Scatter plots of the deformed configuration for prescribed displacement of (a) 1 × 10−3 , (b)
2 × 10−3 , (c) 3 × 10−3 , (d) 4 × 10−3 and (e) 5 × 10−3 .
Figure 20: Crack pattern for prescribed displacement of (a) 1 × 10−3 , (b) 2 × 10−3 , (c) 3 × 10−3 , (d) 4 × 10−3
and (e) 5 × 10−3 .
which is common in non-centrosymmetric crystals. The direct and the indirect piezoelectric
effects can be expressed in mathematical form as

P = p : ε,
σ = p · E,     (87)

where p refers to the third-order piezoelectric tensor, ε and σ are the second-order strain and
stress tensors, and E is the electric field vector with Ei = −θ,i, where θ is the electric potential.
It can be concluded that piezoelectricity is a cross coupling between the elastic and the
dielectric variables. Hence, the constitutive equations that identify the coupling between the
mechanical stress and strain and the related electric field and displacement are given by

D = e : ε + κ · E,
σ = C : ε − e · E,     (88)

where D refers to the electric displacement that causes the electrical polarization, while
e, κ and C are the piezoelectric, the dielectric and the elastic tensors, respectively.
Energy minimization is adopted in the DEM to obtain the solution of the field variables.
The problem statement reads as
Minimize: E = Ei + Wext,
subject to: u = ū on ∂ΩDu,
and θ = θ̄ on ∂ΩDθ,     (89)
and σ · n = t̄N on ∂ΩN,

where E represents the total energy of the system, Ei is the bulk internal energy, whose
density is Ψ, and Wext stands for the work of external forces. The internal energy density,
Ψ, is expressed as

Ψ = ½ ε : C : ε − E · e : ε − ½ E · κ · E.     (90)
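In Voigt notation, the standard electric-enthalpy form Ψ = ½ ε:C:ε − E·e:ε − ½ E·κ·E reduces to a few matrix products. The sketch below evaluates it for a 2D state with illustrative (not physical) tensor values; here C is a 3×3 elasticity matrix, the piezoelectric matrix e is 2×3 and the dielectric matrix kappa is 2×2:

```python
import numpy as np

def enthalpy_density(eps, E, C, e, kappa):
    """Electric enthalpy density of Eq. (90) in Voigt notation:
    Psi = 1/2 eps:C:eps - E.e:eps - 1/2 E.kappa.E."""
    return (0.5 * eps @ C @ eps
            - E @ e @ eps
            - 0.5 * E @ kappa @ E)

# illustrative 2D values (Voigt strain order [xx, yy, xy])
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
e = np.array([[0.0, 0.0, 0.1],
              [0.2, 0.3, 0.0]])
kappa = np.eye(2) * 0.05

eps = np.array([1e-3, -2e-4, 5e-4])
E = np.array([0.0, 1.0])
print(enthalpy_density(eps, E, C, e, kappa))
```

In the DEM, this scalar density is evaluated at the integration points and summed to form the internal-energy part of the loss, with eps and E obtained from derivatives of the network outputs.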
We shall now present the example of a cantilever beam to show the application of DEM to
solve PDEs involving electromechanical coupling.
Figure 21: Schematic diagram showing the electrical and the mechanical boundary conditions for the can-
tilever nanobeam under study.
Figure 22: The predicted values of three outputs of the system using DEM: (a) applying mechanical loading,
and (b) applying electrical loading.
In the case of pure mechanical loading, a point load is applied as shown in Fig. 21. Due to
the electro-mechanical coupling, an electric potential, θ, is induced by the mechanical load.
Assuming a closed-circuit configuration, θ is fixed to zero on the bottom surface, while the
electrode placed on the top surface undergoes a difference in electric potential as a result of
the deformation. Fig. 22a shows the plots of the deformed shape and the electric potential
obtained using the DEM approach.
In the next step, we consider pure electrical loading. In this case, the electric potential
on the top surface is fixed to V = −100 MV. The deformed configuration is presented in
Fig. 22b. Both cases considered in this example demonstrate the ability of the DEM to
solve electro-mechanically coupled systems without the aid of classical numerical methods.
All the examples discussed previously involved second-order PDEs. To show the robustness
of DEM, we take up the study of a fourth-order PDE in the next section.
continuity poses significant challenges. This, however, can be readily handled by the pro-
posed deep collocation and deep energy methods, with a set of DNNs approximating the
transversal deflection, which prove effective in the bending analysis of Kirchhoff plates
of various geometries, even with cut-outs.
The governing equation of the Kirchhoff bending problem, expressed in terms of the transversal
deflection, is

∇²∇²w = ∇⁴w = p/D,     (91)

where ∇⁴(·) = ∂⁴/∂x⁴ + 2 ∂⁴/∂x²∂y² + ∂⁴/∂y⁴ is commonly referred to as the biharmonic
operator. Similar to the previous examples, the problem statement in strong form is equivalent
to the minimization of the total potential energy of the system, E, where
E = ∫∫_Ω ( ½ kᵀ M − q w ) dΩ − ∫_{S3} V̄n w dS + ∫_{S2+S3} M̄n (∂w/∂n) dS     (92)

under a uniformly distributed load. The problem is stated as:
w = argmin_{v ∈ H²(D)} E(v).     (93)
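Eq. (91) can be checked numerically: for w(x, y) = sin(πx) sin(πy) on the unit square, ∇⁴w = 4π⁴ w. The sketch below verifies this by applying the 5-point finite-difference Laplacian twice; it is a standalone check of the operator, not of the neural network:

```python
import numpy as np

n = 201
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
w = np.sin(np.pi * X) * np.sin(np.pi * Y)

def laplacian(u):
    """5-point finite-difference Laplacian on the interior of the grid."""
    return (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
            - 4.0 * u[1:-1, 1:-1]) / h**2

biharm = laplacian(laplacian(w))            # biharmonic = Laplacian applied twice
expected = 4.0 * np.pi**4 * w[2:-2, 2:-2]   # analytic value of the biharmonic of w
err = np.max(np.abs(biharm - expected)) / np.max(np.abs(expected))
print(err)   # small O(h^2) discretization error
```

In the deep collocation method the fourth derivatives entering G(x; p) are instead obtained by automatic differentiation of the network output, which avoids any stencil error.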
Additionally, the boundary conditions of the Kirchhoff plate taken into consideration can be
classified into three types: simply-supported, clamped and free boundary conditions,
expressed as

∂Ω = Γ1 + Γ2 + Γ3.     (94)
To solve Kirchhoff bending problems with boundary constraints, the physical domain is
first discretized with randomly distributed collocation points denoted by x_Ω = (x1, . . . , xNΩ)ᵀ.
Another set of collocation points, denoted by x_Γ = (x1, . . . , xNΓ)ᵀ, is added to discretize the boundaries.
The transversal deflection, w, can thus be approximated with the aforementioned deep feed-
forward neural network w^h(x; p). The loss function is constructed to find the approximate
solution by minimizing either the mean squared error of the governing equation in residual
form or the total energy, along with the boundary conditions approximated by w^h(x; p).
Substituting w^h(x_Ω; p) into either Equation 91 or Equation 92, a physics-informed deep neural
network is obtained: G(x_Ω; p) for the strong form of the problem and Ep(x_Ω; p) for the energy
approach, respectively.
The boundary conditions can also be expressed by the physics-informed neural network
approximation w^h(x_Γ; p). On clamped boundaries Γ1, we have

w^h(x_Γ1; p) = w̃,   ∂w^h(x_Γ1; p)/∂n = p̃n.     (95)

On simply-supported boundaries Γ2,

w^h(x_Γ2; p) = w̃,   Mn(x_Γ2; p) = M̃n.     (96)

On free boundaries Γ3,

Mn(x_Γ3; p) = M̃n,   ∂Mns(x_Γ3; p)/∂s + Qn(x_Γ3; p) = q̃.     (97)
It should be noted that n and s here refer to the normal and tangent directions along each
boundary. It is clear that all the induced physics-informed neural networks in the Kirchhoff
bending analysis, G(x; p), E(x_Ω; p), Mn(x; p), Mns(x; p) and Qn(x; p), share the same
parameters as w^h(x; p). Considering the generated collocation points in the domain and
on the boundaries, they can all be learned by minimizing the mean squared error loss function:

L(p) = MSE = MSE_G + MSE_Γ1 + MSE_Γ2 + MSE_Γ3,     (98)
with

MSE_G = (1/NΩ) ∑_{i=1}^{NΩ} |G(x_Ω; p)|² = (1/NΩ) ∑_{i=1}^{NΩ} |∇⁴w^h(x_Ω; p) − p/D|²,

MSE_Γ1 = (1/NΓ1) ∑_{i=1}^{NΓ1} |w^h(x_Γ1; p) − w̃|² + (1/NΓ1) ∑_{i=1}^{NΓ1} |∂w^h(x_Γ1; p)/∂n − p̃n|²,

MSE_Γ2 = (1/NΓ2) ∑_{i=1}^{NΓ2} |w^h(x_Γ2; p) − w̃|² + (1/NΓ2) ∑_{i=1}^{NΓ2} |Mn(x_Γ2; p) − M̃n|²,

MSE_Γ3 = (1/NΓ3) ∑_{i=1}^{NΓ3} |Mn(x_Γ3; p) − M̃n|² + (1/NΓ3) ∑_{i=1}^{NΓ3} |∂Mns(x_Γ3; p)/∂s + Qn(x_Γ3; p) − q̃|²,     (99)
where x_Ω ∈ R^N and p ∈ R^K are the neural network parameters. If L(p) = 0, w^h(x; p) is
a solution for the transversal deflection. Our goal is to find a set of parameters p such that
the approximated deflection w^h(x; p) minimizes the loss L(p); if L(p) is very small, the
approximation w^h(x; p) very closely satisfies the governing equations and boundary
conditions, namely

p = argmin_{p̂ ∈ R^K} L(p̂).     (100)
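In practice, the parameter search of Eq. (100) is carried out by a gradient-based optimizer such as Adam or L-BFGS. As a toy stand-in, the sketch below runs plain gradient descent on a quadratic loss L(p) = ||Ap − b||², whose minimizer is known in closed form; A and b are arbitrary illustrative data, not quantities from the plate problem:

```python
import numpy as np

# toy least-squares loss L(p) = ||A p - b||^2 with known minimizer p* = [1, 1]
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])

p = np.zeros(2)
lr = 0.05
for _ in range(1000):
    grad = 2.0 * A.T @ (A @ p - b)   # analytic gradient of L(p)
    p -= lr * grad

print(p)   # converges to the closed-form argmin [1, 1]
```

For the actual networks the gradient with respect to p is supplied by automatic differentiation, but the structure of the iteration is the same.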
However, in order to solve the problem of a thin plate with the deep energy method, it is
necessary to construct the loss function in a different way:
Figure 23: (a) Collocation points in the square domain. (b) The relative error of transversal deflection
(×10⁻²) for one, two and three hidden layers with varying neurons per hidden layer.
For this numerical example, the arrangement of collocation points is depicted in Fig-
ure 23(a). Neural networks with different numbers of neurons and depths are applied in the
calculation to study the feasibility of deep collocation in solving Kirchhoff plate bending
problems. Table 1 lists the maximum deflection at the central point with varying hidden
layers and neurons. The results predicted with more hidden layers are clearly more accurate:
as the number of hidden layers and neurons grows, the maximum deflection approaches the
analytical series solution, even with only two hidden layers. The relative error shown in
Figure 23(b) better depicts the advantage of deep neural networks over shallow wide ones:
more hidden layers with more neurons flatten the relative error. Various contour plots are
shown in Figure 24 and compared with the analytical solution. It is
Table 1: Maximum deflection predicted by deep collocation method.
obvious that the deep collocation method predicts the deflection of the whole plate in good
agreement with the analytical solution.
Figure 24: (a) Predicted deflection contour, (b) deflection error contour, (c) predicted deflection and (d) analytical
deflection of the simply-supported plate on a Winkler foundation, with 3 hidden layers and 50 neurons.
Figure 25: (a) Collocation points in the circular domain. (b) The relative error of transversal deflection
(×10⁻⁴) for one, two and three hidden layers with varying neurons per hidden layer.
Table 2: Maximum deflection predicted by the deep collocation method for the clamped circular plate
(exact solution: 15.6250).

Clamped Circular Plate         Predicted Maximum Deflection
1 hidden layer,  30 neurons    15.5958
1 hidden layer,  40 neurons    15.5685
1 hidden layer,  50 neurons    15.6201
2 hidden layers, 30 neurons    15.6251
2 hidden layers, 40 neurons    15.6264
2 hidden layers, 50 neurons    15.6224
3 hidden layers, 30 neurons    15.6269
3 hidden layers, 40 neurons    15.6247
3 hidden layers, 50 neurons    15.6229
Figure 26: (a) Predicted deflection contour, (b) deflection error contour, (c) predicted deflection and (d) exact
deflection of the clamped circular plate with 3 hidden layers and 50 neurons.
where a and b are the side lengths of the plate. The analytical solution for this problem is given by

w = ( p0 / (π⁴ D (1/a² + 1/b²)²) ) sin(πx/a) sin(πy/b).     (106)
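The Navier solution above is easy to evaluate numerically for reference. The sketch below computes the peak deflection at the plate centre for the illustrative values a = b = 1, D = 1 and p0 = 1, where the formula reduces to w_max = 1/(4π⁴):

```python
import numpy as np

def w_navier(x, y, a=1.0, b=1.0, D=1.0, p0=1.0):
    """Analytical deflection of Eq. (106) for a simply-supported rectangular
    plate under the sinusoidal load p = p0 sin(pi x / a) sin(pi y / b)."""
    return (p0 / (np.pi**4 * D * (1.0 / a**2 + 1.0 / b**2) ** 2)
            * np.sin(np.pi * x / a) * np.sin(np.pi * y / b))

w_max = w_navier(0.5, 0.5)             # peak deflection at the plate centre
print(w_max, 1.0 / (4.0 * np.pi**4))   # the two values coincide
```

The deflection vanishes on the boundary by construction, which is how the reference errors of the simply-supported plate are evaluated against the network prediction.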
For the detailed configuration of this numerical example, a 31×31 randomly distributed
collocation point set, shown in Figure 23, is first generated in the plate domain. Based on this,
the Monte Carlo integration scheme is adopted in the calculation of the total potential energy.
For this problem with a sinusoidal load, the activation function was tailored accordingly as
f(x) = sin(πx²), which improves the accuracy of the results. A neural network with 50
neurons per hidden layer is deployed to study the relative error of the maximum deflection at
the central point and of the whole deflection field, using the proposed activation function and
the classical Tanh activation function. For the DNN, three hidden layers are deployed. For the
DNN with an autoencoder, two encoding layer configurations, [2, 1] ∗ H and [3] ∗ H, are
considered, with H the number of neurons in each encoding layer. It is clearly shown in
Figure 27 and Figure 28 that the numerical results obtained with the proposed activation
function outperform those of the classical Tanh activation function for both neural network schemes.
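The Monte Carlo scheme mentioned above approximates an integral over the plate by the domain area times the mean of the integrand at random points. As a self-contained check, the sketch below integrates the load term ∫∫ p0 sin(πx/a) sin(πy/b) dΩ over a unit plate, whose exact value is 4 p0 ab/π²; the sample size is an illustrative choice, much larger than the 31×31 set used in the text:

```python
import numpy as np

rng = np.random.default_rng(42)
a = b = 1.0
p0 = 1.0
n = 200_000

# uniform random collocation points in the plate domain
x = rng.uniform(0.0, a, n)
y = rng.uniform(0.0, b, n)

# Monte Carlo estimate: area times the mean of the integrand
integral = a * b * np.mean(p0 * np.sin(np.pi * x / a) * np.sin(np.pi * y / b))
exact = 4.0 * p0 * a * b / np.pi**2
print(integral, exact)   # the estimate approaches 4/pi^2 ≈ 0.405
```

In the deep energy method, the same area-times-mean estimator is applied to the strain energy density evaluated on the network output, so the loss inherits the O(n^(-1/2)) Monte Carlo error.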
Moreover, we carry on studying an accurate and efficient configuration of the proposed
DNN with an autoencoder, in the hope of better capturing the physical patterns of the
problem. For this numerical example, ten encoding layer types are studied, as shown in
Figure 29. We begin by studying the influence of different encoding layers with different
numbers of neurons per layer on the accuracy and efficiency for this problem. We can see
that more encoding layers and more neurons per layer yield more stable and accurate results.
Moreover, the corresponding computational costs of those ten schemes are listed in Figure 30.
It is clear that a two encoding layer autoencoder can meet both accuracy and efficiency ends.
As the number of neurons in each scheme increases, the numerical solution clearly converges
to the analytical solution.
Finally, the deep energy scheme is compared with an open-source IGA-FEM code [37] on this
problem, which is also based on the Kirchhoff plate theory. The accuracy and computational
time of both schemes are compared. For DEM, a single encoding layer with 30 neurons is
applied with the DNN, with increasing numbers of collocation points; the computational
time reported for DEM is the training time. IGA-FEM is clearly faster and more accurate,
but the performance of DEM is still within an acceptable range. Once the DNN with an
autoencoder is trained, it can be used very quickly to predict the deflection and stress in
the whole plate.
For the deep energy method configuration, an autoencoder is added to the deep neural net-
works to better capture the physical patterns of this problem. We also studied the optimal
configuration of this autoencoder with different layers and neuron numbers, which will be
beneficial to further applications of this deep energy method.
To better depict the deflection over the whole physical domain, the contour plot and the
contour error plot of the deflection predicted by two encoding layers with [150, 100] neurons
are shown in Figure 32, which agree well with the deflection obtained from the analytical
solution.
Figure 27: Relative error of (a) maximum deflection (×10⁻⁵) and (b) whole deflection (×10⁻³) predicted by
the Tanh and the proposed sin(πx²) activation functions of a DNN for the simply-supported plate.
Figure 28: Relative error of (a) maximum deflection and (b) whole deflection (×10⁻³) predicted by the Tanh
and the proposed sin(πx²) activation functions of a DNN with an autoencoder for the simply-supported plate.
Figure 29: Relative error of (a) maximum transversal deflection (×10⁻⁵) and (b) whole transversal deflection
(×10⁻³) predicted by different encoding layer configurations of a DNN with an autoencoder for the simply-
supported plate: one encoding layer with [1], [2] or [3] hidden layer neurons; two encoding layers with [2 1],
[3 1] or [3 2]; and three encoding layers with [3 1 1], [3 2 1], [3 2 2] or [3 3 2].