0% found this document useful (0 votes)
38 views18 pages

Exact Dirichlet Boundary Physics-Informed Neural Network EPINN

Uploaded by

weixinj36
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
38 views18 pages

Exact Dirichlet Boundary Physics-Informed Neural Network EPINN

Uploaded by

weixinj36
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 18

Available online at www.sciencedirect.

com
ScienceDirect

Comput. Methods Appl. Mech. Engrg. 414 (2023) 116184


www.elsevier.com/locate/cma

Exact Dirichlet boundary Physics-informed Neural Network EPINN


for solid mechanics
Jiaji Wanga , Y.L. Mob , Bassam Izzuddinc , Chul-Woo Kimd ,∗
a
Department of Civil Engineering, The University of Hong Kong, Hong Kong, China
b Department of Civil and Environmental Engineering, University of Houston, US
c Department of Civil and Environmental Engineering, Imperial College London, London, UK
d Department of Civil and Earth Resources Engineering, Kyoto University, Japan

Received 11 November 2022; received in revised form 8 June 2023; accepted 8 June 2023
Available online 30 June 2023

Abstract
Physics-informed neural networks (PINNs) have been rapidly developed for solving partial differential equations. The Exact
Dirichlet boundary condition Physics-informed Neural Network (EPINN) is proposed to achieve efficient simulation of solid
mechanics problems based on the principle of least work with notably reduced training time. There are five major building
features in the EPINN framework. First, for the 1D solid mechanics problem, the neural networks are formulated to exactly
replicate the shape function of linear or quadratic truss elements. Second, for 2D and 3D problems, the tensor decomposition
was adopted to build the solution field without the need of generating the finite element mesh of complicated structures to
reduce the number of trainable weights in the PINN framework. Third, the principle of least work was adopted to formulate the
loss function. Fourth, the exact Dirichlet boundary condition (i.e., displacement boundary condition) was implemented. Finally,
the meshless finite difference (MFD) was adopted to calculate gradient information efficiently. By minimizing the total energy
of the system, the loss function is selected to be the same as the total work of the system, which is the total strain energy
minus the external work done on the Neumann boundary conditions (i.e., force boundary conditions). The exact Dirichlet
boundary condition was implemented as a hard constraint compared to the soft constraint (i.e., added as additional terms in the
loss function), which exactly meets the requirement of the principle of least work. The EPINN framework is implemented in
the Nvidia Modulus platform and GPU-based supercomputer and has achieved notably reduced training time compared to the
conventional PINN framework for solid mechanics problems. Typical numerical examples are presented. The convergence of
EPINN is reported and the training time of EPINN is compared to conventional PINN architecture and finite element solvers.
Compared to conventional PINN architecture, EPINN achieved a speedup of more than 13 times for 1D problems and more than
126 times for 3D problems. The simulation results show that EPINN can even reach the convergence speed of finite element
software. In addition, the prospective implementations of the proposed EPINN framework in solid mechanics are proposed,
including nonlinear time-dependent simulation and super-resolution network.
© 2023 Elsevier B.V. All rights reserved.

Keywords: Physics-informed Neural Network (PINN); Exact Dirichlet boundary PINN (EPINN); Principle of least work; Finite element; Tensor
decomposition; Meshless finite difference (MFD)

∗ Corresponding author.
E-mail addresses: [email protected] (J. Wang), [email protected] (Y.L. Mo), [email protected] (B. Izzuddin),
[email protected] (C.-W. Kim).

https://doi.org/10.1016/j.cma.2023.116184
0045-7825/© 2023 Elsevier B.V. All rights reserved.
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

1. Introduction
Over the past 50 years, there has been substantial development in simulating solid mechanics problems by
solving the governing equations of partial differential equations (PDEs) using the finite element (FE) method [1].
Although significant progress has been achieved in the FE method for the forward problems of simulation in
solid mechanics problems, the existing FE method may still face several challenges in solving inverse problems
(i.e., model updating for material parameters [2]) or design optimization problems (i.e., optimum design of
engineering structures) because of notable computational costs. Engineers and researchers may adopt FE models of
varying levels of sophistication for mechanical performance assessment and design of structures, which may pose
prohibitive procedural and time demands in computational structural optimization. In FE simulation, the gradient
information of output fields with respect to design parameters is hard, if not impossible, to obtain. Therefore, any
optimization of complicated structures according to the traditional approach would typically adopt gradient-free
algorithms (e.g., genetic algorithm), which require thousands of FE simulations posing much greater computational
demands compared to gradient-based algorithms (e.g., gradient descent algorithm). The optimization of structures
may include hundreds of design parameters and gradient-free optimization using FE software can be infeasible in
practice. In addition, traditional FE software mostly relies on Central Processing Units (CPUs), while the utilization
of Graphical Processing Units (GPUs) is seldom considered in conventional FE software. A CPU contains a few
cores with substantial cache memory to complete fewer computational threads in parallel. In contrast, a GPU is
composed of thousands of cores to complete thousands of threads in parallel. Compared to CPUs, GPUs are designed
to subdivide complex problems into thousands of separate tasks and compute them in parallel, making them ideal
for machine learning (ML) tasks. In general, the FE models may face challenges for inverse problems, optimization
problems, and full utilization of GPUs and GPU-based supercomputers.
With the rapid development of artificial intelligence (AI) and GPU-based supercomputers, deep learning using
deep neural networks (DNNs) has achieved success in many research fields, including but not limited to computer
vision [3], natural language processing, self-driving cars, biological science [4], generative modeling [5], and
recommendation systems. DNNs with trillions of trainable weights can be trained on state-of-the-art GPU-based
supercomputers [5] and the infusing between AI and computational solid mechanics has been a heated research topic.
Fig. 1 shows the schematic plot of AI models from model-driven to data-driven algorithms for solid mechanics. As
shown in Fig. 1, the FE approach is model-driven and meets the governing equations. In comparison, conventional
deep-learning methods including convolutional neural networks (CNN) [6], recurrent neural networks (RNN) [7],
and Transformers [8] are data-driven. Physics-informed Neural Networks (PINNs) [9,10] were proposed to solve
forward problems and inverse problems of physics systems governed by partial differential equations (PDEs)
by training neural networks on GPUs, which is similar to the FE model approach. In addition, Fourier Neural
Operators (FNO) were also proposed to learn the mapping from the input field to the solution field based on
big data generated by FE simulation with zero-shot super-resolution performance [11]. The development of PINN
architecture is briefly reviewed below while a comprehensive literature review can be referred to Karniadakis
et al. [1]. Based on the universal approximation theorem [12], E and Yu [13] proposed the Deep Ritz method
for solving PDEs, including a fully-connected deep neural network with residual connections to serve as the trial
function, and neural networks to obtain the total energy of the system based on trial functions. The Deep Ritz
method showed converged results for the Poisson equation in two dimensions and high dimension and transfer
learning was also proposed and implemented. Raissi et al. [9] proposed PINNs to replicate the shape function and
minimize the error of force equilibrium equations to simulate solid mechanics problems, including both forward and
inverse problems. The PINNs proposed by Raissi et al. [9] include both continuous-time models and discrete-time
models, and the objective function is to minimize the total error of force equilibrium equations. Haghighat et al. [14]
proposed the PINN framework for surrogate modeling of solid mechanics including both a linear elasticity problem
and a von Mises elastoplastic problem. The fully connected neural networks were used to input coordinates and
predict the displacement field and stress field, and the loss function is formulated as the total error of governing
equations of solid mechanics (i.e. strain compatibility equation, equilibrium equation, constitutive model, Dirichlet
boundary conditions, and Neumann boundary conditions). The challenges of conventional PINN in simulation of
solid mechanics problems may be categorized into (1) difficulty in exact imposition of boundary conditions; (2)
lack of efficient architecture of neural networks to fit solid mechanics problems; (3) difficulty in optimum definition
of loss function; (4) automated differentiation for gradient back-propagation may be inaccurate. Because of these
2
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 1. From model-driven to data-driven algorithms for solid mechanics.


Note: EPINN is proposed and implemented in this study, while SPINN is proposed in Section 4.2.

challenges, current PINN architectures are mostly used when there are labeled data obtained from experimental
study or FE analysis, while the successful training of PINN without labeled data are still very challenging.
For the boundary condition of PINN, Rao et al. [15] summarized that the conventional PINN framework
considered residual loss components as soft constraints with Lagrange multipliers, which may not exactly meet the
boundary conditions and may notably reduce convergence speed. In response to this issue, Rao et al. [15] proposed
a PINN architecture to exactly imposing boundary conditions. The “hard” boundary condition enforcement was
achieved by training and infusing three single neural networks: the boundary condition network, the distance function
network, and the general solution network. Sukumar and Srivastava [16] also proposed a novel approach to exactly
imposing boundary conditions in PINN architecture based on constructive solid geometry through approximate
distance function (ADF). Based on ADF, a reasonable trial function can be obtained for the Dirichlet boundary
condition and the Neumann boundary condition. However, the imposition of exact Neumann boundary conditions
in solid mechanics PINN requires the neural network to predict displacement and stress simultaneously, thereby in-
creasing the number of trainable weights in PINN architecture, and is inconsistent with the FE method. Furthermore,
this is difficult to extend to nonlinear solid mechanics problems. In the EPINN architecture proposed in this study,
the Dirichlet boundary condition is satisfied exactly by multiplying the displacement trial function with ADF, while
the Neumann boundary condition is achieved by optimizing the loss function through the principle of least work.
For the architectures of PINN, the research community proposed many network architectures [17] to improve
the performance of PINN, including but not limited to the Fourier Network [18], Modified Fourier Network [19],
Highway Fourier Network [20], Multi-scale Fourier Feature Network [21], Spatial–temporal Fourier Feature
Network [21], Sinusoidal Representation Networks [22], Deep Galerkin Method (DGM) architecture [23] and
Multiplicative Filter Network [24], which have been implemented in Nvidia Modulus platform [25]. In these
architectures, it may be hard to define the concept equivalent to the mesh size in the FE method, and the
number of trainable weights may be notably higher than the Degree of freedom (DOF) in FE methods upon
convergence. In addition, because the loss function of PINN is typically a summation of the error of various
governing equations with various units, it may be challenging to balance each term in the loss function. Various
normalization methods to balance the gradient of each loss term was proposed, including SoftAdapt [26], Relative
Loss Balancing with Random Lookback (ReLoBRaLo) [27], and GradNorm [28]. However, these methods cannot
guarantee the convergence of solid mechanics problems and the physical meaning of these methods in PINN are
not clear. Recently, the infusing between PINN architecture and the FE method was proposed to achieve efficient
architecture and loss function which are consistent with the FE method. Saha et al. [29] and Zhang et al. [30]
proposed Hierarchical Deep Learning Neural Network (HiDeNN), where the weights and bias of deep neural
networks are implemented based on spatial discretization and element mesh of the FE approach. As proved by Saha
et al. [29] and Zhang et al. [30], HiDeNN can exactly replicate the spatial discretization of the FE method. HiDeNN
3
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

was developed to achieve the construction of DNNs in the same manner as the FE software, which takes in the nodal
coordinates as input and produces a shape function in the form of DNN, whose weights are exactly derived from
nodal positions. The number of trainable parameters can be reduced to the same as the number of DOFs in traditional
FE software at the same mesh size. Recently, a reduced-order Hierarchical Deep learning Neural Network based on
Tensor Decomposition (HiDeNN-TD) was also proposed [31], which infuses the HiDeNN with TD methods and
achieved convergence for solid mechanics problems with high accuracy and notably lower trainable weights. The
TD method and principle of least work are adopted in the EPINN framework. In PINN architecture, the gradient
of the solution field with respect to coordinates and time is mostly obtained using back-propagation and automated
differentiation of neural networks. However, the automated differentiation may obtain notably high gradient results at
stress localization regions, which may reduce the speed of convergence for solid mechanics problems. In this study,
the meshless finite difference (MFD) is adopted in EPINN framework for efficient simulation of solid mechanics
problems. In general, EPINN is proposed in this study to solve solid mechanics problems even without the need of
additional labeled data of the solution field from FE simulation or experimental study.
In this study, the following technical development in this study is summarized:
(1) First proved the shape function of truss element can be exactly reformulated by Convolutional neural
networks.
(2) Adopted tensor decomposition to further reduce the number of trainable weights in EPINN to be even lower
than the DOF of finite element method at the same mesh size, thereby notably reducing the training cost and
improving the convergence.
(3) Developed efficient loss function following the total action of the mechanical system, which is explainable
and avoids the problem in conventional PINN methods, there is no need to balance the loss terms of PDE loss,
Dirichlet boundary loss, and Neumann boundary loss.
(4) In the EPINN framework, there is no need to meet the Neumann boundary condition exactly, thereby notably
reducing the difficulty in training PINNs. The Dirichlet boundary condition can be exactly met, which may avoid
the influence of error from the Dirichlet boundary to influence the training process.

2. Model architecture of EPINN


In this section, the model architecture of EPINN is illustrated in detail with a comparison to existing PINN
models. Fig. 2 shows the schematic plot of EPINN. As shown in Fig. 2, EPINN formulates neural networks in
each direction (x, y, z) to exactly replicate the shape function of linear or quadratic truss element (Section 2.1) and
use tensor decomposition (TD) to construct the 2D or 3D displacement field. For the forward problem, all trainable
parameters in the EPINN framework are nodal displacements in this spatial discretization layer (denoted in blue
color in Fig. 2). TD can build the solution field without the need of generating mesh conforming to the complicated
structure (Section 2.2). The exact displacement boundary condition was implemented (Section 2.4) based on ADF
(approximate distance function). The meshless finite difference (MFD) was adopted to calculate the gradient of
the displacement field with respect to coordinates (i.e. updating strain field) efficiently (Section 2.5). For the linear
elastic problem or hyperelastic problem, the stress field can be obtained from strain field directly. For nonlinear
problems, pretrained deep learning constitutive models can be adopted to achieve stress updating, including but not
limited to Temporal Convolutional Network (TCN) [32], Long Short-Term Memory (LSTM), Gated Recurrent Unit
(GRU) and Sequence to Sequence models (Seq2Seq) [33]. MFD was used to formulate the strain solution layer and
calculate the total strain energy of the system. By minimizing the total energy of the system, the solution can be
achieved. All the PINN models reported and compared in this study are implemented using the Nvidia Modulus
platform (version 22.07) [25] using Intel Xeon Platinum 8368 CPU with Nvidia A100 40 GB SXM GPU accelerators
provided by Osaka University SQUID (Supercomputer for Quest to Unsolved Interdisciplinary Datascience). The
Nvidia Modulus platform is an open-source platform developed based on PyTorch and Sympy and can be used for
Nvidia GPU accelerators. Single precision was used for the training process of all PINN models in this study.

2.1. Reformulating shape function of 1D truss element with neural networks

For the 1D problem of solid mechanics, the shape function of truss element (linear or quadratic) can be exactly
reformulated by a few layers of neural networks, where the weights and biases can be calculated based on node
coordinates of 1D truss elements as proved by Zhang et al. [30]. EPINN framework adopted this method for
4
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 2. Flowchart of EPINN framework for solving static solid mechanics problems based on the principle of least work.
(Note: TD denotes tensor decomposition, ADF denotes approximate distance function, MFD denotes meshless finite difference). (For
interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

constructing the shape function in the 1D case with convolutional architecture (which is slightly different from
the original architecture proposed by Zhang et al. [30]), while tensor decomposition can be used to construct the
solution field of displacement in 2D and 3D solid mechanics problems. Fig. 3 shows the schematic plot of the shape
function of the truss element reformulated by neural networks. As shown in Fig. 3(a), the shape function of linear
1D truss element is formulated as follows:
⎧ x−x I −1
, x I −1 ≤ x ≤ x I
⎨ x I −x I −1


x I +1 −x
N I (x) = x −x , x I ≤ x ≤ x I +1 (1)
⎪ I +1 I

0, elsewhere

N I (x; x I −1 , x I , x I +1 )
( ) ( ) (2)
= Relu x I −x −1
Relu (−x + x I ) + 1 + Relu x −1
Relu (x − x I ) + 1 − 1,
I −1 I +1 −x I

u I (x) = N I (x; x I −1 , x I , x I +1 ) u I
( ( ) ) ( ( ) ) (3)
−1
= Relu x I −x Relu (−x + x I ) + 1 − 0.5 u I + Relu x −1
Relu (x − x I ) + 1 − 0.5 u I
I −1 I +1 −x I

where uI denotes the nodal solution at node number I, NI (x) denotes the shape function at node number I, and xI−1 ,
xI, and xI+1 are node coordinates. Relu denotes Rectified Linear Unit (ReLU) activation function. When the node
coordinates are fixed, the only trainable weights are the nodal displacement uI .
Fig. 3 shows the assembly of the shape function in the 1D truss element. Zhang et al. [30] adopted MLP (multi-
layer perception) in the HiDeNN model. In this study, it is found that the shape function of the 1D truss element can
be exactly reformulated by convolutional neural networks (CNNs). Fig. 3(a) shows the neural network architecture
to exactly reformulate the shape function of linear truss elements adopting convolutional architecture. As shown in
Fig. 3(a), the input layer is the x coordinate at an arbitrary point in the solution domain and it is fed into a fully
connected layer and two subsequent Conv1D (1D convolutional) layers. Subsequently, Layer No. 4 is a 1D sum
pooling layer with a stride of 2, kernel size of 2, and padding of 1, which will obtain shape function at various
nodes based on Eq. (2). After that, a Conv1D layer is used to obtain the displacement field inside each element
as per Eq. (3). Finally, the element displacements are assembled to obtain the total displacement field of the truss
structure. In this architecture, the only trainable parameter is the solution of displacement fields at nodes in Layer
No. 5 in Fig. 3(a), while all other layers have fixed weights and biases when the coordinates of mesh nodes are
fixed. Therefore, the number of trainable parameters in EPINN can be notably reduced to the mesh number in
the 1D case compared to conventional PINN models [17–24]. Fig. 3(b) shows the neural network architecture to
5
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 3. Shape function of truss element reformulated by neural networks.


(Note: orange font denotes fixed weights, blue font denotes fixed bias, red font denotes trainable weights, Relu denotes Rectified Linear
Unit, blue box denotes neural networks to reformulate shape function inside the single element, linear denotes no activation function is used
in this layer, Conv1D denotes one-dimensional convolutional layer). (For interpretation of the references to color in this figure legend, the
reader is referred to the web version of this article.)

6
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 4. Schematic plot of tensor decomposition formulated by neural networks.

exactly reformulate the shape function of the quadratic truss element. Compared to the linear truss element, the
2
quadratic truss element has an additional shape function at the internal node NI+1/2 (x), which can be obtained by
multiplication of the linear truss element shape function as follows:
(x−xI )(x−xI+1 )

x I ≤ x ≤ x I +1
(xI+1/2 −xI )(xI+1/2 −xI+1 )

2
NI+1/2 (x) =
0 elsewhere (4)

= NI (x) · NI+1 (x) · (xI+1 −xI )2


(xI+1/2 −xI )(xI+1/2 −xI+1 )
As shown in Fig. 3(b), the input layer is the x coordinate of the node and it is fed into a fully connected layer and
two subsequent Conv1D layers. Subsequently, Layer No. 4 is a multiplication layer, where the stepwise functions
2
are multiplied to obtain the quadratic shape function at the interior point NI+1/2 as shown in Eq. (4). Layer No. 5 is
a 1D sum pooling layer with a stride of 3, kernel size of 2, and padding of 1, which will obtain shape function at
various nodes. After that, a Conv1D layer (Layer No. 6) is used to obtain the displacement field inside each element.
All trainable weights are in Layer No. 6, which are equivalent to the nodal displacements in FE methods. Finally,
the element displacements are assembled to the total displacement field of the truss structure. As shown in Fig. 3(b),
the shape function of the quadratic truss element can be obtained by multiplication of the linear shape function of
the truss element, which can be achieved by the PyTorch and Nvidia Modulus platform [25] by multiplication of
tensors to obtain the quadratic shape function of truss elements. Similarly, the shape function of higher-order truss
elements (with an order higher than 2) can be formulated in terms of neural networks by multiplication of linear
truss elements.

2.2. Tensor decomposition for 2D and 3D problems

For 2D and 3D static solid mechanics problems, Zhang et al. [31] proposed to use the tensor decomposition
(TD) method in the PINN framework to further reduce the dimension of unknowns. Fig. 4 shows the schematic
7
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

plot of TD represented in terms of neural networks. TD was also known as canonical tensor decomposition [34],
which decomposes a tensor as a summation of rank-one tensors. For the 3D case, the governing equations of TD
are formulated as the following equations:
Q
X (q) (x)Y (q) (y)Z (q) (z)

u TD (x) = u TD (x, y, z) = (5)
q=1
n1
(q)
X (q) (x) =

N I (x)β I (6)
I =1
n2
(q)
Y (q) (y) =

N J (y)γ J (7)
J =1
n3
(q)
Z (q) (z) =

N K (z)θ K (8)
K =1
where Q denotes the total number of modes (typically ranging from 10 to 100 in numerical study); n1 , n2, and
(q) (q) (q)
n3 denote the number of nodes in the x, y, and z directions, respectively. β I , γ J and θ K denote the nodal
displacement weights in the x, y, and z directions in mode number q.
There are two motivations for adopting TD in the EPINN framework. First, the number of trainable parameters
for 3D solid mechanics problems can be reduced to Q×(n1 +n2 +n3 ), which may be notably lower than the number
of DOFs in FE methods (i.e., n1 ×n2 ×n3 ). Second, TD is efficient for the simulation of complicated shapes in solid
mechanics, where the mesh conforming to the shape of the complicated solid body is not needed. TD can be used
to generate the solution field of the box containing the simulation region, which notably reduces the difficulty in
model generation.

2.3. Principle of least work for the loss function

Consider the solid mechanics problems where the labeled data of the solution field are unavailable (i.e. forward
problem), the basic objective in the conventional PINN approach was to minimize the total loss as a sum of boundary
loss and the PDE loss as follows (2D plane stress case was illustrated here for simplicity) as shown below. For
conventional PINNs, the loss function based on weighted residuals is given as follows:
(⏐ ⏐ )
L= λ1 ∂ΩD ⏐u x − u ∗x ⏐Γ + ⏐u y − u ∗y ⏐Γ d∂ΩD
∫ ⏐ ⏐
D D

+λ2 ∂Ω N ⏐σx x cos 2 θ + σ yy sin 2 θ + 2τx y sin θ cosθ − σ ∗ ⏐ d∂ΩN


∫ ⏐ ⏐

+λ2 ∂Ω N ⏐ σ yy − σx x sin θ cosθ + τx y cos 2 θ − sin 2 θ − τ ∗ ⏐ d∂ΩN (9)


∫ ⏐( ) ( ) ⏐

+λ3 Ω ⏐σx x,x + σx y,y + f x∗ ⏐ + ⏐σx y,x + σ yy,y + f y∗ ⏐ dΩ


∫∫ (⏐ ⏐ ⏐ ⏐ )

⏐(λ + 2µ)εx x + λε yy − σx x ⏐ + ⏐(λ + 2µ)ε yy + λεx x − σ yy ⏐ + ⏐µγx y − τx y ⏐ dΩ


∫∫ (⏐ ⏐ ⏐ ⏐ ⏐ ⏐)
+ Ω
where ∂ΩD is the Dirichlet boundary condition (i.e. displacement boundary) of domain Ω , ∂ΩN is the Neumann
boundary (i.e. force boundary) of domain Ω , u ∗x and u ∗y are the displacement components in x and y direction at
displacement boundary condition, σ ∗ and τx∗y are the normal stress and shear stress at the force boundary, λ and µ
are Lame first and Lame second parameters of elastic material, and λ1 , λ2, and λ3 are three coefficients (similar to
the Lagrange multiplier in FE methods) to balance the loss term with different units.
The major issues of conventional PINN loss function in Eq. (9) are as follows. First, the loss function consists
of a constitutive equation, compatibility equation, and boundary loss, which have different units and needs to be
scaled by coefficients λ1 , λ2, and λ3 to formulate a reasonable loss function. Although the coefficients λ1 , λ2, and λ3
can be updated based on loss balancing approaches such as SoftAdapt [26], ReLoBRaLo [27], and GradNorm [28],
there is a lack of physical meaning for the coefficients λ1 , λ2 and λ3 in conventional loss balancing approaches
[26–28], and the training process involving loss balancing approaches may slow down the convergence of the PINN.
Second, because the PDE of static solid mechanics problems are elliptical PDE, the error of boundary condition will
propagate to the whole solution field at infinite speed. Therefore, the existence of a boundary condition error may
notably hinder the convergence of PINN in solving static solid mechanics problems. Although some approaches were
8
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

proposed to generate a solution field to meet the Dirichlet boundary condition and Neumann boundary condition
simultaneously [15], it requires the PINN to predict the displacement field and stress field simultaneously, which may
increase the number of trainable parameters and may be difficult to extend to nonlinear constitutive models [32,33].
In this study, the principle of least work is selected as the loss function for static solid mechanics problems, and
the loss function is formulated as follows:
∫ (∫ ∫ )
) 1
σ (u) : ε (u) dΩ −
(
L u; f, t = u · fdΩ + u · td∂ΩN (10)
2 Ω Ω ∂ΩN
u = u∗∂ΩD + uT D · AD F p (11)
usolution = argminL u; f, t foru (x, y, z) = u∗∂ΩD ∀ (x, y, z) ∈ ∂ΩD
( )
(12)
where u is the displacement field, σ and ε are the stress and strain tensors, respectively, f is the body force and t
is the external traction applied to the force boundary. ADF denotes the Approximate Distance Function from any
point to the Dirichlet boundary of the simulation region [16], p is a positive real number and is fixed as 1.0 in
this study. uT D is the trial displacement field constructed following Eq. (5) in Section 2.2, and u∗∂ΩD denotes the
displacement field at the Dirichlet boundary condition.
EPINN adopts the loss function as the total energy of the system. Based on the principle of least work,
EPINN can efficiently solve solid mechanics problems by minimizing the total work of the system. The exact
displacement boundary condition was required by the principle of least work. The Dirichlet boundary condition
is satisfied throughout the training process by introducing Eq. (11), and the Neumann boundary condition and
force equilibrium equations can be satisfied after the training process converges. Compared to the Deep Ritz
method [13], HiDeNN [29,30], and HiDeNN-TD [31], the exact Dirichlet boundary condition is enforced in the
EPINN framework, which conforms to the requirement of the principle of least work. In addition, when calculating
the numerical integration, the number and coordinates of integration points may differ from the mesh size. It is
recommended that the number of integration points should exceed the number of mesh size in the EPINN framework
to evaluate the integration result efficiently.

2.4. Exact Dirichlet boundary condition

The exact Dirichlet boundary condition is achieved by implementing the following ADF, which was proposed [16]
as an alternative to the signed distance function (SDF). ADF is 0 on the Dirichlet boundaries and the first-order
derivative along the normal direction of the boundary equals 1. For complicated boundary shapes consisting of
various surfaces, it may be too hard to compute the derivative of SDF with respect to coordinates and the derivative
may be stiff, which may hinder the convergence of PINN. In comparison, ADF is a second-order smooth function
with respect to the coordinate, which is favorable for the stochastic gradient descent method to minimize the loss
function in Eq. (10). When there are multiple Dirichlet boundaries, the following equation is used to obtain the ADF
of the whole system to achieve an exact Dirichlet boundary condition. This approach ensures the displacement field
is exactly met on the Dirichlet boundary.
1
AD F = φ(φ1 , . . . , φn ) : = √ (13)
m 1 1 1
(φ1 )m
+ (φ2 )m
+ ··· + (φn )m

where φi denotes the approximate distance from the point (x,y,z) to boundary number i. On the displacement
boundary, φ = 0. m is a positive number and is selected as 2.0 in this study.

2.5. Meshless finite difference for efficient gradient information

Conventional PINN architecture mostly adopts automated differentiation to obtain the gradient of displacement
with respect to coordinates to formulate the strain field. However, the strain localization in some solid mechanics
problems may induce notably high gradient information and may hinder the convergence of conventional PINN
models. Recently, meshless finite differentiation (MFD) was developed in the Nvidia Modulus platform (v22.07) [25]
and numerical examples show that MFD can achieve a speedup of more than 50% based on conventional PINN
architecture [25] on typical mechanics problems. EPINN implemented MFD in the framework as an alternative to
9
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

automated differentiation. The motivations are summarized below. First, in solid mechanics problems, the stress
localization effect may occur, while automated differentiation might induce notably high derivatives in the PINN
framework, while MFD may mitigate the influence of stress localization by pre-definition of increments in each
direction. Second, because the shape function of the 1D truss element requires the mesh size to be input as a
parameter, EPINN has the concept of mesh size in each direction, which is notably different from conventional
PINN models without an explicit definition of the mesh size. Therefore, when MFD is adopted, the spatial increment
of MFD is selected to be equal to the smallest mesh size in three directions in this study.

2.6. Comparison between EPINN with existing PINN architecture

The advantages of the EPINN model compared to the recurrent networks may be summarized as:
(1) Improved training speed due to reduced number of trainable weights: In conventional PINN architecture,
the number of trainable weights required to converge may be very high (exceeding 1 million trainable weights for
many problems), which may notably hinder the solution of solid mechanics problems. In EPINN, based on tensor
decomposition and shape function of the truss element, the number of trainable weights can be notably reduced and
training speeds are notably reduced accordingly.
(2) Extension to nonlinear problems: The EPINN model resembles the FE approach because only the
displacement field is obtained from neural networks, while the strain fields are obtained from MFD of the
displacement field, and the stress field is the output of the strain field. Because deep-learning-based constitutive
models [32] are rapidly developing, the pre-trained deep-learning-based constitutive models [32] can be infused
into EPINN architecture to achieve nonlinear simulation of solid mechanics problems.

3. Performance of EPINN framework for solid mechanics


In this section, three typical cases are illustrated for solving a 1D truss problem, a 2D plane stress problem, and
a 3D solid mechanics problem for the application of EPINN without the need for labeled data (which is the case
for the forward problem). This study successfully implemented EPINN with Nvidia Modulus [25] deep-learning
platform, which is open-source deep-learning software installed on the Osaka University supercomputer SQUID. All
models are trained using Intel Xeon Platinum 8368 CPU with a single Nvidia A100 40 GB SXM GPU accelerator at
Osaka University SQUID (Supercomputer for Quest to Unsolved Interdisciplinary Datascience). The reference PINN
architecture adopted a fully-connected neural network with 6 layers and a hidden layer size of 512 is used without
skip connections. For PINN in the reference group, the loss function adopts the conventional PINN approach. The
initial learning rate is 0.01 and the training precision was completed on TF32 precision [17] for both EPINN and
PINN approaches. The reference FE simulation results in Sections 3.2 and 3.3 was obtained using a single Intel
i7-10870H CPU using the double precision FE solver without GPU acceleration.

3.1. One-dimensional truss under complicated body force

Consider a 1D elastic truss problem reported by Zhang et al. [30] with an elastic modulus of 175, area of 1, and
length of 10 under complicated body force b(x) and fixed at both ends as shown in Fig. 5(a):
4π 2 (x − 2.5)2 − 2π 8π 2 (x − 7.5)2 − 4π
b(x) = − − (14)
eπ (x−2.5)2 eπ(x−7.5)2
The theoretical solution to this 1D problem is derived by Zhang et al. [30] as follows:
1 ( −π (x−2.5)2 ) 2 ( −π (x−7.5)2 ) e−6.25π − e−56.25π
u(x) = e − e−6.25π + e − e−56.25π − x (15)
AE AE 10AE
du 2 ( 2
) 4 ( 2
) e−6.25π − e−56.25π
σx x = E = −π e−π (x−2.5) (x − 2.5) + −π e−π (x−7.5) (x − 7.5) − (16)
dx A A 10A
For PINN architecture, maximum training steps are selected as 2,000,000 steps, and the learning rate decays
at a decay rate of 0.95 for every 1500 steps. For EPINN architecture, the maximum number of training steps is
selected as 50,000 steps, and the learning rate decays at a decay rate of 0.95 for every 500 steps. The mesh size was
selected as 0.1 and the total number of mesh is 100 for EPINN. The shape function of the quadratic truss element
10
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 5. Comparison of EPINN and PINN for simulating 1D truss problem.

was adopted in EPINN. The comparison of the learning curve (i.e., relative L2 error between theoretical solution
and PINN solution) is shown in Fig. 5(b). Based on training results, the EPINN achieved convergence after 427 s of
training with a displacement relative L2 error of 6e-3, while conventional PINN with fully-connected architecture
requires a significant training time of 5957 s to reach a displacement relative L2 error of 0.161. Therefore, EPINN
achieved a speedup of more than 13 times compared to conventional PINN for this truss problem. Fig. 5(c–g) further
compares the simulation results of the axial displacement field obtained from PINN and EPINN. Conventional PINN
may observe a boundary condition error with non-zero displacement at both ends, while the EPINN result did not
show this issue due to the exact Dirichlet boundary condition in the EPINN framework. Fig. 5(h–j) compares
11
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

the stress field predicted by EPINN, which also converged with a slightly larger relative error compared to the
displacement field. Based on the comparison in Fig. 5, EPINN achieved an efficient solution of a 1D truss under
complicated body force.

3.2. Plane stress panel under eccentric tension

Rao et al. [15] reported a benchmark FE model and PINN architecture for the plane-stress panel. Fig. 6(a) shows
the FE simulation results using ABAQUS software as well as the boundary conditions. The panel is elastic material
with an elastic modulus of 10 and a Poisson ratio of 0.2. The panel length and height are both 1.0 and the bottom line
was fixed. The left half of the top surface was subject to vertical tension displacement uy of 0.1 while the horizontal
displacement uy is 0. All other lines are stress-free. ABAQUS implicit solver was used to obtain the solution at
160,000 reduced-integration plane stress elements at a mesh size of 1/400 and the large displacement option was
turned on. For conventional PINN with a fully connected network, maximum training steps are selected as 20,000
steps and the learning rate is decreasing by a decay rate of 0.95 for every 2000 steps. For EPINN architecture, the
maximum training steps are selected as 30,000 steps and the learning rate is decreasing by a decay rate of 0.95 for
every 500 steps. EPINN in this section adopts a mesh size of 1/50, mode number Q of 10 for tensor decomposition,
and first-order shape function of the truss element. Fig. 6(b) shows the time history L2 relative error of displacement
ux field from PINN and EPINN models compared to ABAQUS output. The ABAQUS solver converged after 265 s
of simulation time (using a single Intel i7-10870H CPU). In comparison, the EPINN solution also converged to
ABAQUS simulation results after a training time of 253 s (using a single Nvidia A100 GPU accelerator), which is
even faster than the ABAQUS model. In comparison, conventional PINN architecture did not converge after 5957
s of training. Fig. 6(c–e) shows the comparison between EPINN training results and FE simulation outcome of
the displacement ux field. Fig. 6(f–h) shows the comparison between EPINN training results and FE simulation
outcome of the displacement uy field. Based on the comparison, the EPINN model with the mesh size of 1/50 well
captures the displacement field of FE simulation, which was obtained at a finer mesh size of 1/400. Therefore, the
applicability and accuracy of EPINN for 2D plane stress cases are examined based on this case study.

3.3. Three-dimensional bracket

In this section, a three-dimensional problem is illustrated to show the performance of EPINN in solid mechanics.
Fig. 7(a) shows the schematic plot of the bracket. The bracket has the back face fixed and the shear stress is applied
to the front surface in the negative z direction. The traction force induces shear stress of 0.4 MPa and the rest of
the bracket was stress-free boundaries. The height, weight, and length of the bracket are equal to 1.0 m. The elastic
modulus is 100 GPa and the Poisson ratio is 0.3. Fig. 7(b) shows the finite element mesh in MATLAB using
quadratic tetrahedral elements with a mesh size of 0.003. A total of 1,157,983 nodes are generated in the MATLAB
FE model, and the linear solution of this reference FE model costs a total of 1140 s on CPU. For conventional
PINN with a fully connected network (6 layers, 512 neurons each layer, no skip connection), maximum training
steps are selected as 2,000,000 steps and the learning rate is reduced by a decay rate of 0.95 every 15,000 steps.
For EPINN architecture, the maximum training steps are selected as 10,000 steps and the initial learning rate is
set as 0.001. A total of 25 meshes are used for each direction (i.e. mesh size of 0.04) and the number of modes is
set to 20 for EPINN. The shape function of the linear truss element was used for EPINN. Fig. 7(c–e) shows the
relative L2 error of displacement components of PINN and EPINN compared to MATLAB FE results. As shown in
Fig. 7(c–e), EPINN rapidly converges to the MATLAB FE model within 685 s. In comparison, conventional PINN
with fully-connected architecture converged to a similar level of accuracy after 87052 s of training. Therefore,
EPINN achieved a speedup of 127 times compared to conventional PINN in this 3D solid example.
Fig. 8 shows the comparison of displacement simulation outcomes on the 3D point cloud. Fig. 8(a–c) shows the
EPINN simulation results of the displacement field in the x, y, and z directions. Fig. 8(d–e) shows the FE analysis
results. Fig. 8(f–h) shows the relative error defined as EPINN simulation results minus the FE analysis results. As
shown in Fig. 8, EPINN well captured the simulation results of FE results even with the mesh size of 0.04. In this
study, EPINN trained on Nvidia A100 GPU achieved faster convergence compared to FE analysis and achieved
more than 100 times speedup compared to the conventional PINN method. In general, EPINN rapidly converges to
the solution field of FE simulation results with similar computational time for the forward problem. Compared to
FE analysis, the gradient of the solution field with respect to input parameters can be obtained in real-time, which
is favorable for inverse problems and design problems.
12
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 6. Comparison of EPINN and PINN for simulating plane stress panel.

4. Discussion and extensions


4.1. Extension to nonlinear mechanical systems

The principle of least work is adopted in EPINN for solving static solid mechanics problems. For dynamic
problems, EPINN can also be further extended to solve the nonlinear static analysis and nonlinear dynamic analysis
problems with efficient access to gradient information. In dynamic analysis, FE software based on implicit solvers
generally adopts the implicit Newmark-β method with second-order accuracy, while high-order (typically fourth-
order) implicit Runge–Kutta methods are also adopted in some FE software. Fig. 9(a) shows the proposed EPINN
to reformulate the implicit Newmark-β method with second-order accuracy. The input layer contains the initial
boundary condition, and the trainable parameters are displacement vectors {U2 } to be solved in Step 2. The input
layer and {U2 } will be fed to neural networks to obtain the acceleration {A2 } and velocity {V2 }. The weights of
the neural network will be derived from the Newmark-β method. Subsequently, the internal force {R2 } will be
updated based on {U2 } based on the EPINN framework. Because this step is a static problem, it can be solved
by training EPINN. The loss function is formulated as the total error of the force equilibrium equation, which
will be minimized to solve displacement {U2 }. Fig. 9(b) shows the proposed EPINN architecture to reformulate
the alternative high-order implicit Runge–Kutta method with arbitrary q stages [9], which can increase stable time
13
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 7. Comparison of EPINN and PINN for simulating 3D bracket under uniform force boundary.

increments and exploit the parallel computing capability of GPUs by parallel internal force updating at q stages.
The trainable parameters are displacement and velocity at q stages. The internal force vectors at q stages will be
updated based on displacement vectors in parallel using GPUs. The loss function will be formulated following the
implicit Runge–Kutta method [35] and the optimization function can be used to solve the displacement and velocity
at q stages. Finally, the implicit Runge–Kutta method will infer the displacement {U2 } and velocity {V2 } at the end
of the multi-stage step. For both methods, the gradient of output fields with respect to major input parameters (such
as dimensions of components or external loads) can be obtained using the automated backpropagation of the neural
network efficiently.
In addition, causal training [36] has also been proposed in the PINN research field. Causal training is another
method to ensure the continuous-time PINN models obey causality. This method discretized the time domain and
defined the temporal residual loss. The temporal residual loss ensures the training process will first train the PINN
for initial short time period, and the training of future time will only start upon convergence of previous time steps.
Therefore, the future development includes extending EPINN to include the causal training technique to extend to
nonlinear problems.

4.2. Extension to super-resolution EPINN (SPINN)

In the computational mechanics research field, the super-resolution networks [8,37] were also rapidly adopted in
turbulence super-resolution. In addition, the Fourier Neural Operator (FNO) also achieves zero-shot super-resolution
performance [11]. This study proposes to infuse EPINN with a super-resolution network to achieve high-fidelity
PINN with an acceptable computational cost. The motivation is that EPINN has a mesh size of x, y, and z direction.
The mesh size needs to be fixed before training. After training is complete, when finer results of the solution field
are needed, a super-resolution network can be adopted based on the following steps:
14
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 8. Comparison of EPINN and PINN for simulating 3D bracket under uniform force boundary (in units of mm).

(1) Develop solid mechanics datasets including FE analysis data, test data, and EPINN simulation data.
(2) Pre-train super-resolution networks [8,37] based on solid mechanics dataset.
(3) For specific solid mechanics problems, use EPINN to solve solid mechanics problems with low-resolution
grid size.
(4) Based on the solution of EPINN at a low-resolution grid, use a pre-trained super-resolution network to infer
a high-resolution solution field.

5. Conclusions
To achieve efficient simulation of solid mechanics problems with reduced computational cost, the EPINN
framework is proposed in this study, which adopts the exact Dirichlet boundary condition and principle of least work
for the simulation of solid mechanics problems. There is no requirement for additional labeled data of the solution
field and notably reduces the training cost compared to recurrent networks. Even when no labeled data of the solution
field are input, the EPINN architecture can directly solve the solid mechanics problems by minimizing the PDE loss.
The future extensions of the EPINN model into nonlinear dynamic simulation and super-resolution networks are also
discussed. (5) Because EPINN is a special version of PINN, it can be extended to parametric simulation, structural
health monitoring problems, and design optimization problems with higher flexibility compared to conventional
finite element methods. The major conclusions are summarized as follows:
(1) The EPINN framework adopts the principle of least work and exactly meets the Dirichlet boundary condition
throughout the training process, while the Neumann boundary condition and governing equations (PDEs) are met
15
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

Fig. 9. Proposed EPINN architecture to reformulate implicit time integration methods for dynamic analysis.

when the training converges. Based on the shape function of the 1D truss element and tensor decomposition theory,
the number of trainable weights can be reduced notably compared to conventional PINN architecture. The number
of trainable weights can be even smaller than the DOFs in the FE method at the same spatial resolution.
(2) For the 1D truss problem, EPINN achieved a speedup of 13 times compared to conventional PINN architecture
adopting a fully connected neural network in the case study with notably reduced error.
(3) For the 2D plane stress problem, EPINN rapidly converged to ABAQUS FE simulation results with a
similar solution time compared to ABAQUS implicit solver while conventional PINN did not converge for the
same problem.
16
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

(4) For the 3D solid mechanics problem, EPINN efficiently converged to FE simulation results with lower training
time than the FE solver. EPINN achieved a speedup of 127 times compared to the conventional PINN model with
a reasonable level of accuracy.

Declaration of competing interest


The authors declare that they have no known competing financial interests or personal relationships that could
have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.

Acknowledgments
The authors gratefully acknowledge the financial support provided by the University of Hong Kong Start-up Fund
for New Staff and Seed Fund for Basic Research for New Staff. The authors gratefully acknowledge the financial
support provided by the JSPS (Japan Society for the Promotion of Science) postdoctoral fellowship and Grant-in-Aid
for JSPS Fellows. The authors gratefully acknowledge the support provided by Osaka University Supercomputer
SQUID (Supercomputer for Quest to Unsolved Interdisciplinary Datascience).

References
[1] G.E. Karniadakis, I.G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nat. Rev. Phys. 3
(2021) 422–434.
[2] J.-J. Wang, C. Liu, X. Nie, J.-S. Fan, Y.-J. Zhu, Nonlinear model updating algorithm for biaxial reinforced concrete constitutive models
of shear walls, J. Build. Eng. (2021) 103215.
[3] J.-J. Wang, Y.-F. Liu, X. Nie, Y.L. Mo, Deep convolutional neural networks for semantic segmentation of cracks, Struct. Control.
Health Monit. (2021) 1–18.
[4] A. Yazdani, L. Lu, M. Raissi, G.E. Karniadakis, Systems biology informed deep learning for inferring parameters and hidden dynamics,
PLoS Comput. Biol. 16 (2020) e1007575.
[5] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, B. Ommer, High-resolution image synthesis with latent diffusion models, in: CVPR
2022, 2022.
[6] K. Simoyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: Proc. International Conference on
Learning Representations (ICLR 2015), San Diego, CA, 2015.
[7] S. Bai, J.Z. Kolter, V. Koltun, An empirical evaluation of generic convolutional and recurrent networks for sequence modeling, 2018.
[8] N.S. Ashish Vaswani, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, Attention is all
you need, in: Proc. Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, 2017.
[9] M. Raissi, P. Perdikaris, G.E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and
inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707.
[10] N. Long, R. Maziar, P. Seshaiyer, Efficient physics informed neural networks coupled with domain decomposition methods for solving
coupled multi-physics problems, Adv. Comput. Modeling Simul. (2022) 41–53.
[11] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, A. Anandkumar, Fourier nerual operator for parametric
partial differential equations, in: International Conference on Learning Representations, ICLR, 2021.
[12] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Netw. 2 (1989) 359–366.
[13] W. E, B. Yu, The Deep Ritz Method: A Deep learning-based numerical algorithm for solving variational problems, Commun. Math.
Stat. 6 (2018) 1–12.
[14] E. Haghighat, M. Raissi, A. Moure, H. Gomez, R. Juanes, A physics-informed deep learning framework for inversion and surrogate
modeling in solid mechanics, Comput. Methods Appl. Mech. Engrg. 379 (2021).
[15] C. Rao, H. Sun, Y. Liu, Physics informed deep learning for computational elastodynamics without labeled data, J. Eng. Mech. 147
(2021).
[16] N. Sukumar, A. Srivastava, Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks,
Comput. Methods Appl. Mech. Engrg. 389 (2022).
[17] S.N. Oliver Hennigh, Mohammad Amin Nabian, Akshay Subramaniam, Kaustubh Tangsali, Zhiwei Fang, Max Rietmann, Wonmin
Byeon, Sanjay Choudhry NVIDIA SimNet™: An AI-accelerated multi-physics simulation framework, in: Proc. International Conference
on Computational Science, 2021, pp. 447–461.
[18] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F.A. Hamprecht, Y. Bengio, A. Courville, On the spectral bias of neural
networks, in: 36th International Conference on Machine Learning, California, 2019.
[19] S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM
J. Sci. Comput. 43 (2021) 3055–3081.
[20] R.K. Srivastava, K. Greff, J. Schmidhuber, Training very deep networks, in: Advances in Neural Information Processing Systems, 2015.
17
J. Wang, Y.L. Mo, B. Izzuddin et al. Computer Methods in Applied Mechanics and Engineering 414 (2023) 116184

[21] S. Wang, H. Wang, P. Perdikaris, On the eigenvector bias of fourier feature networks: From regression to solving multi-scale pdes
with physics-informed neural networks, Comput. Methods Appl. Mech. Engrg. 384 (2021).
[22] V. Sitzmann, J.N.P. Martel, A.W. Bergman, D.B. Lindell, G. Wetzstein, Implicit neural representations with periodic activation functions,
in: Proc. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020.
[23] J. Sirignanoa, K. Spiliopoulosb, DGM: A deep learning algorithm for solving partial differential equations, J. Comput. Phys. 375
(2018) 1339–1364.
[24] R. Fathony, A.K. Sahu, D. Willmott, J.Z. Kolter, Multiplicative filter networks, in: International Conference on Learning Representations
(ICLR) Vienna, Austria, 2021.
[25] Nvidia, 2022, https://docs.nvidia.com/deeplearning/modulus/index.html.
[26] A.A. Heydari, C.A. Thompson, A. Mehmood, SoftAdapt: Techniques for Adaptive Loss Weighting of Neural Networks with Multi-Part
Loss Functions, 2019, Preprint.
[27] R. Bischof, M. Kraus, Multi-Objective Loss Balancing for Physics-Informed Deep Learning, 2021, Preprint.
[28] Z. Chen, V. Badrinarayanan, C.-Y. Lee, A. Rabinovich, GradNorm: Gradient normalization for adaptive loss balancing in deep multitask
networks, in: Proc. 35 Th International Conference on Machine Learning, Stockholm, Sweden, 2018.
[29] S. Saha, Z.T. Gan, L. Cheng, J.Y. Gao, O.L. Kafka, X.Y. Xie, H.Y. Li, M. Tajdari, H.A. Kim, W.K. Liu, Hierarchical Deep Learning
Neural Network (HiDeNN): An artificial intelligence (AI) framework for computational science and engineering, Comput. Methods
Appl. Mech. Engrg. 373 (2021) 28.
[30] L. Zhang, L. Cheng, H.Y. Li, J.Y. Gao, C. Yu, R. Domel, Y. Yang, S.Q. Tang, W.K. Liu, Hierarchical deep-learning neural networks:
finite elements and beyond, Comput. Mech. 67 (2021) 207–230.
[31] L. Zhang, Y. Lu, S. Tang, W.K. Liu, HiDeNN-TD: Reduced-order hierarchical deep learning neural networks, Comput. Methods Appl.
Mech. Engrg. 389 (2022).
[32] J.-J. Wang, C. Wang, J.-S. Fan, Y.L. Mo, A deep learning framework for constitutive modeling based on Temporal Convolutional
Network, J. Comput. Phys. 449 (2022).
[33] C. Wang, L. y. Xu, J. s. Fan, A general deep learning framework for history-dependent response prediction based on UA-Seq2Seq
model, Comput. Methods Appl. Mech. Engrg. 372 (2020) 113357.
[34] T.G. Kolda, B.W. Bader, Tensor decompositions and applications, SIAM Rev. 51 (2009) 455–500.
[35] J.C. Butcher, Numerical Methods for Ordinary Differential Equations, third ed., Wiley, 2016.
[36] Sifan Wang, Shyam Sankaran, Paris Perdikaris, Respecting causality is all you need for training physics-informed neural networks,
2022, arXiv preprint arXiv:2203.07404.
[37] S. Anwar, S. Khan, N. Barnes, A deep journey into super-resolution: A survey, ACM Comput. Surv. 53 (2021) 1–34.

18

You might also like