A Machine Learning Approach To Fracture Mechanics Problems
Xing Liu
School of Engineering, Brown University
Providence, RI 02912, USA
[email protected]
Christos E. Athanasiou
School of Engineering, Brown University
Providence, RI 02912, USA
[email protected]
Nitin P. Padture
School of Engineering, Brown University
Providence, RI 02912, USA
[email protected]
Brian W. Sheldon*
Corresponding author
School of Engineering, Brown University
Providence, RI 02912, USA
[email protected]
Huajian Gao*
Corresponding author
School of Engineering, Brown University
Providence, RI 02912, USA
College of Engineering, College of Science
Nanyang Technological University, Singapore
+65 6790 9514
[email protected]
Abstract
However, they are not always accessible in complex problems. A new class of solutions,
based on machine learning (ML) models such as regression trees and neural networks
(NNs), is proposed, and their feasibility and value are demonstrated through the analysis
on regression trees and NNs can provide accurate results for the specific problem, but
simplicity. This example demonstrates that ML solutions are a major improvement over
analytical and empirical solutions in terms of both reliable functionality and rapid
deployment. When analytical solutions are not available, the use of ML solutions can
overcome the limitations of empirical solutions and substantially change the way that
Keywords
1. Introduction
Engineers often seek analytical solutions for simplicity and reliability, which brings
structural analysis and design. However, analytical solutions cannot always be obtained.
empirical solutions are provided to evaluate the plane-strain stress intensity factor at the
crack tip. Both analytical and empirical solutions are known for their rapid deployment
empirical solution. In the case where neither analytical nor empirical solutions are
feasible, the following question can be raised: is there any other possible way to obtain a
solution? Computers are endowed with a strong learning ability by machine learning
(ML) algorithms and could therefore learn by themselves from experiments and
informed directly by experiments and simulations [1,2,3], and thus provide “machine
learning solutions”. These ML solutions are a promising substitute for
analytical and empirical solutions if they can provide rapid and accurate results. In this
context, the initial study presented here, summarized in Fig. 1, uses ML solutions for
Figure 1. ML solutions to engineering problems. (a) Analytical solutions are developed based on a
physical understanding of the problem, which can be inaccessible in complicated cases. An acceptable
compromise that has been used traditionally relies on fitting abundant experimental and/or numerical data
to an empirical solution. This approach can have limited accuracy when dealing with a nonlinear and
complex relationship among high-dimensional data. Therefore, the ML solutions are proposed to overcome
this weakness and provide accurate results rapidly. ML solutions are provided by machines as they learn
from experimental and numerical data and could serve as a promising substitute when analytical and
empirical solutions are not feasible. (b) The specific engineering problem addressed in this work:
determination of fracture toughness by loading (using a nanoindenter) a pre-notched pentagonal cross-section
microcantilever at its end. The microcantilever, which is defined by five dimensions, is milled out of the bulk
material. (c) The ratio of the crack tip plane-strain stress intensity factor to the indentation load is evaluated
from empirical solutions [4,5], FEM simulations and ML solutions for different cantilever lengths, with the other
four dimensions fixed at 5.20, 4.80, 0.90 and 1.00. The ML solution can achieve
comparable accuracy to FEM results when the empirical solution fails.
2. Experimental
This investigation is based on a mechanical test that is used to measure the mode-I
section microcantilever is cut with focused ion beam (FIB) micromachining in a Helios
NanoLab 450 system (FEI, Oregon, USA), as shown in Fig. 1b. To induce a well-defined
controlled fracture event, a sharp pre-notch is milled at a distance from the fixed end
i.e., a coarse cut with a moderate current followed by a second finer cut with the lowest
possible current, 1 pA. This strategy restricts the notch radius to well below 50 nm and
eliminates the artifacts caused by ion-implantation damage around the notch tip during
FIB milling. In this way, the notch tip is sharp enough to behave like an ideal crack and
ensure the validity of the measurement [8,9]. After the microcantilever fabrication, a
Berkovich tip is used to apply a controlled load at the free end of the cantilever and the
inherently unstable, a catastrophic crack growth would be observed, i.e., a peak load
followed by a displacement burst. Therefore, the fracture toughness is correlated with the
toughness measurements are summarized in Fig. 2b. The specimen sizes are sufficiently
larger than the grain sizes, so that a homogeneous polycrystalline microstructure is sampled
Figure 2. Fracture toughness tests on polycrystalline silicon specimens. (a) The scanning electron
microscope (SEM) image shows the exact geometry of the polysilicon specimens. (b) Fracture toughness
values of these three specimens were evaluated from empirical solutions, FEM simulations and ML
solutions for comparison. The dimensions of the polysilicon specimens are listed in Table A.1.
3. Theory/calculation
fracture mechanics and requires a solution that describes the plane-strain stress intensity
factor at the crack tip for a given load. The exact microcantilever geometry is limited by
practical considerations related to the use of the FIB instrument and by the need to obtain
a well-defined controlled fracture event during loading. With the geometries that are
typically used, it is impossible to derive analytical or empirical solutions that are highly
accurate over the full range of relevant sample dimensions. An early attempt [4,5], based
significantly from the results of finite element method (FEM), whose accuracy is
guaranteed as shown in Fig. 1c. It is certainly possible to obtain accurate solutions by
running an FEM simulation for each specimen. However, this is time-consuming and
impractical if there are a large number of specimens with variations in the specimen
toughness. In this context, a ML solution can provide a valuable tool to accelerate the
implementation of a specific fracture toughness test. More generally, this provides a tool
accuracy (comparable to FEM simulations). For the example investigated here, the ML
solution should capture the relationship among the plane-strain stress intensity factor
at the notch tip, the indentation load, and the specimen dimensions.
Developing a ML solution requires a data set that includes desired inputs and outputs
for a large number of samples. These data are then used to train appropriate ML models,
which can evolve to optimize performance. The quality and quantity of the data dictate
the accuracy that can be achieved. Therefore, it is of great importance to build a good
data set, comprising a combination of both input and target data. The way that these data
are defined can greatly affect the ML process. The fracture toughness measurements used
here are assumed to strictly follow the assumptions of linear-elastic fracture mechanics,
such that the calculation of the plane-strain stress intensity factor is a boundary value
problem in linear elasticity. The boundary conditions of the problem imply that the crack
tip plane-strain stress intensity factor is directly proportional to the indentation load,
and the ratio between them depends only on the specimen dimensions.
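Dimensional analysis indicates that five independent dimensionless variables are relevant:
four ratios of the specimen dimensions and the normalized ratio of the stress intensity factor
to the indentation load. Based on this, the following structure of the ML
solutions is employed:
Input variables: $x = (x_1, x_2, x_3, x_4)$, the four dimensionless ratios of the specimen dimensions
Target variable: $y$, the dimensionless ratio of the stress intensity factor to the indentation load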
Based on the specimen sizes used for typical measurements, the parameter space of
In order to fully explore the parameter space of input variables, a grid-search strategy
is adopted when the data set for ML is generated. Each input domain $x_i$ in
$(x_1, x_2, x_3, x_4)$ is discretized into $n_i$ uniform intervals and a grid of all
possible input variables is then constructed inside the parameter space. This gives $N =
\prod_{i=1}^{4} (n_i + 1)$ samples. A grid with
$(n_1, n_2, n_3, n_4) = (35, 100, 10, 10)$ is then used to generate a data set for ML with
$N = 439{,}956$ samples. Since the target variable in this problem is continuous and
bounded over the whole input parametric space, the accuracy of the ML solutions over
the continuous space can be estimated in a reliable manner by sampling these discrete
points.
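The grid construction described above can be sketched in a few lines of Python. The interval counts (35, 100, 10, 10) and the resulting 439,956 samples come from the text; the bounds in `BOUNDS` are hypothetical placeholders rather than the ranges used in this work, and the helper names are illustrative only.

```python
import itertools
import math

# Interval counts per input variable, as stated in the text.
INTERVALS = (35, 100, 10, 10)

# Hypothetical bounds for the four dimensionless inputs (assumed for illustration).
BOUNDS = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0))


def axis_points(lower, upper, n_intervals):
    """Return n_intervals + 1 equally spaced points on [lower, upper]."""
    step = (upper - lower) / n_intervals
    return [lower + k * step for k in range(n_intervals + 1)]


def build_grid(bounds, intervals):
    """Return an iterator over the full Cartesian product of grid points."""
    axes = [axis_points(lo, hi, n) for (lo, hi), n in zip(bounds, intervals)]
    return itertools.product(*axes)


# Number of samples: product of (n_i + 1) over the four inputs.
n_samples = math.prod(n + 1 for n in INTERVALS)

# Count the grid points actually generated, as a consistency check.
grid_size = sum(1 for _ in build_grid(BOUNDS, INTERVALS))

print(n_samples, grid_size)
```

Each grid point would then be passed to one FEM simulation to obtain its target value.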
FEM is used to evaluate the target variable $y$ for each sample. The direct domain J-integral
[10,11] and elastic compliance [12] are the two conventional methods to evaluate
the energy release rate, $G$, and the plane-strain stress intensity factor,
$K_I = \sqrt{E G / (1 - \nu^2)}$, at the crack tip, where $E$ is Young's modulus and $\nu$ is
Poisson's ratio. The first evaluates $G$ from the direct domain J-integral.
The elastic compliance method calculates the rate of change of the compliance with crack
extension for a given indentation load, which then gives the rate of potential
energy loss, which is equivalent to $G$. Both methods provide an accurate value of the
computational cost of the J-integral method is significantly lower than the elastic
compliance method since the latter requires multiple simulations for different crack
especially when the size of the data set is huge, and thus the J-integral method is clearly
the best choice. This was implemented through user-defined subroutines in FEAP [13].
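As a small illustration of the conversion from energy release rate to stress intensity factor, the standard plane-strain relation K_I = sqrt(E G / (1 - nu^2)) can be written as a short function. The function name and the material values below are assumptions chosen for illustration; they are not taken from the FEAP subroutines or from the specimens in this work.

```python
import math


def stress_intensity_from_energy_release_rate(G, E, nu):
    """Plane-strain mode-I relation: K_I = sqrt(E * G / (1 - nu**2)).

    G  : energy release rate (J/m^2)
    E  : Young's modulus (Pa)
    nu : Poisson's ratio (dimensionless)
    """
    if G < 0.0 or E <= 0.0 or not (-1.0 < nu <= 0.5):
        raise ValueError("non-physical input")
    return math.sqrt(E * G / (1.0 - nu ** 2))


# Illustrative values only (assumed, not from the paper).
E = 160e9   # Pa
nu = 0.22
G = 5.0     # J/m^2

K_I = stress_intensity_from_energy_release_rate(G, E, nu)
print(f"K_I = {K_I / 1e6:.3f} MPa m^0.5")
```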
An encastre boundary condition was applied to the fixed end of the cantilever and a
displacement-controlled boundary condition to the free end. More than 10,000 full
integration elements (C3D20) were employed with the mesh being refined near the crack
tip. The average wall time for each simulation was 120 seconds when running in parallel
on 16 CPUs. The generation of the whole data set took roughly
$\frac{439{,}956 \cdot 120 \cdot 16}{320 \cdot 60 \cdot 60 \cdot 24} \approx 31$ days,
with 20 jobs running simultaneously on 320 CPUs. The elapsed time can be further
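The elapsed-time estimate can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text (439,956 samples, 120 s per simulation on 16 CPUs, 320 CPUs in total); the variable names are illustrative.

```python
# Figures quoted in the text.
N_SAMPLES = 439_956
WALL_TIME_PER_RUN_S = 120   # seconds per simulation on 16 CPUs
CPUS_PER_RUN = 16
TOTAL_CPUS = 320

# Number of simulations that can run side by side.
concurrent_jobs = TOTAL_CPUS // CPUS_PER_RUN

# Total elapsed time, assuming all job slots stay busy.
total_seconds = N_SAMPLES * WALL_TIME_PER_RUN_S / concurrent_jobs
total_days = total_seconds / (60 * 60 * 24)

print(f"{concurrent_jobs} concurrent jobs, about {total_days:.1f} days")
```

The result, about 30.6 days, is consistent with the roughly 31 days quoted above.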
There is a wide variety of ML models and algorithms, and one needs to find the most
appropriate one for their specific problems. After screening the popular ML models, it is
found that regression trees (RTs) and their ensembles, as well as neural networks (NNs), are most
4. Results
The tree-based method for regression can capture the highly non-linear relationship
regression tree is built through dividing the input parametric space into distinct, non-
overlapping regions known as leaf nodes. Each leaf node is labeled with the average
value of targets of the training samples that fall into the region, as shown in Fig. 3. A
given input would flow from the root node, through the internal branch nodes, into the
terminal leaf node, and produce a prediction of target variable by the label of the leaf
the open-source package Scikit-learn [15] to grow the regression trees. The complexity
and size of a grown regression tree are controlled by the maximum tree depth and the
minimum leaf size. The tree depth represents the maximum number of edges from the
root node to a leaf node and is limited to a moderate range of 4 to 8. The leaf size
represents the minimum number of samples required for each leaf node, and a reasonable lower
bound is set to 50. With these parameters, overfitting can be effectively eliminated.
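A minimal sketch of how such a tree could be grown with Scikit-learn is given below. It assumes that the maximum tree depth and minimum leaf size map onto the `max_depth` and `min_samples_leaf` parameters of `DecisionTreeRegressor`; the synthetic data stand in for the FEM data set and are not the data used in this work.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the data set: four inputs, one smooth target (assumed).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5000, 4))
y = 2.0 + np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.5 * X[:, 3] ** 2

# Depth within the 4 to 8 range and at least 50 samples per leaf, as in the text.
tree = DecisionTreeRegressor(max_depth=6, min_samples_leaf=50, random_state=0)
tree.fit(X, y)

# Count how many training samples fall into each leaf node.
leaf_ids = tree.apply(X)
leaf_sizes = np.bincount(leaf_ids)
leaf_sizes = leaf_sizes[leaf_sizes > 0]

print("depth:", tree.get_depth())
print("leaves:", tree.get_n_leaves())
print("smallest leaf:", int(leaf_sizes.min()))
```

The two limits cap the number of leaves (at most 2^6 = 64 here), which is what keeps a single tree from memorizing individual samples.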
The structure and performance of the regression tree solutions are summarized in Fig.
3 and Table A.2. The accuracy of the solutions is measured by the maximum value of the
absolute percentage error, $|y_{pred} - y_{true}| / |y_{true}| \times 100\%$,
over all the samples in the data set, where $y_{pred}$ is the target value predicted by the ML
solution and $y_{true}$ is the target value of the data set which is initially obtained from FEM
simulations. As a limit is set on the minimum leaf size, the accuracy of these regression
trees saturates before reaching an acceptably low value, as shown in Fig. 3 and Table A.2.
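The accuracy measure used here, the maximum absolute percentage error over all samples, can be written as a short helper. This is a sketch in plain Python; the function name is illustrative and the numbers in the usage example are made up.

```python
def max_absolute_percentage_error(y_true, y_pred):
    """Return the largest value of |y_pred - y_true| / |y_true| * 100 over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return max(abs(p - t) / abs(t) * 100.0 for t, p in zip(y_true, y_pred))


# Made-up reference values and predictions, for illustration only.
reference = [1.0, 2.0, 4.0]
predicted = [1.1, 2.0, 3.8]

worst_case = max_absolute_percentage_error(reference, predicted)
print(f"max. APE = {worst_case:.1f}%")
```

Because the measure is a maximum rather than an average, a single poorly predicted sample dominates it, which makes it a conservative indicator of reliability.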
regions; each region is represented by a leaf node and corresponds to a value of the prediction of target
variable. (b) A regression-tree ensemble combines a set of weak regression trees and makes predictions by
aggregating responses from each single regression tree. (c) Ensemble methods, such as GBRT, can achieve
much higher accuracy (i.e., smaller absolute percentage error) than a single regression tree.
Although a single regression tree is not powerful enough to address the complex
problem, it can be used as a building block to construct a better model with stronger
learning ability, i.e., a regression tree ensemble, as shown in Fig. 3. Given an input of
then synthesized into a final prediction of the target variable $y$. The Gradient Tree Boosting
algorithm [16,17] is implemented to build the regression tree ensembles which allows for
A series of gradient-boosted regression trees (GBRT) with least squares loss is
constructed with different numbers and sizes of the basic regression trees, and their
structures and performance are summarized in Fig. 3 and Table A.2. Interestingly, the
Thus, the learning capability of the regression tree ensemble method in producing an
ensemble typically contains a large number of basic trees which drastically increases the
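The gain from boosting can be illustrated with Scikit-learn's `GradientBoostingRegressor`, whose default loss is least squares. The sketch below is not the configuration used in this work: the synthetic data, the number of trees and the tree sizes are assumptions chosen only to show a single tree and an ensemble side by side.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the data set (assumed); the target is strictly positive.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5000, 4))
y = 2.0 + np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.5 * X[:, 3] ** 2


def max_ape(y_true, y_pred):
    """Maximum absolute percentage error over all samples."""
    return float(np.max(np.abs(y_pred - y_true) / np.abs(y_true)) * 100.0)


# A single size-limited regression tree.
single_tree = DecisionTreeRegressor(
    max_depth=6, min_samples_leaf=50, random_state=0
).fit(X, y)

# An ensemble of small trees fitted by gradient boosting (default least squares loss).
ensemble = GradientBoostingRegressor(
    n_estimators=200, max_depth=4, min_samples_leaf=50,
    learning_rate=0.1, random_state=0,
).fit(X, y)

tree_error = max_ape(y, single_tree.predict(X))
ensemble_error = max_ape(y, ensemble.predict(X))
print(f"single tree: {tree_error:.2f}%  ensemble: {ensemble_error:.2f}%")
```

The ensemble stores 200 trees instead of one, which illustrates the trade-off between accuracy and model size discussed in the text.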
NNs are computing systems with interconnected nodes inspired by biological neural
networks in the human brain. NNs are known for their power in approximating the
output layer, and hidden layers in-between. These layers are interconnected to form a
in one layer is connected to all the nodes in the adjacent layers and information is fed
between layers only in the forward direction. As the input data $x = (x_1, x_2, x_3, x_4)$ is fed
into the NN through the input layer, each node in the next hidden layer processes data
from the input layer and feeds the next layer through an activation function. Finally, the
output layer collects data from the last hidden layer and produces the target data, $y$. The
nodes in each hidden layer. Simple NNs with 1 or 2 hidden layers with the rectified linear
unit (ReLU) activation function, i.e., single-layer and multilayer
robust ML algorithm, such that they can learn and evolve continuously as new data is fed
to them. The Nadam algorithm [19] with the Log-Cosh loss function is adopted for the training of
NNs using the open-source platform TensorFlow r2.0 [20]. To fully exploit the learning
capability of NNs, a large portion of the data set is utilized for training while keeping a
small portion for validation and testing. Therefore, the data set is split into a training data set
(70%), a validation data set (15%), and a test data set (15%). By collecting and reviewing the
accuracy on these three data sets during the training process, as shown in Fig. A.1, it is
demonstrated that all the NNs have converged and are not overfitted.
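The 70/15/15 split and the Log-Cosh loss can be sketched in plain Python as follows. This is not the TensorFlow training code used in this work: the function names, the shuffling seed and the rounding of the subset sizes are assumptions made for illustration.

```python
import math
import random


def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle sample indices and split them into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n_samples)
    n_val = int(fractions[1] * n_samples)
    # The test set takes whatever remains after the first two subsets.
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def log_cosh_loss(y_true, y_pred):
    """Mean of log(cosh(y_pred - y_true)), evaluated in a numerically stable form."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(p - t)
        # log(cosh(r)) = r + log(1 + exp(-2 r)) - log(2)
        total += r + math.log1p(math.exp(-2.0 * r)) - math.log(2.0)
    return total / len(y_true)


train_idx, val_idx, test_idx = split_indices(439_956)
print(len(train_idx), len(val_idx), len(test_idx))
print(log_cosh_loss([1.0, 2.0], [1.1, 1.9]))
```

The Log-Cosh loss behaves like a squared error for small residuals and like an absolute error for large ones, which makes it less sensitive to outliers than the least squares loss.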
The structure and performance of the NN-based solutions are summarized in Fig. 4
and Table A.3. It turns out that solutions with high accuracy can be achieved even with
simple NNs: a NN with only 29 neurons (4/16/8/1) can produce an accuracy of less than
solutions to tune the accuracy according to the application needs. Surprisingly, by using a
large data set, i.e., $N$ = 439,956, even with a small number of neurons, i.e., 29, the
Figure 4. NN-based solutions. A NN is composed of an input layer, hidden layers, and an output layer. In
perceptrons, denoted by their node counts as “input/hidden/output”, and (b) multilayer perceptrons, denoted as
“input/hidden 1/hidden 2/output”, where each entry is the number of nodes in the corresponding layer. (c)
ML solutions with high accuracy (i.e., small absolute percentage error) can be obtained even with simple
NNs. One can choose appropriate NN-based solutions to tune the accuracy according to the application
needs.
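To make the size of such a network concrete, the following sketch builds a 4/16/8/1 multilayer perceptron with ReLU hidden layers in plain Python and counts its nodes and trainable parameters. The weights are random placeholders rather than trained values, and the code is not the TensorFlow model used in this work.

```python
import random

# Node counts per layer: input / hidden 1 / hidden 2 / output.
LAYERS = (4, 16, 8, 1)


def init_params(layers, seed=0):
    """Create placeholder weights and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Feed x forward; ReLU on hidden layers, linear output layer."""
    a = list(x)
    last = len(params) - 1
    for k, (W, b) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, a)) + bias for row, bias in zip(W, b)]
        a = z if k == last else [max(0.0, v) for v in z]
    return a[0]


params = init_params(LAYERS)
n_neurons = sum(LAYERS)
n_parameters = sum(n_in * n_out + n_out for n_in, n_out in zip(LAYERS[:-1], LAYERS[1:]))

print(n_neurons, n_parameters)
print(forward(params, [0.5, 0.9, 0.1, 0.2]))
```

Counting all four layers gives the 29 neurons quoted in the text, with 225 weights and biases to be trained.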
4.3 Portable deployment of ML solutions
can provide accurate results for the specific problem. However, NN-based solutions
outperform regression-tree-based solutions in terms of their simplicity (Fig. 3 and Fig. 4).
hundreds of regression trees and each regression tree consists of hundreds of nodes.
solution. In contrast, a NN-based solution can achieve the same level of accuracy only
JavaScript Object Notation (JSON), and shared among researchers and engineers in the
community. The standard data-interchange format sets the scene for researchers to
user interface can be further developed for accessing and visualizing the problem and the
established standard ML solution. In this context, the ML solution for the fracture
toughness measurements has been created and shared through a web-based application,
named “SIF Calculator” [21]. The ML solution can achieve comparable accuracy to FEM
results as shown in Fig. 1c and Fig. 2b. The overview of the deployment and positioning
Figure 5. Functionality and positioning of a ML solution to engineering problems. The optimal ML
solution can be exported in an open-standard file format, such as JSON, and shared among researchers for
further operations and development. It can also be established as an engineering standard and stored online
for public access. Cloud or web-based applications can be developed to interactively provide fast and
accurate solutions to engineers.
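Exporting a small NN to JSON and reloading it can be sketched as follows. The schema (the keys `layers`, `activation`, `weights` and `biases`) and the numerical values are hypothetical; they are not the file format used by the SIF Calculator.

```python
import json

# Hypothetical schema for a tiny 4/2/1 network; the values are made up.
model = {
    "layers": [4, 2, 1],
    "activation": "relu",
    "weights": [
        [[0.2, -0.1, 0.4, 0.3], [-0.5, 0.6, 0.1, 0.2]],
        [[0.7, -0.3]],
    ],
    "biases": [[0.1, -0.2], [0.05]],
}


def predict(net, x):
    """Evaluate the network stored in the dictionary: ReLU hidden layers, linear output."""
    a = list(x)
    last = len(net["weights"]) - 1
    for k, (W, b) in enumerate(zip(net["weights"], net["biases"])):
        z = [sum(w * v for w, v in zip(row, a)) + bias for row, bias in zip(W, b)]
        a = z if k == last else [max(0.0, v) for v in z]
    return a[0]


# Serialize to a JSON string and restore it, as a stand-in for file or web transfer.
payload = json.dumps(model, indent=2)
restored = json.loads(payload)

sample = [0.5, 0.9, 0.1, 0.2]
print(predict(model, sample) == predict(restored, sample))
```

Because the whole model is a few lists of numbers, the same file can be read by a web front end or any other language with a JSON parser.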
5. Discussion
promising substitute when analytical and empirical solutions are not accessible. The
feasibility and advantage of the ML solutions are demonstrated through the application in
methods [22,23,24,25] and indentation-based methods [26,27,28]. The development
quantities are involved. Dimensional analysis can assist in finding all the relevant
dimensionless numbers from these quantities, and the essence of the solution is to
2. Preparation of a data set. The dimensionless quantities involved can be assorted into
input variables and target variables based on their physical meanings. It is necessary to
explicitly define the parameter space of the input variables, which covers the whole
region of interest (ROI). Since the reliability of the ML solutions is of high priority, these
solutions are restrained from doing extrapolation and the performance within ROI is
emphasized. The quality of the data set directly controls the accuracy of the solutions.
3. Model selection and training. There are a large variety of ML models that can be
employed. Based on the data set and type of problem, the appropriate ML models should
be selected. The optimal solution is achieved by evaluating their accuracy and simplicity.
community in an open-standard format. Engineers can obtain rapid and accurate results
by interacting with cloud or web-based applications which can be integrated with the ML
solution.
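The restriction to the region of interest described in the steps above can be enforced with a simple guard in front of the model. The sketch below is one possible way to do this; the bounds in `ROI`, the input names and the toy model are hypothetical and serve only as an illustration.

```python
# Hypothetical region of interest for four dimensionless inputs (assumed bounds).
ROI = {
    "x1": (0.0, 1.0),
    "x2": (0.0, 1.0),
    "x3": (0.0, 1.0),
    "x4": (0.0, 1.0),
}


def check_within_roi(inputs, roi):
    """Raise ValueError if any input lies outside its bounds; return True otherwise."""
    for name, (lower, upper) in roi.items():
        value = inputs[name]
        if not lower <= value <= upper:
            raise ValueError(f"{name}={value} is outside the ROI [{lower}, {upper}]")
    return True


def guarded_predict(model_fn, inputs, roi):
    """Evaluate the model only for inputs inside the ROI, so it never extrapolates."""
    check_within_roi(inputs, roi)
    return model_fn(inputs)


def toy_model(inputs):
    """Placeholder for a trained ML solution."""
    return sum(inputs.values())


inside = {"x1": 0.2, "x2": 0.4, "x3": 0.6, "x4": 0.8}
print(guarded_predict(toy_model, inside, ROI))
```

Rejecting out-of-range inputs keeps the reported accuracy meaningful, since that accuracy was only verified inside the sampled parameter space.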
6. Conclusions
advantages in dealing with the nonlinear and complex relationship among high-
simplicity of employing ML solutions also creates a foundation for the
materials science. Possible examples include predictions of surface roughness and
temperature rise (target variables) in metal cutting under different machining conditions
Therefore, ML solutions can substantially change the way that engineering problems are
solved.
Acknowledgements
We acknowledge the financial support from the US Department of Energy Basic Energy
Sciences Grant # DE-SC0018113 for this research. This work was conducted using the
computational resources and services at the Brown University Center for Computation
and Visualization.
Appendix A
X Y Z [\ [] ^_` bcd .? ⋅ .6
2
. 2 . 2 . 2 . 2 . 2 . a2 Empirical1 FEM ML
PolySi 1 5.00 0.90 5.00 1.00 10.00 2.30 1.73 0.97 0.97
PolySi 2 4.80 0.90 5.20 1.00 10.00 2.35 1.77 0.99 0.99
PolySi 3 5.20 1.00 5.50 1.00 11.00 2.50 1.78 0.98 0.98
1 The empirical solution was developed in [4, 5].
Table A.2. Construction and performance of regression-tree-based solutions.
RT 1 5 32 3240 23.828%
RT 1 6 64 1296 19.982%
RT 1 8 256 88 17.146%
Table A.3. Construction and performance of NN-based solutions.
Figure A.1. Training history of NNs. The maximum absolute percentage error (max. APE) on the training
and validation data sets are recorded over the number of epochs for some simple NNs: (a) 4/16/1, (b)
4/32/1, (c) 4/64/1, (d) 4/128/1, (e) 4/256/1, (f) 4/512/1, (g) 4/8/8/1, (h) 4/8/16/1, (i) 4/16/8/1, (j) 4/16/16/1,
(k) 4/32/32/1 and (l) 4/64/64/1. Each plot shows that the training and validation accuracy reach the same
plateau, which indicates that the NN is converged and not overfitted.
References
[11] T.D. Nguyen, S. Govindjee, P.A. Klein, H. Gao, A material force method for
inelastic fracture mechanics, Journal of the Mechanics and Physics of Solids 53 (2005)
91-121. https://doi.org/10.1016/j.jmps.2004.06.010.
[12] T.L. Anderson, Fracture Mechanics: Fundamentals and Applications, CRC Press,
Boca Raton, 2005.
[13] R. L. Taylor, FEAP - A Finite Element Analysis Program, University of California,
Berkeley, 2014.
[14] L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression
Trees, CRC, Boca Raton, 1984.
[15] F. Pedregosa et al., Scikit-learn: Machine learning in Python, Journal of Machine
Learning Research 12 (2011) 2825-2830.
[16] J.H. Friedman, Greedy function approximation: a gradient boosting machine, Annals
of Statistics 29 (2001) 1189-1232.
[17] J.H. Friedman, Stochastic gradient boosting, Computational Statistics & Data
Analysis 38 (2002) 367-378. https://doi.org/10.1016/S0167-9473(01)00065-2.
[18] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural
Networks 4 (1991) 251-257. https://doi.org/10.1016/0893-6080(91)90009-T.
[19] T. Dozat, Incorporating Nesterov Momentum into Adam, Proceedings of the
International Conference on Learning Representation (2018).
[20] M. Abadi et al., TensorFlow: large-scale machine learning on heterogeneous
distributed systems, Preprint at http://arxiv.org/abs/1603.04467 (2016).
[21] X. Liu, SIF Calculator. https://hint1412.github.io/XLiu.github.io/SIF/, 2019
(accessed 30 December 2019).
[22] D.E. Armstrong, A.J. Wilkinson, S.G. Roberts, Measuring anisotropy in Young’s
modulus of copper using microcantilever testing, Journal of Materials Research 24 (2009)
3268-3276. https://doi.org/10.1557/jmr.2009.0396.
[23] M.G. Mueller et al., Fracture toughness testing of nanocrystalline alumina and fused
quartz using chevron-notched microbeams, Acta Materialia 86 (2015) 385-395.
https://doi.org/10.1016/j.actamat.2014.12.016.
[24] G. Žagar, V. Pejchal, M.G. Mueller, L. Michelet, A. Mortensen, Fracture toughness
measurement in fused quartz using triangular chevron-notched micro-cantilevers, Scripta
Materialia 112 (2016) 132-135. https://doi.org/10.1016/j.scriptamat.2015.09.032.
[25] A.D. Norton, S. Falco, N. Young, J. Severs, R.I. Todd, Microcantilever investigation
of fracture toughness and subcritical crack growth on the scale of the microstructure in
Al2O3, Journal of the European Ceramic Society 35 (2015) 4521-4533.
https://doi.org/10.1016/j.jeurceramsoc.2015.08.023.
[26] X. Wang et al., High damage tolerance of electrochemically lithiated silicon, Nature
Communications 6 (2015) 8417. https://doi.org/10.1038/ncomms9417.
[27] M. Sebastiani, K.E. Johanns, E.G. Herbert, F. Carassiti, G.M. Pharr, A novel pillar
indentation splitting test for measuring fracture toughness of thin ceramic coatings,
Philosophical Magazine 95 (2015) 1928-1944.
https://doi.org/10.1080/14786435.2014.913110.
[28] M. Sebastiani, K.E. Johanns, E.G. Herbert, G.M. Pharr, Measurement of fracture
toughness by nanoindentation methods: Recent advances and future challenges, Current
Opinion in Solid State and Materials Science 19 (2015) 324-333.
https://doi.org/10.1016/j.cossms.2015.04.003.