
Automation in Construction 128 (2021) 103779

Contents lists available at ScienceDirect

Automation in Construction
journal homepage: www.elsevier.com/locate/autcon

Machine learning forecasting models of disc cutters life of tunnel boring machine
Arsalan Mahmoodzadeh a, *, Mokhtar Mohammadi b, Hawkar Hashim Ibrahim c,
Sazan Nariman Abdulhamid c, Hunar Farid Hama Ali a, Ahmed Mohammed Hasan c,
Mohammad Khishe b, Hoger Mahmud d
a Department of Civil Engineering, University of Halabja, Halabja, Kurdistan Region, Iraq
b Department of Information Technology, Lebanese French University, Erbil, Kurdistan Region, Iraq
c Civil Engineering Department, College of Engineering, Salahaddin University-Erbil, 44002 Erbil, Kurdistan Region, Iraq
d Computer Science Department, College of Science and Technology, University of Human Development, Sulaymaniyah, Iraq

ARTICLE INFO

Keywords: Tunneling; Tunnel boring machine (TBM); Machine learning (ML); TBM disc cutter life

ABSTRACT

This study proposes four machine learning methods, Gaussian process regression (GPR), support vector regression (SVR), decision trees (DT), and K-nearest neighbors (KNN), to predict the disc cutter life of a TBM. A database of 200 datasets monitored during the Alborz service tunnel construction in Iran, including TBM operational parameters, geometry, and geological conditions, was applied in the models. The 5-fold cross-validation method was used to investigate the prediction performance of the models. The GPR model, with R² = 0.8866 and RMSE = 107.3554, was the most accurate model for predicting TBM disc cutter life. The KNN model, with R² = 0.1753 and RMSE = 288.9277, was the least accurate. To assess each parameter's contribution to the prediction problem, the backward selection method was used. The results showed that the TF, RPM, PR, and Qc parameters contribute significantly to TBM disc cutter life, with RPM and PR being respectively more and less significant than the others.

1. Introduction

The full-face rock tunnel boring machine (TBM) has found many applications owing to the rapid development of national construction and underground engineering technology. Because the disc cutter is considered one of the most significant parts of a TBM, it has attracted the interest of several researchers [1–3]. Among the most common topics in TBM disc cutter research are disc cutter life and disc cutter wear, which carry significant practical and economic value in the tunneling process. A disc cutter is generally used as a rolling rock-breaking tool on a hard rock TBM cutterhead. The disc cutter is in direct contact with the hard rock while the cutterhead works: the disc cutters roll and, under the cutterhead thrust and torque, grind the hard rock.

In the last two decades, the use of rock TBMs in tunneling projects has increased remarkably worldwide. Hence, accurate prediction of TBM disc cutter life under different conditions can be essential. Several prediction models or adjustment factors for estimating TBM disc cutter life and disc cutter wear have been developed [3,4].

Studies on disc cutter wear and disc cutter life prediction can be roughly categorized into two groups. In some studies, disc cutter wear is predicted from mechanical computation of the interaction between the rocks and the cutters. Plinninger et al. [5] investigated the Cerchar Abrasivity Index (CAI) based on the experimental conditions and rock mass properties. Michalakopoulos et al. [6] worked on the CAI considering the effect of steel styli. Other studies have instead predicted disc cutter life from statistical relationships between the rock condition, cutter life, and TBM behavior. Hassanpour [7] and Liu et al. [8] established mathematical equations for predicting TBM disc cutter life by performing single and multiple regression analyses. In their studies, several rock properties, including UCS, CAI, quartz content, Vickers hardness number of rock

* Corresponding author.
E-mail addresses: [email protected] (A. Mahmoodzadeh), [email protected] (M. Mohammadi), [email protected] (H. Hashim
Ibrahim), [email protected] (S. Nariman Abdulhamid), [email protected] (H. Farid Hama Ali), [email protected] (A. Mohammed
Hasan), [email protected] (M. Khishe), [email protected] (H. Mahmud).

https://doi.org/10.1016/j.autcon.2021.103779
Received 18 January 2021; Received in revised form 18 April 2021; Accepted 19 May 2021
Available online 24 May 2021
0926-5805/© 2021 Elsevier B.V. All rights reserved.

(VHNR), joint count number (Jv), and rock abrasivity index (RAI), were included.

Recently, soft-computing artificial intelligence (AI) techniques such as regression, classification, optimization, and group method of data handling (GMDH)-type neural networks (NN) have been successfully used in a wide range of engineering problems [9–27]. However, AI techniques have not yet been widely used to predict TBM disc cutter life. Recently, Elbaz et al. [28] applied a genetic algorithm to predict disc cutter life during shield tunneling. Their proposed model predicted TBM disc cutter life with good accuracy, and the sensitivity analysis revealed that the penetration rate significantly influences disc cutter life. Since there are so many different AI algorithms, it is reasonable to examine whether other algorithms are also suitable for predicting TBM disc cutter life. As several parameters can affect TBM disc cutter life, it is also important to use data covering different parameters in the prediction algorithms, to identify the parameters that most influence disc cutter life.

This study aims to predict TBM disc cutter life using four ML techniques: Gaussian process regression (GPR), support vector regression (SVR), decision trees (DT), and K-nearest neighbors (KNN). A database including 200 datasets measured during hard rock TBM tunneling of the Alborz service tunnel in Iran is applied in the models. The monitoring data include geological conditions, geometry, and operational parameters of the TBM. The 5-fold cross-validation (CV) method is used to investigate the prediction performance of the models. The field observations are compared to the values of disc cutter life forecast by the four ML algorithms. Finally, the best prediction model has been specified. Also, to determine the parameters that most influence disc cutter life, a backward selection method has been used.

The overall flowchart of the study is presented in Fig. 1.

[Fig. 1 flowchart stages: database (specific energy, quartz content, excavation depth, thrust force, cutter rotation speed, penetration rate, screw rate, grouting pressure, soil pressure, disc cutter life); K-fold CV (K = 5) with training and testing sets; data normalization; AI algorithms (GPR, SVR, DT, KNN); statistical evaluation indices (R², MAE, RMSE, MAPE); results comparison; identification of the best prediction model; feature selection.]

Fig. 1. Overall procedure of TBM disc cutter life prediction using ML techniques.

Despite the merits of various ML approaches, according to the No-Free-Lunch (NFL) theorem, no single ML model is the best method for all engineering problems. Therefore, researchers have tried to evaluate the efficiency of various ML approaches for solving various optimization problems. In line with the NFL theorem, we use four ML models with different features and capabilities: KNN, GPR, SVR, and DT. The key features of these models, which motivated us to use them, are as follows:

• Regression Analysis

Regression analysis is used to predict a continuous target variable from one or multiple independent variables. Typically, regression analysis is used with naturally occurring variables rather than variables that have been manipulated through experimentation. As stated above, there are many different types of regression, so once we have decided that regression analysis should be used, how do we choose which regression technique to apply?

• We chose GPR because:

- GPR directly captures the model uncertainty. For example, in regression, GPR gives a distribution over the prediction value, rather than just one value as the prediction. This uncertainty is not directly captured in neural networks.
- When using GPR, we can add prior knowledge and specifications about the shape of the model by selecting different kernel functions. For example, based on the answers to the following questions, we may choose different priors. Is the model smooth? Is it sparse? Should it be able to change drastically? Should it be differentiable? This capability gives researchers flexible models, which can be fitted to various kinds of datasets.

• We chose SVR because:

SVR is characterized by the use of kernels, a sparse solution, and Vapnik-Chervonenkis (VC) control of the margin and the number of support vectors. One of the main advantages of SVR is that its computational complexity does not depend on the dimensionality of the input space. It requires less computation than other regression techniques. Additionally, it has excellent generalization capability, with high prediction accuracy, and is robust to outliers.

• We chose DT because:

DTs are a type of supervised learning algorithm that repeatedly splits the sample based on certain questions about the sample. They are very useful for prediction problems, relatively easy to understand, and very effective. DTs represent several decisions followed by different chances of occurrence. This technique helps us to identify the most significant variables and the relation between two or more variables. In our problem, we have various variables related to each other, so we select DT as one of the models to compare. In other words, a significant advantage of a decision tree is that it forces the consideration of all possible outcomes of a decision and traces each path to a conclusion. It creates a comprehensive analysis of the consequences along each branch and identifies decision nodes that need further analysis.

Key advantages:

- No preprocessing needed on data.
- No assumptions on the distribution of data.
- Handles collinearity efficiently.
- DT can provide an understandable explanation for the prediction.

• We chose KNN because:


Fig. 2. Project location of the Alborz tunnel.

KNN is a non-parametric method that we used for prediction in this paper. It is one of the simplest ML approaches in recent use. It is a lazy learning model with local approximation. We use this model considering the following points:

Key advantages:

- Easy and simple machine learning model.
- Few hyperparameters to tune.

Hyperparameters:

- KNN mainly involves two hyperparameters, the K value and the distance function.
- K value: how many neighbors participate in the KNN algorithm. K should be tuned based on the validation error.
- Distance function: Euclidean distance is the most commonly used similarity function. Manhattan distance, Hamming distance, and Minkowski distance are alternatives.

Assumptions:

- There should be a clear understanding of the input domain.
- Feasibly moderate sample size (due to space and time constraints).
- Collinearity and outliers should be treated before training.

Comparison with other models:

Fig. 3. Schematic of Alborz twin tunnel [29].


- The general difference between KNN and other models is the large real-time computation needed by KNN.

Fig. 4. Geological map of Alborz service tunnel.

Fig. 5. Open gripper hard rock TBM used for the excavation of Alborz service tunnel.

Fig. 6. Structure of a TBM disc cutter [31].

2. Case description

The Alborz service tunnel on the Tehran–Shomal motorway project in Iran is the source of the database used in this study. The Tehran–Shomal motorway is a new motorway, 121 km long, that connects the capital Tehran to the city of Chalus on the Caspian Sea in the north. At present, traffic crosses the Alborz mountains on narrow highways, and the journey takes 5–6 h. Once the project is finished, travel time can be shortened to under two hours, even with a higher average traffic volume. There are more than 30 twin tunnels with dual lanes in the motorway alignment. With a length of 6400 m and an altitude of 2400 m, the Alborz tunnel would be the longest. The location of the Alborz service tunnel is shown in Fig. 2.

A service tunnel is situated between the existing main tunnel tubes. Its primary role is site investigation, drainage, and utility access for the main tunnels. A schematic of the Alborz twin tunnel is shown in Fig. 3. Currently, the construction of the Alborz tunnel is completed. The tunnel entrance is considered the northern mouth (to the Shomal-S), and its outlet the southern mouth (to the Tehran-T). The lithology of the tunnel route is mainly composed of tuffs, andesite, anhydrite, limestone, and sandstone. The compressive strength of the rocks along the tunnel route varies from 20 to 120 MPa. The longest fault is located at ST5339–5361, where water inflow into the tunnel is high and creates conditions for squeezing of the rock along the tunnel route. The longitudinal geological map of the Alborz service tunnel is shown in Fig. 4.

An open gripper hard rock TBM manufactured by Wirth, with a diameter of 5.2 m (Fig. 5), was used to excavate the Alborz service tunnel at a constant positive gradient (~1%). The maximum overburden occurs across the length of 850 m. The first excavation step took place on 06 Sep 2004 during the erection and commissioning of the TBM. Productive excavation started on 06 Feb 2005 at TM 122. The break-through into the S-portal heading was celebrated at TM 6073 on 03 Feb 2009, after 48 months. Excavation was carried out on 919 days (63%) out of 1459 days; thus, an average advance of 6.48 m per excavation day was observed. The maximum advance was 30.47 m per day, 110.96 m per week, and 389.43 m per month.
shown in Fig. 3. Currently, the construction of the Alborz tunnel is


Fig. 7. Different forms of TBM disc cutter wear. (a) Normal wear, (b) edge curling, (c) cutter ring partial wear, (d) cutter ring fracture, (e) cutter ring crack, (f) seal
and bearing failure [30].

Fig. 8. (a) Number of disc cutters replacement for all disc cutter positions on the cutterhead, (b) Pie chart of each normal and abnormal wear form.

3. Analysis of TBM disc cutter wear and disc cutter life

A TBM disc cutter is made up of various components, each of which has a specific task. The structure of a TBM disc cutter is shown in Fig. 6. TBM disc cutters wear normally or abnormally. In normal wear, the entire disc cutter ring wears out almost evenly. Abnormal wear, on the other hand, can take the form of edge curling, partial ring wear, ring fracture, ring crack, and seal and bearing failure [28,30]. The different forms of TBM disc cutter wear are shown in Fig. 7. If abnormal wear is observed on a disc, the disc needs to be replaced immediately.

In the Alborz service tunnel, the disc cutter positions on the cutterhead were numbered so that the influence of disc cutter position on cutter life can be determined, as in Fig. 5. During the excavation of the Alborz service tunnel, 214 disc cutters were replaced. In Fig. 8(a), the overall number of disc cutter changes in the Alborz service tunnel is presented for all disc cutter positions. As shown in Fig. 8(b), 66.35% of disc cutter wear is reported as normal and the rest (33.65%) as abnormal. According to Fig. 8(a), the farther a disc is from the center of the cutterhead toward the edges, the more it wears and the more often it is replaced.

A disc cutter's life is the amount of time it can be used before it needs to be replaced. Following previous publications [28], Eqs. (1)–(3) are employed as three methods to predict disc cutter life.

$$H_m = \frac{L}{N_{TBM}} \tag{1}$$

$$W_m = \frac{N_{TBM}}{L} \tag{2}$$

$$H_f = \frac{H_m \, \pi \, d_{TBM}^2}{4} \tag{3}$$
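Eqs. (1)–(3) translate directly into code. The sketch below is illustrative only: it takes L as the excavated tunnel length in m, NTBM as the number of disc cutters changed, and dTBM as the tunnel diameter in m. The function name and the input values are hypothetical and are not data from the Alborz project.

```python
import math


def cutter_life_indices(length_m, cutters_changed, diameter_m):
    """Evaluate Eqs. (1)-(3) for one set of inputs.

    length_m        -- L, tunnel length excavated (m)
    cutters_changed -- N_TBM, number of disc cutters changed
    diameter_m      -- d_TBM, tunnel diameter (m)
    """
    h_m = length_m / cutters_changed              # Eq. (1), m/cutter
    w_m = cutters_changed / length_m              # Eq. (2), cutter/m
    h_f = h_m * math.pi * diameter_m ** 2 / 4.0   # Eq. (3), m^3/cutter
    return h_m, w_m, h_f


# Hypothetical inputs chosen only to exercise the formulas.
h_m, w_m, h_f = cutter_life_indices(length_m=100.0, cutters_changed=4, diameter_m=2.0)
print(h_m, w_m, round(h_f, 2))
```

Note that Hm and Wm are reciprocals of each other, and Hf is Hm multiplied by the tunnel face area, which is why Hf carries units of volume per cutter.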


Fig. 9. Data distribution and correlation of the TF, RPM, SR, PR, GP, SP, SE, Qc, and H parameters to the output parameter of Hf.

Table 1
A brief review on the database used in this study.
TF [kN] RPM [rev/min] SR [rev/min] PR [mm/rev] GP [kPa] SP [kPa] SE [kWh/m3] Qc [%] H [m] Hf [m3/cutter]

count 200 200 200 200 200 200 200 200 200 200
mean 30,940 1.6040 10.2680 28.1970 327.450 190.900 3.9733 9.6102 395.404 1896.410
std 6461 0.1961 4.6916 7.2329 63.3784 39.1260 1.4052 6.7656 97.8027 308.4626
min 19,000 1.2000 2.6000 14.000 210.000 120.000 1.3700 0.0000 96.6000 790.0000
25% 25,950 1.5000 5.5000 22.000 270.000 160.000 2.9975 4.1750 325.450 1759.500
50% 30,700 1.6000 11.000 28.000 330.000 185.000 3.7200 7.2000 381.800 1934.000
75% 35,250 1.8000 14.200 33.000 370.000 220.000 4.8425 13.775 433.550 2054.750
max 45,600 1.9000 19.600 45.000 470.000 280.000 8.4600 26.300 811.900 2850.000

where Hm is the average length of tunnel bored per cutter in m/cutter, Hf is the volume of rock excavated per cutter change in m³/cutter, Wm is the number of cutters changed per rolling distance of excavated soil in cutter/m, NTBM is the total number of disc cutters changed, L is the tunnel length excavated for each full dressing of the head, and dTBM is the tunnel diameter.

Among the three equations above, Hf has been identified as the most appropriate parameter in several projects for estimating TBM disc cutter life [7,28]. Therefore, Hf is used in this study to predict the disc cutter life of the TBM used in the Alborz service tunnel.

4. Database

In this study, a database including 200 datasets obtained during the


Alborz service tunnel construction is employed. Three primary sources provided the database. First, a built-in data acquisition system in the TBM provided the TBM operational parameters. The remaining sources, geological maps and engineering and geotechnical reports based on borehole logs and visible exposures of the rock bed, provided the geological conditions and soil geometry.

In Fig. 9, the correlation between each input and the output is shown. As shown in Fig. 9, there is no strong correlation between any input and the output. This shows that the output value cannot be obtained from any one of the parameters alone, so it is necessary to consider all the parameters affecting the model's output simultaneously.

Table 1 provides a brief review of the data used. The observed data include geological conditions and operational parameters of the TBM: thrust force (TF), excavation depth (H), soil pressure (SP), cutter rotation speed (RPM), disc cutter life (Hf), grouting pressure (GP), quartz content (Qc), penetration rate (PR), specific energy (SE), and screw rate (SR).

In order to employ the database in the prediction models, K-fold CV (K = 5) was used to divide the datasets into training and testing groups. To obtain robust outcomes, the datasets were randomly split into K equal-sized sub-samples; in each round, K − 1 sub-samples were used to train the model and the remaining sub-sample was used to test it.

5. Statistical evaluation indices

To evaluate the accuracy of the forecasting models, statistical evaluation indices including the coefficient of determination (R²), root mean square error (RMSE), and mean absolute percentage error (MAPE) are taken into account. The formulas for calculating these indices are presented below.

$$R^2 = 1 - \frac{\text{sum squared regression (SSR)}}{\text{sum of squares total (SST)}} \tag{4}$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - y_i'}{y_i}\right| \times 100\% \tag{5}$$

$$\mathrm{RMSE} = \sqrt{\left(\frac{1}{n}\right)\sum_{i=1}^{n}\left(y_i - y_i'\right)^2} \tag{6}$$

where $y_i$ is the actual value, $y_i'$ is the predicted value, $\bar{y}_i$ and $\bar{y}_i'$ are the means of the actual and predicted values, and $n$ is the number of samples.

6. Prediction models of disc cutter life and results

6.1. GPR

A Gaussian process (GP) is a set $F$ of random variables $F_{x_1}, F_{x_2}, \ldots$ for which any finite subset of the variables has a joint multivariate Gaussian distribution. The variables are indexed by elements $x$ of a set $X$. For any finite-length vector of indices $x = [x_1, x_2, \ldots, x_n]^T$, there is a corresponding vector $F_x = [F_{x_1}, F_{x_2}, \ldots, F_{x_n}]^T$ of variables that has a multivariate Gaussian (or normal) distribution [21],

$$F_x \sim \mathcal{N}\{\mu(x),\, k(x, x)\} \tag{7}$$

where $\mu(x)$ is given by a mean function $\mu(x_i)$, and $k$ is the kernel function. The kernel takes two indices $x_i$ and $x_j$ and gives the covariance between their corresponding variables $F_{x_i}$ and $F_{x_j}$. Given vectors of indices $x_i$ and $x_j$, $k$ returns the matrix of covariances between all pairs of variables where the first in the pair comes from $F_{x_i}$ and the second from $F_{x_j}$. Note that each $F_{x_i}$ is marginally Gaussian, with mean $\mu(x_i)$ and variance $k(x_i, x_i)$ [22].

Assume there is a function $f(x)$ to be optimized. Suppose further that $f$ cannot be observed directly, but that a random variable $F_x$ can be observed that is indexed by the same domain as $f$ and whose expected value is $f$, i.e., $\forall x \in X,\ E[F_x] = f(x)$. Prior beliefs about the function $f$ are assumed to follow a Gaussian process with prior mean $\mu$ and kernel $k$. Assume that $F_x$ is an observation of $f(x)$ that has been corrupted by zero-mean, i.i.d. Gaussian noise, i.e., $F_x = f(x) + \epsilon$, where $\epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2)$. Then $f(x)$ is a hidden variable whose posterior distribution can be inferred after observing samples of $F_x$ at various locations in the domain. The resulting inference is called Gaussian process regression [23].

Let $x$ be the set of observation points and $F_x$ the resulting real-valued observations. The posterior distribution at some new point $\hat{x} \in X$ needs to be computed. This distribution will be Gaussian, with mean and variance

$$\mu(\hat{x} \mid x) = \mu(\hat{x}) + k(\hat{x}, x)\, k(x, x)^{-1} \left(F_x - \mu(x)\right) \tag{8}$$

$$\sigma^2(\hat{x} \mid x) = k(\hat{x}, \hat{x}) - k(\hat{x}, x)\, k(x, x)^{-1}\, k(x, \hat{x}) \tag{9}$$

Note that the posterior computation inverts the kernel matrix of the observed domain points; this inverse can be computed once and used to evaluate the posterior at many points in the domain. Since the goal is to find an optimum of the unknown function, the final step is to compute the optimum of the resulting posterior mean, $x_R = \arg\max_{\hat{x} \in X} \mu(\hat{x} \mid x)$. This cannot be computed in closed form and requires some function optimization method. Although not optimal, another strategy is to return the $x_i$ from $x$ with the largest observed value $F_{x_i}$. This has the practical side effect that the method will not return a point that has never been evaluated, which offers some protection against an inaccurate prior.

In this work, the Regression Learner app of Matlab 2019 was employed to predict disc cutter life using the GPR method. The app was used to test four GPR kernels separately: exponential, squared exponential, rational quadratic, and Matern 5/2. At the end of the process, the most accurate model was selected. Each of the four models has a wide range of hyper-parameters, whose values in turn determine the models' performance. Therefore, the optimization mode of the Matlab app was activated to obtain the outcomes from the models. Table 2 shows the selected parameters of the best-performing GPR model adopted in this study.

Table 2
The optimized hyper-parameters of the GPR model.
Parameter Value or type

Kernel Function ‘Matern 5/2’
Basis Function ‘Constant’
Beta 1818.1
Sigma 3.1458
Fit Method Exact Gaussian process regression

Table 3
The optimized hyper-parameters of the SVR model.
Parameter Value or type

Kernel Function ‘Medium Gaussian’
Epsilon 21.9681
Solver ‘SMO’
Bias 1835.1

Three primary hyperparameter tuning approaches have been proposed in the literature, including random search, grid search, and metaheuristic-


Fig. 10. Comparison of the disc cutters life predicted by the GPR model with the actual ones.

based search (smart search) [32,33], although there are also other, less popular approaches [34,35]. In contrast to the “dumb” alternatives of grid search and random search, metaheuristic-based hyperparameter tuning, including particle swarm optimization (PSO), is much less parallelizable. Instead of producing all the candidate points up front and evaluating the batch in parallel, metaheuristic-based tuning approaches pick a few hyperparameter settings, evaluate their performance, and then determine where to sample next. These methods are intrinsically iterative and sequential, and therefore not parallelizable. On the one hand, making fewer evaluations and reducing the total time complexity is the primary goal of any computational algorithm. On the other hand, metaheuristic-based search algorithms require computation time to decide where to place the next set of samples. Besides, metaheuristic-based search algorithms also have parameters of their own that need to be tuned. These shortcomings motivated us to choose random search, which has the least time and space complexity.
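To make the contrast with sequential tuners concrete, the following is a minimal sketch of random search in which every candidate is drawn up front and scored independently. It is not the Matlab procedure used in the study: the function names, the exponential sampling scales, and the synthetic error surface are all assumptions chosen for illustration.

```python
import random


def random_search(objective, sample_candidate, n_trials=60, seed=0):
    """Draw n_trials candidates up front, score each one, return the best.

    objective        -- maps a hyperparameter dict to a validation error (lower is better)
    sample_candidate -- draws one hyperparameter dict from a random.Random instance
    """
    rng = random.Random(seed)
    # All candidates exist before any is evaluated, so the scoring step
    # could be distributed across workers without coordination.
    candidates = [sample_candidate(rng) for _ in range(n_trials)]
    scored = [(objective(c), c) for c in candidates]
    best_error, best_candidate = min(scored, key=lambda pair: pair[0])
    return best_candidate, best_error


# Hypothetical stand-ins: two positive hyperparameters drawn from exponential
# distributions, and a synthetic error surface with its minimum at
# sigma = 3, epsilon = 20. A real run would fit and validate a model here.
def sample_candidate(rng):
    return {"sigma": rng.expovariate(1 / 3.0), "epsilon": rng.expovariate(1 / 20.0)}


def toy_validation_error(params):
    return (params["sigma"] - 3.0) ** 2 + ((params["epsilon"] - 20.0) / 10.0) ** 2


best, err = random_search(toy_validation_error, sample_candidate)
print(best, err)
```

Because no candidate depends on the result of another, the only sequential step is the final `min`, which is the property that metaheuristic tuners give up in exchange for guided sampling.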
Fig. 11. Disc cutters life predicted by the GPR model vs. the actual mode.

Bergstra's result [32] states that “if the close-to-optimal region of hyperparameters occupies at least 5% of the grid surface, then random search with 60 trials will find that region with high probability”. The reason is that the probability of all 60 independent trials missing a region covering 5% of the search space is 0.95^60 ≈ 0.046, so at least one trial lands in it with probability of about 95%. On this basis, the hyperparameters of the GPR and SVR methods, shown in Tables 2 and 3, were optimized with a random search approach as follows: the values of each hyperparameter are modeled with an exponential probability density function, which produces values for each hyperparameter to be evaluated according to the models' performance in the validation set. After 60 iterations with differently

Fig. 12. Comparison of the disc cutters life predicted by the SVR model with the actual ones.


Table 4
The optimal hyper-parameters considered in the DT
method.
Parameter Value or type

PredictorSelection ‘allsplits’
SplitCriterion ‘mse’
Prune ‘on’
MaxNumSplits 199
MinLeafSize 4
MinParentSize 10

produced values from the probability functions, the hyperparameters that generated the highest accuracy are selected and stored. The optimal hyperparameters are chosen based on the median of the hyperparameters' values calculated for each fold. Finally, the 5-fold cross-validation is repeated to guarantee that all test instances are evaluated with the optimal hyperparameters, which leads to more stable and practical results.

The 5-fold CV results of disc cutter life by the GPR model are shown in Fig. 10. As can be seen from Fig. 10, the disc cutters' actual life is very close to the values predicted by the GPR model. According to Fig. 10, the prediction error for most disc cutters is less than 100 m³/cutter, and the mean absolute error is about 72.1 m³/cutter. Such an error in predicting a disc cutter's life is small and indicates the GPR model's good prediction accuracy. In further support of these results, the other statistical evaluation indices R², RMSE, and MAPE are 0.8866 (Fig. 11), 107.3554, and 4.018355%, respectively. All the results confirm the high prediction accuracy of the GPR model.

6.2. SVR

SVR preserves all of the critical features of the standard Support Vector Machine (SVM) algorithm. The model is generated through classification with SVM. Consequently, SVM concepts for classification are similarly used for SVR, but a few minor variations allow the algorithm to be used as an efficient tool for estimating real-valued functions. SVR provides the flexibility to specify how much error can be tolerated and finds an appropriate line or hyperplane for fitting the data in higher dimensions. It is also characterized by controlling the margin and the number of support vectors using a sparse solution, kernels, and Vapnik-Chervonenkis (VC) theory. Although SVR is not as widely used as SVM, it has been applied in many research fields, including but not limited to control systems, bioinformatics, electric loads and consumption, customer demand, finance, tourism demand, air quality, market prices, and flood control.

Grid search with cross-validation is probably the most common approach to tuning SVR models. However, we must pay attention to the time ordering of our data. For example, a sliding-window cross-validation accounts for data ordering, whereas standard cross-validation does not, which might be inappropriate for our data.

Coming back to SVR parameters, given that we typically need to tune several parameters, the overall number of candidate models in grid search can easily become quite large. For example, with ten candidate values for each of three parameters, we need to test 10 × 10 × 10 candidate SVR models. Add cross-validation on top of that, and the number of models to build and assess becomes very large. A number of papers suggest evolutionary algorithms or, more generally, methods from the heuristic search family as a more elaborate and potentially more efficient alternative to grid search.

Predicted values of disc cutter life by the SVR model were generated using Matlab 2019. Various SVR models can be built in the app, including linear, cubic, fine Gaussian, medium Gaussian, coarse Gaussian, and quadratic. These models were employed to predict the cutter life, and their hyper-parameters were tuned using the optimization mode in Matlab 2019. Table 3 shows the selected parameters of the best-performing SVR model adopted in this study.

Fig. 12 shows the predictions for the life of all disc cutters by the SVR model. Compared to the actual life of the disc cutters, the SVR model predictions have acceptable accuracy. As Fig. 12 shows, the prediction errors are often less than 150 m³/cutter, and the mean absolute error is 102 m³/cutter. Compared to a disc cutter's average life, this error can be considered acceptable. The other statistical evaluation indices R², RMSE, and MAPE are 0.7889 (Fig. 13), 159.0733, and 5.632262%, respectively. These results suggest that the SVR model can be considered a reasonably appropriate model for predicting TBM disc cutter life.

Fig. 13. Disc cutters life predicted by the SVR model vs. the actual mode.

6.3. DT

DT is one of the classification and regression methods based on the non-parametric supervised learning technique. It consists of a set of if-then-else decision rules. The best prediction of the model occurs when the DT goes deeper and deeper to make the best fit with the actual data. There are several advantages of the DT. First, no assumption is required about the distribution of the explanatory variables. Second, strong relations among independent variables do not affect the DT outcomes. Third, DT can handle various types of dependent variable, such as survival, categorical, and numerical data. Fourth, the technique retains the most powerful variables and eliminates the least powerful variables for describing the dependent variable. DT can predict both small and large datasets well, even though the technique was initially developed for large datasets only [14].

The algorithm of DT can be explained as follows:

1. First, the variance of the target is calculated.
2. Based on the various attributes, the database is divided into distinct parts, and the variance of each part is subtracted from the variance before the division. This is defined as the variance reduction.


Fig. 14. Comparison of the disc cutters life predicted by the DT model with the actual ones.

Fig. 15. Disc cutter life predicted by the DT model vs. the actual values.

Node N can be defined by the variance reduction as:

$$I_V(N) = \frac{1}{|S|^2}\sum_{i\in S}\sum_{j\in S}\frac{1}{2}\left(x_i - x_j\right)^2 - \left(\frac{1}{|S_f|^2}\sum_{i\in S_f}\sum_{j\in S_f}\frac{1}{2}\left(x_i - x_j\right)^2 + \frac{1}{|S_t|^2}\sum_{i\in S_t}\sum_{j\in S_t}\frac{1}{2}\left(x_i - x_j\right)^2\right) \tag{10}$$

where S is the set of samples before the split, S_t is the set of samples for which the split test is true, and S_f is the set of samples for which it is false. Each summand in Eq. (10) is a variance estimate, written in a form that does not refer directly to the mean.
The attribute with the highest variance reduction (VR) is selected as the decision node.

3. The dataset is split according to the values of the selected attribute. If the variance of a part is greater than zero, that part is split again.
4. The process is repeated until all the data have been evaluated.

The DT predictions of disc cutter life were generated using MATLAB 2019. Various DT models can be built in the app, including fine, medium, and coarse trees. Among these, the model giving the best predictions of cutter life was chosen for this study. To this end, the parameters of each model were optimized using the optimization mode in MATLAB 2019. Table 4 shows the selected parameters of the most powerful DT model considered in this study.
In Fig. 14, the results predicted by the DT model are compared with the actual values. As shown in Fig. 14, the difference between the predicted and measured values is somewhat large. The mean absolute error of the predictions is 186 m³/cutter. Such an error in predicting a disc cutter’s life is not excessive, but it is not fully acceptable either. The other statistical evaluation indices, R² = 0.4005 (Fig. 15), RMSE = 246.9794, and MAPE = 10.47571%, also indicate insufficient accuracy in the DT model predictions.
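The variance-reduction criterion of Eq. (10) can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in the study: the function names and the toy data are hypothetical, and the child variances are left unweighted, exactly as Eq. (10) is printed (many tree implementations weight each child by its share of the samples).

```python
from itertools import product


def pairwise_variance(values):
    """(1/|S|^2) * sum_i sum_j 0.5 * (x_i - x_j)^2: a variance estimate that never uses the mean."""
    n = len(values)
    if n == 0:
        return 0.0
    return sum(0.5 * (a - b) ** 2 for a, b in product(values, repeat=2)) / n**2


def variance_reduction(parent, true_part, false_part):
    """Eq. (10) as printed: parent variance minus the two (unweighted) child variances."""
    return pairwise_variance(parent) - (
        pairwise_variance(false_part) + pairwise_variance(true_part)
    )


def best_threshold(feature, target):
    """Scan midpoints between sorted feature values; return (threshold, VR) with the highest VR."""
    levels = sorted(set(feature))
    best = None
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2
        true_part = [y for x, y in zip(feature, target) if x <= threshold]
        false_part = [y for x, y in zip(feature, target) if x > threshold]
        vr = variance_reduction(target, true_part, false_part)
        if best is None or vr > best[1]:
            best = (threshold, vr)
    return best


# Hypothetical toy data: one input feature and a target with two clear regimes.
feature = [1, 2, 3, 10, 11, 12]
target = [100, 110, 105, 300, 310, 305]
threshold, vr = best_threshold(feature, target)
print(threshold, round(vr, 2))
```

On this toy data the split lands between the two regimes, which is the behaviour steps 2 and 3 of the algorithm rely on: the attribute value with the highest VR becomes the decision node, and each part is split again while its variance is above zero.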

Fig. 16. Comparison of the disc cutters life predicted by the KNN model with the actual ones.


Fig. 17. Disc cutter life predicted by the KNN model vs. the actual values.

6.4. KNN

KNN is a machine learning algorithm that can be used for classification or regression. The algorithm is simple to understand and implement; its primary shortcoming is that processing becomes slower as the amount of data grows.
KNN operates on the distances between a query and all the data. The K samples nearest to the query are selected, and the prediction is the most frequent label among them for classification or the average of their labels for regression. In both cases, the right K is chosen by running the KNN algorithm several times with different K values and keeping the one that gives the minimum error [16].
In the current work, KNN was applied using Anaconda version 3.6. Considering the time complexity of grid search, we drew on previous research [36–38] to select candidate values for K and then examined these candidates to choose the best one. The values 2, 3, 4, 7, and 9 were examined, and K = 4 gave the best prediction of disc cutter life. The fine-tuning of KNN’s parameters and hyperparameters is nevertheless a worthwhile topic for future work, and we propose it as one of the future research directions in the conclusion section.
For the KNN model, as shown in Fig. 16, the difference between the predicted and actual values is large. The prediction error is high for most disc cutters, with a mean absolute error of 207 m³/cutter. The other statistical evaluation indices, R² = 0.1753 (Fig. 17), RMSE = 288.9277, and MAPE = 11.4697%, also show the low accuracy of the KNN model in predicting TBM disc cutter life. Therefore, for the database employed in this paper, the KNN model is not suitable for predicting TBM disc cutter life.
To determine the best of the four prediction models used in this paper, their results are compared in Fig. 18 and Table 5. By analyzing and comparing the statistical evaluation indices obtained for each model, it can be concluded that the highest and lowest accuracy are provided by the GPR and KNN models, respectively. Therefore, the most acceptable results, which are not very different from the actual ones, are provided by the GPR model. After the GPR model, the highest prediction accuracy is achieved by the SVR model.

Table 5
Comparison among the results produced by the ML models.
Prediction model   R²       RMSE       MAPE [%]
GPR                0.8866   107.3554   4.018355
SVR                0.7889   159.0733   5.632262
DT                 0.4005   246.9794   10.47571
KNN                0.1753   288.9277   11.46970

Fig. 18. Comparison between the R² of the prediction models.

7. Discussion

Three primary hyperparameter tuning approaches have been proposed in the literature: random search, grid search, and metaheuristic-based search (smart search), although other less popular approaches also exist. In contrast to the “dumb” alternatives of grid search and random search, metaheuristic-based hyperparameter tuning, including particle swarm optimization (PSO), is much less parallelizable. Instead of producing all the candidate points up front and evaluating the batch in parallel, metaheuristic-based tuning approaches pick a few hyperparameter settings, evaluate their performance, and then determine where to sample next. These methods are intrinsically iterative and sequential, and therefore not parallelizable. On the one hand, making fewer evaluations and reducing the total time complexity are primary goals of any computational algorithm. On the other hand, metaheuristic-based search algorithms require computation time to determine where to place the next set of samples. Besides, they also contain parameters of their own that need to be tuned. These shortcomings motivated us to choose random search, which has the lowest time and space complexity.
In this article, considering the Bergstra theorem, which states that “if the close-to-optimal region of hyperparameters occupies at least 5% of the grid surface, then random search with 60 trials will find that region with high probability”, the hyperparameters of the GPR and SVR methods were optimized with a random search approach as follows. The values of each hyperparameter are modeled with an exponential probability density function, which produces the values to be evaluated according to the models’ performance on the validation set. After 60 iterations with values drawn from the probability functions, the hyperparameters that gave the highest accuracy are selected and stored. The optimal hyperparameters are chosen as the median of the hyperparameter values calculated for each fold. Finally, the 5-fold cross-validation was repeated to guarantee that all test instances are evaluated with the optimal hyperparameters, which leads to more stable and practical results.
The performance of the proposed model was investigated with datasets of different sizes. The dataset sizes and the model’s performance are tabulated in Table 6.

Table 6
R² and RMSE of the simulated outputs.
Predictive model   Training datasets   Datasets for simulation   R²       RMSE
Model I            100                 40                        0.6944   183.9310
Model II           130                 40                        0.7583   146.7382
Model III          160                 40                        0.8866   107.3554

Both GPR and SVR are memory-based methods that store part or all of the training data for testing. Therefore, their training is generally fast and they can improve the efficiency of the massive-training methodology. GPR approaches nonlinear regression from a Bayesian perspective. The Bayesian paradigm provides probabilistic modeling of nonlinear regression. The Bayesian approach to regression specifies an a priori probability for the parameters to be estimated and computes the maximum a posteriori probability given the observed data samples. In contrast to non-Bayesian schemes, in which some criterion typically selects a single parameter value, the Bayesian probabilistic model produces both the optimal estimated function and the covariance associated with the estimate. Therefore, the Bayesian paradigm offers more information on the estimated parameters than the non-Bayesian methodology does. On the other hand, rooted in a maximum-margin property, SVR offers excellent generalization ability and robustness to outliers. Both SVR and GPR are kernel-based nonlinear regression techniques. A kernel or covariance function is used to implicitly transform the original input data into a high-dimensional reproducing kernel Hilbert space. Therefore, both SVR and GPR, as state-of-the-art nonlinear regression models, can offer performance comparable or potentially superior to other ML models.
Although DT does not require normalization or rescaling of the data, a small change in the data can cause a large change in the structure of the decision tree, which makes it unstable.
KNN can be very sensitive to the scale of the data because it relies on computing distances. The calculated distances can be very large for features with a larger scale, which might produce poor results. Herein, although the data have been normalized, KNN provides poor results because of the large gap between the scales of the features.
The normalization technique was applied in this study. Since the data parameters have a wide range of values, a feature with a large range will govern the computed distance. For this reason, the ranges of all features should be normalized (scaled) so that every feature takes values in the same range.
In this study, the generalization of the suggested Gaussian process regression methodology is discussed. Generalization characterizes a model’s ability to adapt to new information, that is, to take in novel data not used during training and still predict accurately. A model’s success and practical performance depend on its capability to generalize. When a model is fitted too closely to the training data, it cannot generalize: given new data, its predictions are inaccurate and worthless even though it predicts the training data accurately. The model starts ‘memorizing’ the training data instead of ‘learning’; this is known as overfitting.
Feature selection can be used to avoid overfitting. In this case, feature selection minimizes the number of features, which decreases the computational complexity of the model. A stepwise approach is used to choose an important subset of features from the full set of features available in the data sets.
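The random-search tuning described in this section (60 trials, 5-fold cross-validation, hyperparameter values drawn from an exponential density) can be sketched in plain Python. This is only an illustration under stated assumptions: a simple Gaussian kernel smoother stands in for the GPR and SVR models, its single bandwidth is the only hyperparameter, the data are synthetic, and the best trial is kept by mean cross-validated RMSE rather than by the per-fold median described above. All function names are hypothetical.

```python
import math
import random


def kernel_predict(train, x, bandwidth):
    """Gaussian-weighted average of the training targets (Nadaraya-Watson estimate)."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi, _ in train]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed: fall back to the plain mean
        return sum(yi for _, yi in train) / len(train)
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / total


def cv_rmse(data, bandwidth, k=5):
    """Average RMSE over k folds; fold i holds out samples i, i+k, i+2k, ..."""
    fold_scores = []
    for i in range(k):
        test = data[i::k]
        train = [pair for j, pair in enumerate(data) if j % k != i]
        squared = [(kernel_predict(train, x, bandwidth) - y) ** 2 for x, y in test]
        fold_scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(fold_scores) / k


def random_search(data, n_trials=60, rate=1.0, seed=0):
    """Draw n_trials bandwidths from an exponential density; keep the lowest CV error."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        bandwidth = max(rng.expovariate(rate), 1e-6)  # guard against a zero draw
        score = cv_rmse(data, bandwidth)
        if best is None or score < best[1]:
            best = (bandwidth, score)
    return best


# Synthetic data, unrelated to the tunnel measurements: y = sin(x) on a regular grid.
data = [(i / 10, math.sin(i / 10)) for i in range(50)]
best_bandwidth, best_rmse = random_search(data)
print(best_bandwidth, best_rmse)
```

Because every trial is independent of the others, the 60 evaluations could be run in parallel, which is the property that separates random search from the sequential metaheuristic approaches discussed above.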

Table 7
First step of feature selection.
Predictor     Estimate      SE          tStat       p-value     Significance code
(Intercept)   3520.6        221.73      15.878      1.09E-36    ***
TF            -0.015864     0.0027342   -5.8021     2.70E-08    ***
RPM           -801.33       88.018      -9.1042     1.20E-16    ***
SR            0.28762       3.741       0.076883    0.9388
PR            2.7351        2.4645      -8.1098     2.6849E-9   ***
GP            0.28228       0.27597     1.0229      0.30767
SP            0.15941       0.45069     0.3537      0.72395
SE            14.519        12.185      1.1916      0.23492
Qc            -11.177       2.5801      -4.3321     2.39E-05    ***
H             -0.0031946    0.1774      -0.018008   0.98565
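The idea behind the feature-selection steps reported in Table 7, starting from all predictors and discarding those that do not help, can be sketched as a backward elimination driven by the Akaike information criterion (AIC). The sketch below is a hand-written Python stand-in, not the R routine used in the study. It assumes an ordinary least-squares model with intercept and the Gaussian form AIC = n·ln(RSS/n) + 2p (up to an additive constant); the data and the predictor names x1, x2, x3 are synthetic and hypothetical.

```python
import math
from itertools import product


def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def aic(rows, y, cols):
    """AIC (up to a constant) of a least-squares fit with intercept on the chosen columns."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(X, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * p


def backward_select(rows, y, names):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    kept = list(range(len(names)))
    current = aic(rows, y, kept)
    while kept:
        trials = [(aic(rows, y, [c for c in kept if c != d]), d) for d in kept]
        best_aic, drop = min(trials)
        if best_aic >= current:
            break
        kept.remove(drop)
        current = best_aic
    return [names[c] for c in kept]


# Synthetic two-level design: y depends on x1 and x2 only; x3 carries no signal.
rows = [list(levels) for levels in product([-1.0, 1.0], repeat=3)]
y = [3 + 2 * a - 1.5 * b + 0.1 * a * b * c for a, b, c in rows]
selected = backward_select(rows, y, ["x1", "x2", "x3"])
print(selected)
```

In this constructed example the uninformative predictor is removed first, and the search stops once every remaining removal would raise the AIC.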


Table 8
The second step of feature selection.
Predictor     Estimate     SE          tStat      p-value      Significance code
(Intercept)   3717         171.16      21.717     6.02E-54     ***
TF            -0.016222    0.0026838   -6.0442    7.50E-09     ***
RPM           -803.29      86.075      -9.3325    2.35E-17     ***
PR            2.7451       2.4229      -8.133     2.5861E-11   ***
Qc            -11.202      2.5344      -4.4199    1.64E-05     ***

Stepwise regression methods can be classified into three strategies:

• The first strategy (forward selection) begins without any predictors in the model and iteratively adds predictors. It stops when further additions no longer improve the results to a statistically significant degree.
• The second strategy (backward selection) begins with all predictors in the model and iteratively eliminates the least contributive predictors. It stops when all predictors remaining in the model are statistically significant.
• The third strategy (stepwise selection) is a mixture of the forward and backward processes. It begins without any predictors and then sequentially adds the predictors that contribute most to the outcome (like forward selection). After each new variable is added, any variables that no longer improve the model’s fit are removed (like backward selection).

In this research, stepAIC [MASS package] was used, which selects the best model by AIC. The function has an option called direction, which takes the values i) forward (for forward selection), ii) backward (for backward elimination), and iii) both (sequential replacement, combining forward selection and backward elimination). The best final model is returned. In R, stepAIC is among the most popular search methods for feature selection. stepAIC attempts to reduce the AIC value continuously to arrive at the final feature set. In the tables, three asterisks (***) mark a highly significant p-value. A small p-value for the intercept and a predictor allows the null hypothesis to be rejected, which indicates a meaningful relationship between the two measured variables (the target and the predictor). A p-value of about 5% or less is a good cut-off point for most situations.
During the first stage, the model was fitted with all the predictors and the target. The lowest p-values (which must be less than 0.05) belong to TF, RPM, PR, and Qc, as shown in Table 7. These four parameters are therefore chosen, and step 2 is taken. The model now has only four predictors: TF, RPM, PR, and Qc. Table 8 shows that the smallest set of features is {TF, RPM, PR, Qc} and that the parameter with the highest impact on disc cutter life is RPM.
Table 9 provides the feature selection results obtained with the other ML models, SVR, DT, and KNN. As Table 9 shows, the smallest set of features selected by all the models is {TF, RPM, PR, Qc}, and the parameter with the highest impact on disc cutter life is RPM. Tables 8 and 9 show the similarity between the feature selection results of all the models used in this paper.

8. Conclusions

The process of disc cutter wear is complicated, with many influential factors. Cutter life is an important economic index for TBM excavation, and its prediction is of wide concern. This study introduced a method for predicting the life of each TBM cutter based on regression analysis of the cutter-change records from the excavation of the Alborz service tunnel in Iran, using four ML methods. A total of 200 datasets monitored during the Alborz tunnel construction, including geological conditions, geometry, and TBM operational parameters, were used in the models. Two software packages, MATLAB 2019 and Python, were used to analyze the prediction models. To achieve more accurate predictions, the hyperparameters of the ML models were optimized. The 5-fold CV method was applied to investigate the prediction efficiency of the models. The validity of the prediction models was examined by comparing their predicted results with the monitored ones, and reasonable agreement was found.
The results indicated that the GPR model, with R² = 0.8866, RMSE = 107.3554, and MAPE = 4.018355%, was the most accurate model for predicting TBM disc cutter life. The KNN model, with R² = 0.1753, RMSE = 288.9277, and MAPE = 11.46970%, produced the lowest accuracy.
The backward selection method was used to assess the contribution of each parameter to the prediction problem. The results showed that four parameters, TF, RPM, PR, and Qc, contribute significantly to TBM disc cutter life. Of these, RPM and PR were more significant than the others.
It is suggested that, in future work, the models presented here be used to predict disc cutter life in other tunnels using newer data with various input parameters, so that the most accurate algorithms and the parameters with the greatest effect on TBM disc cutter life can be identified. Also, given that a variety of geological and geotechnical issues can be very important to predict, it is suggested that the prediction models presented in this work be applied to them and that their ability to solve these problems be examined.

Table 9
The final step of feature selection made by the other models.
Method   Predictor     Estimate     SE         tStat     p-value    Significance code
SVR      (Intercept)   3.26E+03     9.90E+01   32.898    2.00E-16   ***
SVR      TF            -8.25E+00    1.27E+00   -6.519    5.96E-10   ***
SVR      RPM           -6.63E+02    4.35E+01   -15.22    2.00E-16   ***
SVR      PR            -1.15E-02    1.33E-03   -8.657    1.86E-15   ***
SVR      Qc            1.56E+01     6.10E+00   2.553     0.0114     *
DT       (Intercept)   3.73E+03     1.36E+02   27.482    2.00E-16   ***
DT       TF            -6.38E+00    1.97E+00   -3.242    0.00139    **
DT       RPM           -7.86E+02    6.76E+01   -11.629   2.00E-16   ***
DT       PR            -1.31E-02    2.06E-03   -6.353    1.46E-09   ***
DT       Qc            -2.90E-01    1.36E-01   -2.133    3.42E-02   **
KNN      (Intercept)   1.99E+03     8.40E+01   23.728    2.00E-16   ***
KNN      TF            5.91E-01     1.65E-01   3.588     0.000422   ***
KNN      RPM           -1.16E-02    1.62E-03   -7.155    1.67E-11   ***
KNN      PR            -1.04E+01    1.53E+00   -6.807    1.21E-10   ***
KNN      Qc            2.04E+01     7.38E+00   2.762     6.30E-03   **


Declaration of Competing Interest

There is no conflict of interest.

References

[1] Q. Tan, L. Yi, Y.M. Xia, Performance prediction of TBM disc cutting on marble rock under different load cases, KSCE J. Civ. Eng. 22 (2018) 1466–1472, https://doi.org/10.1007/s12205-017-1048-1.
[2] P. Zhou, J.J. Guo, J. Sun, D.F. Zou, Theoretical research and simulation analysis on the cutter spacing of double disc cutters breaking rock, KSCE J. Civ. Eng. 23 (2019) 3218–3227, https://doi.org/10.1007/s12205-019-1777-4.
[3] R. Wang, Y. Wang, J. Li, L. Jing, G. Zhao, L. Nie, A TBM cutter life prediction method based on rock mass classification, KSCE J. Civ. Eng. 24 (2020) 2794–2807, https://doi.org/10.1007/s12205-020-1511-2.
[4] Z. Zhang, M. Aqeel, C. Li, F. Sun, Theoretical prediction of wear of disc cutters in tunnel boring machine and its application, J. Rock Mech. Geotech. Eng. 11 (2019) 111–120, https://doi.org/10.1016/j.jrmge.2018.05.006.
[5] R. Plinninger, H.K. Asling, K. Thuro, G. Spaun, Testing conditions and geomechanical properties influencing the CERCHAR abrasiveness index (CAI) value, Int. J. Rock Mech. Min. Sci. 40 (2003) 259–263, https://doi.org/10.1016/S1365-1609(02)00140-5.
[6] T.N. Michalakopoulos, V.G. Anagnostou, M.E. Bassanou, G.N. Panagiotou, The influence of steel styli hardness on the Cerchar abrasiveness index value, Int. J. Rock Mech. Min. Sci. 43 (2006) 321–327, https://doi.org/10.1016/j.ijrmms.2005.06.009.
[7] J. Hassanpour, Development of an empirical model to estimate disc cutter wear for sedimentary and low to medium grade metamorphic rocks, Tunn. Undergr. Space Technol. 75 (2018) 90–99, https://doi.org/10.1016/j.tust.2018.02.009.
[8] Q.S. Liu, J.P. Liu, Y.C. Pan, X.P. Zhang, X.X. Peng, Q.M. Gong, L.J. Du, A wear rule and cutter life prediction model of a 20-in. TBM cutter for granite: a case study of a water conveyance tunnel in China, Rock Mech. Rock. Eng. 50 (2017) 1303–1320, https://doi.org/10.1007/s00603-017-1176-4.
[9] A. Glowacz, Acoustic fault analysis of three commutator motors, Mech. Syst. Signal Process. 133 (2019) 106226, https://doi.org/10.1016/j.ymssp.2019.07.007.
[10] X.X. Liu, S.L. Shen, Y.S. Xu, Z.Y. Yin, Analytical approach for time-dependent groundwater inflow into shield tunnel face in confined aquifer, Int. J. Numer. Anal. Method Geomech. 42 (2018) 655–673, https://doi.org/10.1002/nag.2760.
[11] A. Mahmoodzadeh, M. Mohammadi, A. Daraei, T.A. Rashid, A.F.H. Sherwani, R.H. Faraj, A.M. Darwesh, Updating ground conditions and time-cost scatter-gram in tunnels during excavation, Autom. Constr. 105 (2019) 102822, https://doi.org/10.1016/j.autcon.2019.04.017.
[12] K. Elbaz, S.L. Shen, A.N. Zhou, D.J. Yuan, Y.S. Xu, Optimization of EPB shield performance with adaptive neuro-fuzzy inference system and genetic algorithm, Appl. Sci. 9 (2019) 780, https://doi.org/10.3390/app9040780.
[13] A. Mahmoodzadeh, M. Mohammadi, A. Daraei, R.H. Faraj, R.M.D. Omer, A.F.H. Sherwani, Decision-making in tunneling using artificial intelligence tools, Tunn. Undergr. Space Technol. 103 (2020) 103514, https://doi.org/10.1016/j.tust.2020.103514.
[14] A. Mahmoodzadeh, M. Mohammadi, A. Daraei, H.F. Hama-Ali, A.I. Abdullah, N.K. Al-Salihi, Forecasting tunnel geology, construction time and costs using machine learning methods, Neural Comput. & Applic. 33 (2021) 321–348, https://doi.org/10.1007/s00521-020-05006-2.
[15] A. Glowacz, Fault diagnosis of electric impact drills using thermal imaging, Measurement 171 (2021) 108815, https://doi.org/10.1016/j.measurement.2020.108815.
[16] A. Mahmoodzadeh, M. Mohammadi, A. Daraei, H.F. Hama-Ali, N.K. Al-Salihi, R.M.D. Omer, Forecasting maximum surface settlement caused by urban tunneling, Autom. Constr. 120 (2020) 103375, https://doi.org/10.1016/j.autcon.2020.103375.
[17] W. Zhang, A.T.C. Goh, Multivariate adaptive regression splines and neural network models for prediction of pile drivability, Geosci. Front. 7 (2016) 45–52, https://doi.org/10.1016/j.gsf.2014.10.003.
[18] A.T.C. Goh, W. Zhang, Y. Zhang, X. Yang, Y. Xiang, Determination of earth pressure balance tunnel-related maximum surface settlement: a multivariate adaptive regression splines approach, Bull. Eng. Geol. Environ. 77 (2018) 489–500, https://doi.org/10.1007/s10064-016-0937-8.
[19] J. Dalong, S. Zhichao, Y. Dajun, Effect of spatial variability on disc cutters failure during TBM tunneling in hard rock, Rock Mech. Rock. Eng. 53 (2020) 4609–4621, https://doi.org/10.1007/s00603-020-02192-2.
[20] A. Mahmoodzadeh, M. Mohammadi, S.N. Abdulhamid, H.H. Ibrahim, H.F. Hama Ali, S.G. Salim, Dynamic reduction of time and cost uncertainties in tunneling projects, Tunn. Undergr. Space Technol. 109 (2021) 103774, https://doi.org/10.1016/j.tust.2020.103774.
[21] A. Mahmoodzadeh, M. Mohammadi, H.H. Ibrahim, S.N. Abdulhamid, S.G. Salim, H.F. Hama Ali, M.K. Majeed, Artificial intelligence forecasting models of uniaxial compressive strength, Transport. Geotech. 27 (2021) 100499, https://doi.org/10.1016/j.trgeo.2020.100499.
[22] A. Mahmoodzadeh, M. Mohammadi, H.F. Hama Ali, S.N. Abdulhamid, H.H. Ibrahim, K.M.G. Noori, Dynamic prediction models of rock quality designation in tunneling projects, Transport. Geotech. 27 (2021) 100497, https://doi.org/10.1016/j.trgeo.2020.100497.
[23] A. Mahmoodzadeh, M. Mohammadi, H.H. Ibrahim, K.M.G. Noori, S.N. Abdulhamid, H.F. Hama Ali, Forecasting sidewall displacement of underground caverns using machine learning techniques, Autom. Constr. 123 (2021) 103530, https://doi.org/10.1016/j.autcon.2020.103530.
[24] H.Q. Yang, Z. Li, T.Q. Jie, Z.Q. Zhang, Effects of joints on the cutting behavior of disc cutter running on the jointed rock mass, Tunn. Undergr. Space Technol. 81 (2018) 112–120, https://doi.org/10.1016/j.tust.2018.07.023.
[25] A. Mahmoodzadeh, M. Mohammadi, A.H.M. Aldalwie, H.H. Ibrahim, T.A. Rashid, H.F. Hama Ali, Tunnel geomechanical parameters prediction using Gaussian process regression, Mach. Learn. Appl. 3 (2021) 100020, https://doi.org/10.1016/j.mlwa.2021.100020.
[26] Y. Li, W. Zhang, Investigation on passive pile responses subject to adjacent tunnelling in anisotropic clay, Comput. Geotech. 127 (2020) 103782, https://doi.org/10.1016/j.compgeo.2020.103782.
[27] F. Chen, L. Wang, W. Zhang, Reliability assessment on stability of tunnelling perpendicularly beneath an existing tunnel considering spatial variabilities of rock mass properties, Tunn. Undergr. Space Technol. 88 (2019) 276–289, https://doi.org/10.1016/j.tust.2019.03.013.
[28] K. Elbaz, S.L. Shen, A. Zhou, Z.Y. Yin, H.M. Lyu, Prediction of disc cutter life during shield tunneling with AI via the incorporation of a genetic algorithm into a GMDH-type neural network, Engineering 7 (2020) 238–251, https://doi.org/10.1016/j.eng.2020.02.016.
[29] S.R. Torabi, H. Shirazi, H. Hajali, M. Monjezi, Study of the influence of geotechnical parameters on the TBM performance in Tehran–Shomal highway project using ANN and SPSS, Arab. J. Geosci. 6 (2013) 1215–1227, https://doi.org/10.1007/s12517-011-0415-3.
[30] Y. Yang, K. Hong, Z. Sun, K. Chen, F. Li, J. Zhou, B. Zhang, The derivation and validation of TBM disc cutter wear prediction model, Geotech. Geol. Eng. 36 (2018) 3391–3398, https://doi.org/10.1007/s10706-018-0540-9.
[31] Y. Xia, K. Zhang, J. Liu, Design optimization of TBM disc cutters for different geological conditions, World J. Eng. Technol. 3 (2015) 218–231, https://doi.org/10.4236/wjet.2015.34023.
[32] J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization, J. Mach. Learn. Res. 13 (2012) 281–305, https://jmlr.org/papers/v13/bergstra12a.html.
[33] J. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyper-parameter optimization, in: Proceedings of the 24th International Conference on Neural Information Processing Systems, December, 2011, pp. 2546–2554, https://papers.nips.cc/paper/2011/hash/86e8f7ab32cfd12577bc2619bc635690-Abstract.html.
[34] A. Zheng, M. Bilenko, Lazy paired hyper-parameter tuning, in: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, 2013, pp. 1924–1931, https://doi.org/10.5555/2540128.2540404.
[35] D. Maclaurin, D. Duvenaud, R. Adams, Gradient-based hyperparameter optimization through reversible learning, in: Proceedings of the 32nd International Conference on Machine Learning, PMLR 37, 2015, pp. 2113–2122, http://proceedings.mlr.press/v37/maclaurin15.html.
[36] P. Hall, B.U. Park, R.J. Samwort, Choice of neighbor order in nearest-neighbor classification, Ann. Stat. 36 (2018) 2135–2152, https://doi.org/10.1214/07-AOS537.
[37] A.B. Hassanat, M.A. Abbadi, G.A. Altarawneh, A.A. Alhasanat, Solving the problem of the K parameter in the KNN classifier using an ensemble learning approach, Int. J. Comput. Sci. Inform. Secur. 12 (2014) 33–39, https://arxiv.org/abs/1409.0919.
[38] A. Celisse, T. Mary Huard, Theoretical analysis of cross-validation for estimating the risk of the k-nearest neighbor classifier, J. Mach. Learn. Res. 19 (2018) 1–54, https://jmlr.csail.mit.edu/papers/v19/15-498.html.
