Advanced Machine Learning Techniques For Predicting
1 School of Architecture, New Jersey Institute of Technology, Newark, NJ 07102, USA; [email protected]
2 School of Applied Engineering and Technology, New Jersey Institute of Technology, Newark, NJ 07102, USA;
[email protected]
* Correspondence: [email protected]
input features, which makes them suitable for predicting the properties of various concrete types, including those modified with supplementary materials such as fly ash, nano-silica, recycled aggregates, and other industrial by-products.
Several studies have applied ML models to predict concrete compressive strength with
notable success. Alghrairi et al. [4] developed nine ML models to estimate the compressive
strength of lightweight concrete modified with nanomaterials. Among these, the gradient-
boosted trees (GBT) model outperformed others by achieving a coefficient of determination
(R2 ) of 0.90 and a root mean square error (RMSE) of 5.286 MPa. The study highlighted that
water content was the most influential factor affecting compressive strength predictions and
emphasized the critical role of the water-to-cement ratio in concrete mix design. Similarly,
Ding et al. [5] investigated ML models to predict the compressive strength of alkali-activated
cementitious materials using solid waste components. They employed six ML algorithms,
including support vector machine (SVM), random forest (RF), radial basis function neural
network (RBF), and long short-term memory network (LSTM). The SVM model achieved the
highest performance with an R2 of 0.9054 and a normalized root mean square error of 0.0997.
In addition to the evaluation of prediction accuracy, feature importance analysis using
SHapley Additive exPlanations (SHAP) revealed key influencing factors such as calcium
oxide content, water-to-binder ratio, silicon dioxide content, modulus of water glass, and
aluminum trioxide content. Ekanayake et al. [6] addressed the “black-box” nature of ML
models by employing SHAP to interpret predictions of concrete compressive strength.
Utilizing tree-based algorithms including XGBoost and light gradient boosting machine
(LGBM), they achieved high accuracy with an R-value of 0.98. The SHAP analysis provided
insights into feature importance and confirmed that age and cement content were the most
influential features. This approach demonstrated that ML models could capture complex
relationships among variables and lead to enhanced trust among domain experts.
Despite these advancements, a persistent limitation in the existing literature is the
inadequate exploration of feature interactions and their cumulative impact on model pre-
dictions. Most studies emphasize achieving high predictive accuracy without thoroughly
investigating how input variables interact within the models. For instance, Paudel et al. [7]
compared the performance of non-ensemble and ensemble ML models in predicting the
compressive strength of concrete containing fly ash. The study identified age, cement
content, and water content as the most influential features but lacked a comprehensive
analysis of feature interactions. Similarly, Song et al. [8] employed ML algorithms, includ-
ing gene expression programming (GEP), artificial neural network (ANN), decision tree
(DT), and bagging regressor, to predict the compressive strength of concrete with fly ash
admixture. While the study confirmed that the selection of input parameters and regressors
significantly affects the accuracy of predicted outcomes, it did not extensively explore
feature interactions. Tran et al. [9] evaluated the compressive strength of concrete made
with recycled concrete aggregates using six ML models. The GB_PSO model achieved the
highest prediction accuracy with an R2 of 0.9356. Feature importance analysis revealed that
cement content and water content were the most important factors affecting compressive
strength. However, the study primarily focused on individual feature importance rather
than the interactions between variables. Ahmad et al. [10] compared supervised ML al-
gorithms, including ANN, AdaBoost, and boosting, to predict the compressive strength
of geopolymer concrete containing high-calcium fly ash. This study demonstrated the
potential of ensemble methods in capturing complex patterns in data, which can lead to
more accurate predictions. Nevertheless, it did not explore the interactions among input
features. Anjum et al. [11] applied ensemble ML methods, including gradient boosting,
RF, bagging regressor, and AdaBoost regressor, to estimate the compressive strength of
fiber-reinforced nano-silica modified concrete. SHAP analysis revealed that the coarse
Infrastructures 2025, 10, 26 3 of 26
aggregate to fine aggregate ratio had a stronger negative correlation with compressive
strength, while specimen age positively affected it. The study highlighted the importance
of considering the interaction and effects of input parameters but did not provide a detailed
feature interaction analysis. Ullah et al. [12] predicted the compressive strength of sustain-
able foam concrete using individual and ensemble ML approaches, including SVM, RF,
bagging, boosting, and a modified ensemble learner. The study suggested that ensemble
learners significantly enhance the performance and robustness of ML models but did not
explore feature interactions in depth. Moreover, Kumar and Pratap [13] investigated the use
of ML models to predict the compressive strength of high-strength concrete and focused on
the influence of superplasticizer, sand, and water content. The study acknowledged the sig-
nificant influence of superplasticizer on compressive strength but lacked a comprehensive
analysis of feature interactions. Nguyen et al. [14] proposed a machine learning approach
using multivariate polynomial regression and automated feature engineering to predict
the compressive strength of ultra-high-performance concrete (UHPC). While this study
provided insights into feature interactions, it was specific to UHPC and did not address
broader concrete types.
These studies collectively demonstrate that while ML models can achieve high accuracy
in predicting concrete compressive strength, they often lack interpretability due to insufficient
analysis of feature interactions. Most focus on individual feature importance without exploring
how variables interact within the model to influence predictions. This limitation hinders the
practical application of ML models in concrete mix design optimization, as understanding the
synergistic effects among key variables is crucial. To address this gap, there is a pressing need
for research that not only leverages advanced ML models for predicting concrete properties
but also provides a thorough analysis of feature interactions and their collective impact on
model predictions. Such an approach would enhance the interpretability of the models, allow
for more informed decision-making in mix design optimization, and promote the development
of high-performance, durable, and sustainable concrete materials.
Recent research has also begun integrating advanced predictive modeling with sus-
tainability considerations. For example, Ref. [15] developed an ANN-based approach
for recycled aggregate concrete, offering high-accuracy compressive strength predictions
and practical closed-form solutions. In a related study, Ref. [16] examined ultra-high-
performance lightweight concrete incorporating rice husk ash, applying life cycle assess-
ment (LCA) to evaluate the environmental performance alongside compressive strength.
Similarly, Ref. [17] employed multiple AI and optimization techniques to investigate inter-
actions between fly ash content, mechanical properties, and environmental impact, thereby
informing multi-objective optimization of sustainable concrete mixes. These contributions
underscore a growing emphasis on not only predicting performance but also considering
environmental implications. Nevertheless, even with these advancements, a persistent gap
remains in the literature: the need for a more thorough exploration of feature interactions
and their collective influence on model predictions. Addressing this gap is crucial for both
interpretability and practical utility in concrete mix design.
Unlike prior work that predominantly focuses on predictive accuracy, our approach not
only aims to achieve high accuracy but also provides in-depth interpretability by examining
feature interactions using SHAP and partial dependence plots. This dual focus on accuracy
and interpretability represents a key advancement over current methodologies to enable
more informed decision-making in concrete mix design. This study aims to fulfill this need
by developing machine learning models capable of predicting the compressive strength of
various concrete types, including diverse input variables related to mix composition. By
employing advanced feature importance analysis methods such as SHAP and interaction
effects such as partial dependence plots, we investigate the interactions among these input
Figure 1. Framework for modeling analysis of concrete compressive strength.
standard in water usage for these concrete mixtures. Most samples contained low amounts of blast furnace slag, with a significant peak at 0 kg/m3, which highlights its optional use in the mixtures. The majority of the data points were clustered at low superplasticizer content, with a significant number of observations showing zero usage, emphasizing its selective application depending on specific mix requirements. There was a significant spike in age at 28 days, which is commonly recognized as a standard curing time for testing concrete strength [24], although other ages were also represented to a lesser extent. The strength of concrete showed a normal distribution with a mean of around 35 MPa, which illustrates the common range of strength encountered in typical concrete applications. This exploratory analysis provided a foundation for understanding the key characteristics of the dataset, which inform the subsequent predictive modeling efforts.

Outliers were identified and removed using the interquartile range (IQR) method [21]. The IQR was calculated as the difference between the 75th (Q3) and 25th (Q1) percentiles, and any data points lying below Q1 − 1.5 IQR or above Q3 + 1.5 IQR were considered outliers. Significant outliers were found in variables such as age, and these outliers were removed from the dataset to improve model accuracy and generalizability. After outlier removal, the final dataset consisted of 911 observations.
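The IQR rule described above can be sketched in pandas; the column names and toy values here are illustrative, not the study's exact schema:

```python
import pandas as pd

def remove_iqr_outliers(df: pd.DataFrame, columns) -> pd.DataFrame:
    """Drop rows where any listed column falls outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile(0.25), df[col].quantile(0.75)
        iqr = q3 - q1
        mask &= df[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[mask]

# Toy data: the 365-day specimen is an extreme 'age' value and is removed
data = pd.DataFrame({"age": [7, 14, 28, 28, 28, 56, 90, 365],
                     "strength": [20, 25, 35, 36, 34, 40, 45, 50]})
cleaned = remove_iqr_outliers(data, ["age"])
```

Applying the same filter to every numeric column, as in the study, only requires passing the full column list.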
Figure 4. Correlation matrix between the input features and the target variable.
2.2.4. Feature Engineering and Multicollinearity Analysis
Multicollinearity among predictor variables can negatively impact the stability and interpretability of regression models by inflating the variance of coefficient estimates [27]. To quantify the degree of multicollinearity among the predictor variables, the variance inflation factor (VIF) was calculated using the variance_inflation_factor() function from statsmodels.stats.outliers_influence in Python. The VIF for each feature is computed as
VIF = 1/(1 − R2 ), where R2 is obtained by regressing that feature against all other features.
The initial VIF analysis, presented in Figure 5a, revealed significant multicollinearity issues.
Notably, the VIF values for water, coarse aggregate, fine aggregate, and cement were
exceptionally high, with water exhibiting a VIF of 95.27, coarse aggregate at 84.71, fine
aggregate at 76.82, and cement at 14.15. Such high VIF scores indicate that these variables
are highly correlated with other predictors, which can destabilize regression models and
obscure the true relationships between variables and the target outcome.
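The definition VIF = 1/(1 − R^2) can be computed directly; the sketch below is a NumPy equivalent of the per-feature regression that statsmodels' variance_inflation_factor() performs, with illustrative synthetic mix-design columns:

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    (with an intercept) on all remaining columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r2 = 1.0 - ((y - A @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
water = rng.normal(180, 10, size=200)
cement = rng.normal(350, 40, size=200)
total = water + cement + rng.normal(0, 1, size=200)   # nearly collinear column
vifs = vif(np.column_stack([water, cement, total]))
```

The near-collinear third column produces a very large VIF, mirroring the inflated scores reported for water, aggregates, and cement.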
To mitigate multicollinearity and enhance the predictive power of the models, feature
engineering was employed based on domain knowledge in concrete technology [28,29].
Two new features were created: the water–cement ratio (W/C ratio) and the coarse
aggregate–fine aggregate ratio (C/F ratio). The W/C ratio was calculated by dividing
the water content by the cement content. This ratio is a critical factor influencing concrete
strength, as it affects the hydration process and the microstructure of the hardened concrete.
A lower W/C ratio generally leads to higher strength and durability. The C/F ratio was
determined by dividing the coarse aggregate content by the fine aggregate content. This
ratio impacts the workability, compaction, and overall strength of concrete by influencing
the particle packing and void content within the mix [30].
W/C Ratio = Water / Cement (1)

C/F Ratio = Coarse Aggregate / Fine Aggregate (2)
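The transformation of Equations (1) and (2) amounts to a small feature-engineering step; in pandas it could look like the following, where the column names are illustrative:

```python
import pandas as pd

def add_ratio_features(df: pd.DataFrame) -> pd.DataFrame:
    """Replace the four absolute quantities with the two ratios of
    Equations (1) and (2)."""
    out = df.copy()
    out["wc_ratio"] = out["water"] / out["cement"]          # Equation (1)
    out["cf_ratio"] = out["coarse_agg"] / out["fine_agg"]   # Equation (2)
    return out.drop(columns=["water", "cement", "coarse_agg", "fine_agg"])

mix = pd.DataFrame({"water": [180.0], "cement": [360.0],
                    "coarse_agg": [1000.0], "fine_agg": [800.0]})
features = add_ratio_features(mix)
```

Dropping the original columns after the transformation is what removes the redundant absolute quantities from the predictor set.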
By transforming the original highly correlated variables into ratios, the absolute quantities, previously exhibiting high multicollinearity, were converted into relative measures that capture the essential proportional relationships in the concrete mix. This approach reduced redundancy among predictors while retaining the critical information necessary for accurate strength prediction. After feature engineering, the VIF was recalculated for the updated set of features. The results, shown in Figure 5b, indicated a substantial reduction in multicollinearity across the dataset. The VIF values for the newly engineered features were significantly lower, with the water–cement ratio at 10.24 and the coarse aggregate–fine aggregate ratio at 7.98. While these values are still above the commonly accepted threshold of 5, they represent a marked improvement from the initial VIF scores. These features were retained due to their significant practical importance and contribution to the predictive capability of the models. Other features also exhibited acceptable VIF values, all below the threshold of 5.
Figure 5. VIF results for input feature selection: (a) all initial features, (b) revised feature set.
An 80–20% training–testing split was selected to align with common machine learning
practices for robust evaluation [33]. To ensure that the training and testing subsets share
similar statistical characteristics, we first divided the target variable in the dataset into
ten quantile-based bins (num_bins = 10) and then performed a stratified split. After this
procedure, we computed descriptive statistics—record count, minimum, maximum, range,
mean, variance, and standard deviation—for each numeric feature. As presented in Table 4,
the training and testing sets exhibited very similar statistics. Additionally, Kolmogorov–
Smirnov tests [34] for each feature yielded high p-values (all > 0.05), which indicated no
statistically significant differences between the distributions of the two subsets. These
results confirmed that the testing set is representative of the training set and ensured that
the performance metrics derived from the test set are both reliable and unbiased. The
models were then trained on the training set and evaluated on the testing set.
Table 5. Hyperparameters considered for regression and classification models in this study.

Regression models:
- Linear Regression: none (used ordinary least squares)
- K-Nearest Neighbors: n_neighbors, metric, weights
- Decision Tree Regressor: max_depth, min_samples_split, min_samples_leaf
- Random Forest Regressor: n_estimators, max_depth, min_samples_split, min_samples_leaf, max_features
- Gradient Boosting Regressor: n_estimators, learning_rate, max_depth, subsample, min_samples_split
- AdaBoost Regressor (with DT): n_estimators, learning_rate, base_estimator (DT max_depth)
- Neural Network (MLP): number of layers, units per layer, activation, dropout rate, batch size, epochs, optimizer, learning_rate, L2 regularization

Classification models:
- Logistic Regression: penalty (l1, l2), C (regularization strength), solver (saga)
- Support Vector Machine: C (regularization), gamma (kernel coefficient), kernel (linear, rbf, poly, sigmoid), degree (if kernel = poly)
- k-Nearest Neighbors: n_neighbors, weights (uniform, distance), p (distance metric: 1 = Manhattan, 2 = Euclidean)
- Random Forest Classifier: n_estimators, max_depth, min_samples_split, max_features
- Bagging Classifier (with DT): n_estimators, max_samples, max_features, bootstrap, bootstrap_features, estimator__max_depth, estimator__criterion (for DecisionTreeClassifier)
MSE = (1/n) ∑_{i=1}^{n} (y_i − ŷ_i)^2 (3)
where n is the number of observations. A lower MSE indicates that the model’s predictions
are closer to the actual values, which signifies better predictive accuracy.
R^2 = 1 − [∑_{i=1}^{n} (y_i − ŷ_i)^2] / [∑_{i=1}^{n} (y_i − ȳ)^2] (4)
where ȳ is the mean of the observed data. An R2 value closer to 1 indicates that a higher proportion of variance is explained by the model, and it reflects a better fit.
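Equations (3) and (4) translate directly into NumPy; the toy observation and prediction vectors below are illustrative:

```python
import numpy as np

def mse(y_true, y_pred):
    """Equation (3): mean squared error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

def r2(y_true, y_pred):
    """Equation (4): one minus residual sum of squares over total sum of squares."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_obs = [30.0, 40.0, 50.0]   # observed strengths (MPa), illustrative
y_hat = [32.0, 38.0, 51.0]   # model predictions, illustrative
```

These match scikit-learn's mean_squared_error and r2_score, which were the likely utilities in practice.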
Additionally, to provide a more comprehensive and intuitive visual comparison of
the regression models’ performance, a Taylor diagram was employed. The Taylor diagram
plots correlation (with the observed values), the ratio of the standard deviation of the
model predictions to that of the observations, and the centered RMS error, all on a single
polar coordinate plot [39]. This approach allows simultaneous evaluation of how well each
model’s variability and pattern of predictions match the observed data.
For classification models, accuracy was calculated to determine the overall effectiveness of the model in correctly predicting the class labels. It is given by Equation (5).

Accuracy = (Number of correct predictions) / (Total number of predictions) (5)

However, in datasets with class imbalances, accuracy can be misleading because it may be biased towards the majority class. To address this, balanced accuracy was used, which adjusts for imbalanced classes by averaging the recall (sensitivity) obtained for each class. It is defined by Equation (6).

Balanced Accuracy = (1/K) ∑_{k=1}^{K} TP_k / (TP_k + FN_k) (6)

where K is the number of classes, TP_k is the number of true positives for class k, and FN_k is the number of false negatives for class k.
To gain deeper insights into the model’s performance on individual classes, precision,
recall, and F1-score [40] were calculated for each class. Precision measures the proportion of
correct positive predictions among all positive predictions, defined in Equation (7). Recall,
also known as sensitivity, assesses the model’s ability to correctly identify all positive
instances (see Equation (8)). The F1-score, as defined in Equation (9), is the harmonic mean
of precision and recall, which provides a single metric that balances both concerns.
Precision = TP / (TP + FP) (7)

where TP is the number of true positives, and FP is the number of false positives.

Recall = TP / (TP + FN) (8)

F1-score = 2 × (Precision × Recall) / (Precision + Recall) (9)
In multiclass classification settings with imbalanced classes, evaluating overall model
performance requires aggregating these per-class metrics. To account for the varying
number of instances in each class, weighted average precision, weighted average recall,
and weighted average F1-score were calculated. These metrics are computed by weighting
the per-class metrics by the number of true instances in each class to ensure classes with
more samples have a proportionally greater impact on the overall score.
The weighted average precision is calculated as follows:
Weighted Precision = [∑_{k=1}^{K} n_k × Precision_k] / [∑_{k=1}^{K} n_k] (10)
where n_k is the number of true instances in class k. The weighted average recall and weighted average F1-score were calculated analogously.
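All of the aggregate metrics above are available in scikit-learn; a minimal sketch with illustrative strength-class labels:

```python
from sklearn.metrics import (balanced_accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Toy labels for three strength classes (not the study's data)
y_true = ["low", "low", "low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "low", "low", "medium", "medium", "medium", "high", "low"]

# Equation (6): mean of per-class recalls
bal_acc = balanced_accuracy_score(y_true, y_pred)

# Equations (7)-(10): per-class metrics aggregated with support weights
prec_w, rec_w, f1_w, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)

# Confusion matrix with a fixed class order for readability
cm = confusion_matrix(y_true, y_pred, labels=["low", "medium", "high"])
```

Note that the weighted average recall equals plain accuracy by construction, which is why balanced accuracy is the more informative headline metric under imbalance.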
The use of balanced accuracy and weighted metrics is particularly important in the
presence of class imbalance, which was evident in our dataset (see Table 3). Certain
strength categories had significantly more samples than others, which could bias the
model’s performance towards those classes. The confusion matrix was also utilized to
visualize the performance of the classification models by displaying the counts of true
positive, true negative, false positive, and false negative predictions for each class. This
matrix allowed for a detailed error analysis by highlighting specific areas where the model
was misclassifying observations.
To optimize model performance and ensure robust hyperparameter selection, Bayesian optimization was conducted using 5-fold cross-validation. This involved partitioning the
training dataset into five equal subsets, training the model on four subsets, and evaluating
its performance on the remaining subset. By averaging the performance across folds, this
approach provides a more reliable estimate of the model’s generalization ability and helps
mitigate the risk of overfitting during hyperparameter tuning.
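The study's tuning used Bayesian optimization via BayesSearchCV (Table 8); since scikit-optimize may not be installed everywhere, the 5-fold cross-validated search mechanics are sketched here with scikit-learn's RandomizedSearchCV as a stand-in on synthetic data, where only the search class would change:

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data standing in for the concrete dataset
X, y = make_regression(n_samples=300, n_features=8, n_informative=8,
                       noise=10.0, random_state=0)

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions={
        "n_estimators": randint(50, 300),
        "learning_rate": uniform(0.01, 0.3),
        "max_depth": randint(2, 8),
        "subsample": uniform(0.5, 0.5),   # draws from [0.5, 1.0]
    },
    n_iter=5,            # small budget for illustration
    cv=5,                # 5-fold cross-validation, as in the study
    scoring="r2",
    random_state=0,
)
search.fit(X, y)
best = search.best_params_
```

Each candidate is scored as the mean R2 over the five folds, which is the generalization estimate the text describes.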
I(m) = (1/N_m) ∑_{i∈m} (y_i − ȳ_{N_m})^2 (11)

where N_m is the number of samples at node m; y_i is the target value of sample i; and ȳ_{N_m} is the mean target value at node m. When a node m is split on feature j, the decrease in impurity ∆I(j, m) due to that feature is calculated as follows:

∆I(j, m) = I(m) − [(N_left/N_m) I(left) + (N_right/N_m) I(right)] (12)

where N_left and N_right are the numbers of samples in the left and right child nodes, and I(left) and I(right) are the impurities of the left and right child nodes. The mean decrease in impurity for feature j across all trees T in the ensemble is then as follows:
MDI_j = (1/|T|) ∑_{t∈T} ∑_{m∈M_t} ∆I_t(j, m) (13)
where M_t is the set of all nodes where feature j is used to split in tree t, and ∆I_t(j, m) is the decrease in impurity for feature j at node m in tree t. A higher MDI value indicates greater importance of the feature in reducing the overall impurity of the model.
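For scikit-learn tree ensembles, the normalized MDI of Equation (13) is exposed as the feature_importances_ attribute; a sketch on synthetic data where one feature is informative and one is noise:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Feature 0 drives the target; feature 1 is pure noise, so its MDI should be lower
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=400)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
mdi = forest.feature_importances_   # normalized mean decrease in impurity
```

The importances are normalized to sum to 1, so they rank features rather than quantify absolute impurity reduction.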
φ_j = ∑_{S ⊆ F\{j}} [|S|! (|F| − |S| − 1)! / |F|!] × [f_{S∪{j}}(x_{S∪{j}}) − f_S(x_S)] (14)

where F is the set of all features, {j} denotes the set containing only feature j, S is a subset of features not containing feature j, |S| is the number of features in subset S, f_S(x_S) is the model trained with features in subset S evaluated at x_S, and f_{S∪{j}}(x_{S∪{j}}) is the model trained with features in subset S ∪ {j} evaluated at x_{S∪{j}}.
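Equation (14) formally requires a model retrained on every feature subset; practical SHAP implementations (such as the shap package's TreeExplainer for tree ensembles) instead approximate withheld features with background values. A brute-force sketch under that baseline-substitution assumption, using a toy additive model:

```python
from itertools import combinations
from math import factorial

def shapley_value(f, x, baseline, j, n_features):
    """Exact Shapley value of feature j per Equation (14); features outside
    subset S are fixed to `baseline` instead of retraining the model."""
    others = [k for k in range(n_features) if k != j]
    phi = 0.0
    for size in range(len(others) + 1):
        for S in combinations(others, size):
            # Combinatorial weight |S|! (|F| - |S| - 1)! / |F|!
            weight = (factorial(len(S)) * factorial(n_features - len(S) - 1)
                      / factorial(n_features))
            def eval_with(subset):
                z = list(baseline)
                for k in subset:
                    z[k] = x[k]
                return f(z)
            phi += weight * (eval_with(S + (j,)) - eval_with(S))
    return phi

# Additive toy model: each feature's contribution is recovered exactly
f = lambda z: 2.0 * z[0] + 5.0 * z[1]
x, base = [1.0, 1.0], [0.0, 0.0]
```

For an additive model the Shapley attributions reduce to the individual terms, which makes the sketch easy to verify by hand.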
f̂_PD(x_s) = (1/n) ∑_{i=1}^{n} f̂(x_s, x_C^{(i)}) (15)
where f̂ is the trained predictive function (the best-performing regressor model), x_s is the feature (or set of features) for which the partial dependence is computed, x_C^{(i)} represents the values of all other features C (the complement of s) for instance i in the dataset, and n is the number of instances in the dataset.
f̂_PD(x_{s1}, x_{s2}) = (1/n) ∑_{i=1}^{n} f̂(x_{s1}, x_{s2}, x_C^{(i)}) (16)
In this study, PDPs were generated for the top two most influential features iden-
tified in the feature importance analysis. Additionally, a two-way PDP was created to
examine the interaction effect between these two features on the predicted compressive
strength. The partial dependence functions fˆPD ( xs ) and fˆPD ( xs1 , xs2 ) were calculated
using the PartialDependenceDisplay.from_estimator method from the scikit-learn library.
The method systematically varies the feature(s) of interest while averaging out the effects
of all other features.
3. Results
3.1. Regression Analysis
The regression models were evaluated based on their MSE and R2 values, as summa-
rized in Table 6. This table provides a clear comparison of their effectiveness in predicting
concrete compressive strength. The GBR emerged as the top performer with an MSE of
15.79 and an R2 value of 0.94, which indicates its ability to explain 94% of the variance in
compressive strength. Following closely, the RF regressor captured a significant portion
of the target variable’s variance with an R2 value of 0.91 and an MSE of 21.61. Both the
neural network model and AdaBoost also showed strong results, each with R2 values of
0.90. The KNN model demonstrated a moderate fit with an R2 of 0.84 and an MSE of 39.88,
while the decision tree regressor posted an MSE of 42.67 and an R2 of 0.83. The linear
regression model, simpler and less robust, managed an R2 of 0.69 and an MSE of 71.25,
which highlights its limited capacity to capture complex patterns in the data.
Model | MSE | R2
gradient boosting regressor | 15.79 | 0.94
RF regressor | 21.61 | 0.91
neural network model | 24.20 | 0.90
AdaBoost | 24.27 | 0.90
k-nearest neighbors | 39.88 | 0.84
decision tree regressor | 42.67 | 0.83
linear regression | 71.25 | 0.69
Our R2 of 0.94 improves on the R2 of 0.90 reported by Alghrairi et al. [4] for a gradient-boosted trees model applied to nanomaterial-modified lightweight concrete. This improvement is possibly due to our ratio-based features (W/C and C/F) and enhanced hyperparameter tuning. Similarly,
Ding et al. [5] found that ensemble methods like RF and SVM outperformed single models
in predicting the compressive strength of alkali-activated materials.
To complement the statistical summary in Table 6, Figure 6 presents a Taylor diagram
that visually compares the predictions of each model to the observed compressive strengths.
In this diagram, the distance from the origin corresponds to the models’ standard deviations,
and their angular position represents the correlation with the observed data. Additionally,
the annotations near each model’s marker show the centered RMS (CRMS) error, which
provides a measure of how closely the model predictions match the observed values after
removing any bias. From Figure 6, we see that the GBR and RF models not only rank
highly in terms of MSE and R2 but also cluster closer to the observed standard deviation
reference point, exhibit higher correlations, and have lower CRMS errors. These visual
insights confirm and reinforce the numerical findings presented in Table 6. Meanwhile, the neural network and AdaBoost models maintain strong correlations and relatively low CRMS errors, which align well with their high R2 values. In contrast, the KNN and decision tree models, while moderately correlated, display larger CRMS errors, consistent with their higher MSE values. The linear regression model stands out as having the weakest correlation and the highest CRMS error, mirroring its poor performance in terms of MSE and R2.
The robustness of the GBR is further supported by Figure 7a,b. In Figure 7a, the residual
plot demonstrates that the residuals are randomly scattered around zero, which indicates
the absence of systematic patterns or biases. The residual variance is consistent across the
predicted values and suggests that the model performs reliably across the range of compressive
strengths. This uniformity reinforces the model’s superior fit. In Figure 7b, the “actual vs.
predicted values” plot shows points closely aligned with the ideal red dashed line, which
highlights the model’s accuracy in predicting the actual values. The tight clustering around
this line supports the model’s ability to make precise predictions.
In addition to evaluating model performance on the full dataset, we investigated how
model accuracy changes with different training set sizes. Figure 8 illustrates the relationship
between subset size and gradient-boosting regressor performance. Initially, as the subset
size increases from 30 samples upward, the R2 score improves dramatically, while the
MSE decreases significantly. Beyond approximately 400 samples, the improvement in
R2 and reduction in MSE become marginal, suggesting that the model has captured the
underlying data patterns sufficiently well. Hence, while larger datasets can still provide
benefits, a dataset size of around 400 observations appears to be a practical lower bound
for achieving near-optimal performance in this particular problem. This analysis suggests
that the current cleaned dataset size of 911 observations is more than sufficient for stable
and high-quality predictions, and smaller datasets (on the order of a few hundred samples) could still achieve near-optimal results, given a similar data distribution and complexity.
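The dataset-size study behind Figure 8 can be sketched as a simple loop over growing subsets; the synthetic data below stands in for the 911-observation dataset:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in with a size comparable to the cleaned dataset
X, y = make_regression(n_samples=900, n_features=8, n_informative=8,
                       noise=15.0, random_state=0)

scores = {}
for size in (30, 100, 400, 900):
    Xs, ys = X[:size], y[:size]
    # 80/20 split at each subset size, as in Figure 8
    X_tr, X_te, y_tr, y_te = train_test_split(Xs, ys, test_size=0.2,
                                              random_state=0)
    model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
    scores[size] = r2_score(y_te, model.predict(X_te))
```

Plotting scores against subset size reproduces the qualitative learning-curve shape described above: rapid gains at small sizes that flatten once the model has seen a few hundred observations.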
Figure 8. R2 score and MSE vs. dataset size (80/20 split) for the GBR model.
Figure 9. Confusion matrix for SVM (heatmap colors darken as counts increase) and classification matrix by class.
To provide further clarity on model reproducibility, Table 8 presents the final hyperparameter configurations obtained through Bayesian optimization for the top-performing regression model (GBR) and the top-performing classification model (SVM). Detailed hyperparameters and tuning procedures for all other models are available in the Supplementary Materials.
Table 8. Final hyperparameter configurations for the top-performing models.

Model: GBR
  Hyperparameters considered: n_estimators, learning_rate, max_depth, subsample, min_samples_split
  Initial/default values: n_estimators = 100, learning_rate = 0.1, max_depth = 3, subsample = 1.0, min_samples_split = 2
  Best/tuned values: n_estimators = 500, learning_rate = 0.2057, max_depth = 10, subsample = 0.5, min_samples_split = 0.242
  Tuning method: Bayesian optimization

Model: SVM
  Hyperparameters considered: C (regularization), gamma (kernel coefficient), kernel (linear, rbf, poly, sigmoid), degree (if kernel = poly)
  Initial/default values: C = 1.0, kernel = 'rbf', gamma = 'scale', degree = 3
  Best/tuned values: C ≈ 5.68 × 10^5, gamma ≈ 0.1434, kernel = 'rbf', degree = 5
  Tuning method: Bayesian optimization (BayesSearchCV)
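The Table 8 tuning setup can be sketched as follows. The study used `BayesSearchCV` from scikit-optimize; to keep this sketch scikit-learn-only, `RandomizedSearchCV` samples an analogous GBR search space here. The ranges and the synthetic data are illustrative assumptions, not the study's exact configuration.

```python
# Sketch of a hyperparameter search over the Table 8 GBR space. The paper used
# scikit-optimize's BayesSearchCV; RandomizedSearchCV below is a stand-in that
# samples a comparable space without the extra dependency.
from scipy.stats import loguniform, randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=400, n_features=8, n_informative=8,
                       noise=10.0, random_state=0)

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions={
        "n_estimators": randint(100, 501),
        "learning_rate": loguniform(1e-2, 0.3),
        "max_depth": randint(2, 11),
        "subsample": uniform(0.5, 0.5),          # samples 0.5-1.0
        "min_samples_split": uniform(0.1, 0.4),  # as a fraction of samples
    },
    n_iter=10, cv=3, scoring="r2", random_state=0,
)
search.fit(X, y)
print(search.best_params_)
print(f"best CV R2 = {search.best_score_:.3f}")
```

With scikit-optimize installed, `BayesSearchCV` is a near drop-in replacement: it takes the same estimator plus a dictionary of `Real`/`Integer` dimensions and exposes the same `best_params_` and `best_score_` attributes, while choosing each new candidate from a surrogate model of the score surface rather than at random.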
To further assess the impact of each feature on the model's performance, an ablation study was conducted. In this study, features were progressively removed from the model in order of increasing importance (starting with the least important feature), and the model was retrained each time. The MSE and R2 values were recorded at each step to evaluate how the removal of features affected the model's predictive accuracy.

The ablation study (Figure 10b) shows the impact of incrementally removing features on both R2 and MSE and clarifies each feature's individual contribution to the model's performance. Starting with all six variables (Water_Cement_Ratio, Age, Blast Furnace Slag, Superplasticizer, Coarse_Fine_Ratio, and Fly Ash), we obtained an R2 of 0.9394 and an MSE of 15.7961. Removing Fly Ash had a minimal effect on accuracy (R2 = 0.9366, MSE = 16.5504), which indicates that although it adds some predictive value, its contribution is relatively modest compared to the top-ranked features. Further reducing the feature set led to more substantial declines: retaining only the top three predictors (Water_Cement_Ratio, Age, and Blast Furnace Slag) still achieved a commendable R2 of 0.9027, although the MSE increased to 25.3888. Narrowing down to just two features (Water_Cement_Ratio and Age) caused R2 to drop to 0.7752 and MSE to rise to 58.6519, and relying solely on Water_Cement_Ratio produced a drastic decline (R2 = 0.1501, MSE = 221.6960). These results emphasize the importance of multiple synergistic features in achieving both high R2 and low MSE, with Water_Cement_Ratio, Age, and Blast Furnace Slag being particularly influential. Conversely, features like Fly Ash and Coarse_Fine_Ratio demonstrate lower predictive accuracy due to weaker direct correlations with compressive strength or their effects being overshadowed by more dominant parameters. Fly Ash, for instance, may improve strength and durability under certain conditions but exerts a more subtle or context-dependent influence on early-age compressive strength, which makes its overall contribution less pronounced.
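The ablation loop can be sketched as below on synthetic stand-in data. Impurity-based `feature_importances_` are used here to rank features for removal; the ranking criterion the study actually used is an assumption on our part.

```python
# Sketch of the ablation study: drop the least important remaining feature
# (ranked by the GBR's impurity-based importances), retrain, and record R^2
# and MSE at each step. Six synthetic features stand in for the mix variables.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=911, n_features=6, n_informative=6,
                       noise=10.0, random_state=0)
features = list(range(X.shape[1]))  # stand-ins for the named mix variables

history = []
while features:
    X_tr, X_te, y_tr, y_te = train_test_split(X[:, features], y,
                                              test_size=0.2, random_state=0)
    gbr = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
    pred = gbr.predict(X_te)
    history.append((len(features), r2_score(y_te, pred),
                    mean_squared_error(y_te, pred)))
    # Drop the currently least important remaining feature and retrain.
    features.pop(int(np.argmin(gbr.feature_importances_)))

for k, r2, mse in history:
    print(f"{k} features: R2={r2:.3f}, MSE={mse:.1f}")
```

Plotting `history` gives the Figure 10b-style curves: R2 degrades slowly while low-importance features are removed and collapses once the dominant predictors go.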
3.4. Understanding Feature Contributions with SHAP Analysis

To gain a deeper understanding of the GBR's predictive behavior and interpret its predictions, we employed SHAP analysis. This method allows for both global and local interpretability and reveals the contribution of each feature to the model's output across the entire dataset and for individual predictions. Figure 11a presents the SHAP summary plot, which displays the global feature importance. Each point on the plot represents a SHAP value for a feature and an instance. The features are ordered by their overall importance.

Figure 11. (a) Feature importance analysis: SHAP summary plot; (b) contribution analysis: SHAP waterfall plot showing feature contributions for an actual concrete strength of 61.89 MPa.

Figure 11b shows a SHAP waterfall plot for a specific instance with an actual concrete strength of 61.89 MPa. This plot provides a local explanation to illustrate how each feature contributes to the model's prediction for this particular instance. The base value, represented by E[f(X)], is the average prediction of the model across the entire dataset (32.489 MPa). Each bar in the plot represents a feature, and its length corresponds to the SHAP value to indicate the magnitude and direction of the feature's contribution to the final prediction. For this instance, the water–cement ratio of 0.3 has the largest positive contribution (+30.82), significantly increasing the prediction from the base value. The age of 28 days also contributes positively (+4.37), further increasing the predicted strength. Conversely, the absence of blast furnace slag (−3.85), fly ash (−1.75), and a moderate amount of superplasticizer (−0.711) contribute negatively, slightly lowering the prediction. The coarse aggregate–fine aggregate ratio has a small positive impact (+1.31). The final prediction, f(x) = 62.667 MPa, is the sum of the base value and all the individual feature contributions.
The partial dependence plots presented in Figure 12 highlight the influence of water–
cement ratio and age on the compressive strength of concrete, as predicted by the gradient
boosting model. Figure 12a displays a marked decrease in concrete compressive strength as
the water–cement ratio increases from around 0.3 to 1.25. Initially, the decline is substantial,
particularly between ratios of 0.3 and 0.75, which indicates that lower ratios significantly
enhance the concrete’s strength. Beyond a ratio of 0.75, the negative impact on strength
continues but becomes less pronounced, eventually less significant after a ratio of 1.0.
This suggests that maintaining a water–cement ratio below 0.75 is critical for optimal
concrete strength. The age of concrete (Figure 12b) shows a robust positive correlation
with its compressive strength. From day 0 to approximately 50 days, there is a sharp
increase in strength, which reflects the critical curing phase, in which concrete gains most
of its compressive strength. Beyond 50 days, the rate of increase in strength diminishes,
becoming more gradual up to 100 days. The step increase in strength at around 100 days
might indicate specific curing or environmental conditions affecting the concrete's long-term strength characteristics. The interaction plot (Figure 12c) elucidates how combinations of water–cement ratio and age impact concrete strength. At early ages (0–20 days) and lower water–cement ratios (0.3–0.5), the concrete strength is highest, which emphasizes the importance of both proper mixture ratios and sufficient curing time. As the age increases, even higher water–cement ratios (up to 1.5) show a less detrimental effect on the strength, particularly in concrete aged over 60 days. This interaction suggests a diminishing influence of the water–cement ratio on strength as the concrete matures.
Figure 12. (a) Partial dependence plot for water–cement ratio; (b) partial dependence plot for age; (c) partial dependence plot showing the combined influence of water–cement ratio and age (cooler colors (purple zones) indicate lower partial dependence values, warmer colors (greenish zones) indicate higher values).
4. Discussion

The findings of this study highlight the significant potential of machine learning models in accurately predicting and classifying the compressive strength of concrete based on its mix design parameters and curing age. The superior performance of the GBR underscores the effectiveness of ensemble methods in capturing the complex, non-linear relationships inherent in concrete materials. This discussion elaborates on the implications of these results, the insights gained from feature importance analyses, the challenges encountered, and the broader impact on the field of concrete technology.
learning models within the construction industry, where decisions have significant safety
and financial implications. By demonstrating that the model’s behavior aligns with domain
knowledge, stakeholders are more likely to adopt these data-driven approaches. The ability
to predict strength without extensive laboratory testing accelerates the design process and
enhances project efficiency.
It is also important to note that the strong influence of the water–cement ratio and
curing age in our analysis concurs with numerous prior investigations. For instance, Ding
et al. [5] and Ekanayake et al. [6] both identified age (or curing duration) as a dominant
factor in concrete strength evolution, while Alghrairi et al. [4] and Anjum et al. [11] em-
phasized the significant role of water content. Our SHAP-based interpretability analysis
(Section 3.4) parallels these findings and demonstrates that small changes in W/C ratio
lead to sizable shifts in predicted strength. Moreover, partial dependence plots revealed
synergy between W/C ratio and curing time, aligning with earlier studies that used SHAP
or feature-importance techniques for clarity [6,11]. As a result, our results substantiate
that data-driven ranking of variables (e.g., W/C ratio, age) resonates strongly with well-
established concrete fundamentals.
tion or integrated gradients is crucial for industry adoption, ensuring that complex models
remain transparent and trustworthy.
Furthermore, integrating predictive models into user-friendly decision support sys-
tems, such as software tools or mobile applications, and incorporating optimization al-
gorithms can facilitate practical use by practitioners, enabling automated mix design
suggestions tailored to specific project requirements. The adoption of machine learning
models in concrete technology also raises ethical and environmental considerations. Opti-
mizing mix designs for strength and cost must be balanced with sustainability goals, such
as reducing carbon emissions associated with cement production. Future models could
incorporate environmental impact metrics to support eco-friendly decision-making.
5. Conclusions
The present study demonstrated the effectiveness of machine learning algorithms,
especially ensemble techniques such as gradient boosting, in making precise predictions
and classifying the compressive strength of concrete based on mix design parameters and
curing duration. Using advanced feature importance analysis techniques, including SHAP
values and partial dependence plots, allowed us to delve into the details of how the input
variables interact with each other in these models to affect the predictions. These results
have shown the potential of machine learning models to enhance mix design optimization,
quality assurance, and fulfillment of engineering standards. SHAP analysis allowed a better
insight into feature contributions on both a global and local level, thus possibly increasing
model interpretability. The ability to predict strength without extensive laboratory testing
accelerates the design process, reduces costs, and promotes more efficient project timelines.
However, the study acknowledges certain limitations. While the dataset used is com-
prehensive, it may not capture all possible variations in raw materials, environmental
conditions, and construction practices across different regions. Exploring deep learning
approaches and integrating real-time monitoring data could uncover more complex relation-
ships and enhance the model’s robustness. Additionally, improving model interpretability
is essential for ensuring widespread adoption in the industry.
Supplementary Materials: The following supporting information, including the code used for
data preprocessing, feature engineering, model development, and analysis in this study, can be
downloaded at: https://github.com/mnikoopayan/Concrete-Compressive-Strength (accessed on 12
July 2024).
Author Contributions: Conceptualization, M.S.N.T. and Y.F.; methodology, M.S.N.T. and Y.F.; soft-
ware, M.S.N.T.; validation, M.S.N.T., Y.F. and M.M.; formal analysis, M.S.N.T. and Y.F.; investigation,
M.S.N.T.; resources, M.S.N.T. and M.M.; data curation, M.S.N.T. and M.M.; writing—original draft
preparation, M.S.N.T. and Y.F.; writing—review and editing, Y.F. and M.M.; visualization, M.S.N.T.
and Y.F.; supervision, Y.F.; project administration, Y.F.; funding acquisition, Y.F. All authors have read
and agreed to the published version of the manuscript.
Data Availability Statement: The data presented in this study are available in the UC Irvine Machine
Learning Repository at 10.24432/C5PK67.
References
1. Griffiths, S.; Sovacool, B.K.; Furszyfer Del Rio, D.D.; Foley, A.M.; Bazilian, M.D.; Kim, J.; Uratani, J.M. Decarbonizing the Cement
and Concrete Industry: A Systematic Review of Socio-Technical Systems, Technological Innovations, and Policy Options. Renew.
Sustain. Energy Rev. 2023, 180, 113291. [CrossRef]
2. Young, B.A.; Hall, A.; Pilon, L.; Gupta, P.; Sant, G. Can the Compressive Strength of Concrete Be Estimated from Knowledge of
the Mixture Proportions?: New Insights from Statistical Analysis and Machine Learning Methods. Cem. Concr. Res. 2019, 115,
379–388. [CrossRef]
3. Li, Z.; Yoon, J.; Zhang, R.; Rajabipour, F.; Srubar, W.V., III; Dabo, I.; Radlińska, A. Machine Learning in Concrete Science:
Applications, Challenges, and Best Practices. npj Comput. Mater. 2022, 8, 127. [CrossRef]
4. Alghrairi, N.S.; Aziz, F.N.; Rashid, S.A.; Mohamed, M.Z.; Ibrahim, A.M. Machine Learning-Based Compressive Strength
Estimation in Nanomaterial-Modified Lightweight Concrete. Open Eng. 2024, 14, 20220604. [CrossRef]
5. Ding, Y.; Wei, W.; Wang, J.; Wang, Y.; Shi, Y.; Mei, Z. Prediction of Compressive Strength and Feature Importance Analysis of Solid
Waste Alkali-Activated Cementitious Materials Based on Machine Learning. Constr. Build. Mater. 2023, 407, 133545. [CrossRef]
6. Ekanayake, I.U.; Meddage, D.P.P.; Rathnayake, U. A Novel Approach to Explain the Black-Box Nature of Machine Learning in
Compressive Strength Predictions of Concrete Using Shapley Additive Explanations (SHAP). Case Stud. Constr. Mater. 2022, 16,
e01059. [CrossRef]
7. Paudel, S.; Pudasaini, A.; Shrestha, R.K.; Kharel, E. Compressive Strength of Concrete Material Using Machine Learning
Techniques. Clean. Eng. Technol. 2023, 15, 100661. [CrossRef]
8. Song, H.; Ahmad, A.; Farooq, F.; Ostrowski, K.A.; Maślak, M.; Czarnecki, S.; Aslam, F. Predicting the Compressive Strength of
Concrete with Fly Ash Admixture Using Machine Learning Algorithms. Constr. Build. Mater. 2021, 308, 125021. [CrossRef]
9. Quan Tran, V.; Quoc Dang, V.; Si Ho, L. Evaluating Compressive Strength of Concrete Made with Recycled Concrete Aggregates
Using Machine Learning Approach. Constr. Build. Mater. 2022, 323, 126578. [CrossRef]
10. Ahmad, A.; Ahmad, W.; Chaiyasarn, K.; Ostrowski, K.A.; Aslam, F.; Zajdel, P.; Joyklad, P. Prediction of Geopolymer Concrete
Compressive Strength Using Novel Machine Learning Algorithms. Polymers 2021, 13, 3389. [CrossRef] [PubMed]
11. Anjum, M.; Khan, K.; Ahmad, W.; Ahmad, A.; Amin, M.N.; Nafees, A. Application of Ensemble Machine Learning Methods to
Estimate the Compressive Strength of Fiber-Reinforced Nano-Silica Modified Concrete. Polymers 2022, 14, 3906. [CrossRef]
12. Ullah, H.S.; Khushnood, R.A.; Farooq, F.; Ahmad, J.; Vatin, N.I.; Ewais, D.Y.Z. Prediction of Compressive Strength of Sustainable
Foam Concrete Using Individual and Ensemble Machine Learning Approaches. Materials 2022, 15, 3166. [CrossRef]
13. Kumar, P.; Pratap, B. Feature Engineering for Predicting Compressive Strength of High-Strength Concrete with Machine Learning
Models. Asian J. Civ. Eng. 2024, 25, 723–736. [CrossRef]
14. Nguyen, N.-H.; Abellán-García, J.; Lee, S.; Vo, T.P. From Machine Learning to Semi-Empirical Formulas for Estimating Compres-
sive Strength of Ultra-High Performance Concrete. Expert Syst. Appl. 2024, 237, 121456. [CrossRef]
15. Onyelowe, K.C.; Gnananandarao, T.; Ebid, A.M.; Mahdi, H.A.; Ghadikolaee, M.R.; Al-Ajamee, M. Evaluating the Compressive
Strength of Recycled Aggregate Concrete Using Novel Artificial Neural Network. Civ. Eng. J. 2022, 8, 1679–1693. [CrossRef]
16. Onyelowe, K.C.; Ebid, A.M.; Mahdi, H.A.; Riofrio, A.; Eidgahee, D.R.; Baykara, H.; Soleymani, A.; Kontoni, D.-P.N.; Shak-
eri, J.; Jahangir, H. Optimal Compressive Strength of RHA Ultra-High-Performance Lightweight Concrete (UHPLC) and Its
Environmental Performance Using Life Cycle Assessment. Civ. Eng. J. 2022, 8, 2391–2410. [CrossRef]
17. Onyelowe, K.C.; Kontoni, D.-P.N.; Ebid, A.M.; Dabbaghi, F.; Soleymani, A.; Jahangir, H.; Nehdi, M.L. Multi-Objective Optimization
of Sustainable Concrete Containing Fly Ash Based on Environmental and Mechanical Considerations. Buildings 2022, 12, 948.
[CrossRef]
18. ACI Committee 318; American Concrete Institute. Building Code Requirements for Structural Concrete (ACI 318-08) and Commentary;
American Concrete Institute: Farmington Hills, MI, USA, 2008; ISBN 978-0-87031-264-9.
19. Yeh, I.-C. Concrete Compressive Strength. UCI Machine Learning Repository, 2007. DOI: 10.24432/C5PK67.
20. Mckinney, W. Pandas: A Foundational Python Library for Data Analysis and Statistics. Python High Perform. Sci. Comput. 2011,
14, 1–9.
21. Vinutha, H.P.; Poornima, B.; Sagar, B.M. Detection of Outliers Using Interquartile Range Technique from Intrusion Dataset. In
Information and Decision Sciences, Proceedings of the 6th International Conference on FICTA, Bhubaneswar, India, 14–16 October 2017;
Satapathy, S.C., Tavares, J.M.R.S., Bhateja, V., Mohanty, J.R., Eds.; Springer: Singapore, 2018; pp. 511–518.
22. Tukey, J.W. Exploratory Data Analysis; Addison-Wesley Pub. Co.: Reading, MA, USA, 1977; ISBN 978-0-201-07616-5.
23. Hunter, J.D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 2007, 9, 90–95. [CrossRef]
24. American Concrete Institute. Building Code Requirements for Structural Concrete (ACI 318-19) and Commentary; American Concrete
Institute: Farmington Hills, MI, USA, 2019.
25. McKinney, W. Data Structures for Statistical Computing in Python. Proc. Python Sci. Conf. 2010, 445, 56–61.
26. Waskom, M.L. Seaborn: Statistical Data Visualization. J. Open Source Softw. 2021, 6, 3021. [CrossRef]
27. O'Brien, R.M. A Caution Regarding Rules of Thumb for Variance Inflation Factors. Qual. Quant. 2007, 41, 673–690. [CrossRef]
28. Hover, K.C. The Influence of Water on the Performance of Concrete. Constr. Build. Mater. 2011, 25, 3003–3013. [CrossRef]
29. Hashemi, M.; Shafigh, P.; Karim, M.R.B.; Atis, C.D. The Effect of Coarse to Fine Aggregate Ratio on the Fresh and Hardened
Properties of Roller-Compacted Concrete Pavement. Constr. Build. Mater. 2018, 169, 553–566. [CrossRef]
30. Iqbal Khan, M.; Abbass, W.; Alrubaidi, M.; Alqahtani, F.K. Optimization of the Fine to Coarse Aggregate Ratio for the Workability
and Mechanical Properties of High Strength Steel Fiber Reinforced Concretes. Materials 2020, 13, 5202. [CrossRef] [PubMed]
31. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
32. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.;
et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
33. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics; Springer: New York, NY,
USA, 2009; ISBN 978-0-387-84857-0.
34. Massey, F.J., Jr. The Kolmogorov-Smirnov Test for Goodness of Fit. J. Am. Stat. Assoc. 1951, 46, 68–78. [CrossRef]
35. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A System for Large-Scale Machine Learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
36. Chollet, F. Fchollet/Keras-Resources. Available online: https://github.com/fchollet/keras-resources (accessed on 20 November 2024).
37. Bayesian Optimization in Action. Available online: https://www.manning.com/books/bayesian-optimization-in-action
(accessed on 15 January 2025).
38. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. Adv. Neural Inf. Process.
Syst. 2012, 25. Available online: https://arxiv.org/abs/1206.2944 (accessed on 15 January 2025).
39. Taylor, K.E. Summarizing Multiple Aspects of Model Performance in a Single Diagram. J. Geophys. Res. Atmos. 2001, 106,
7183–7192. [CrossRef]
40. Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation. arXiv
2020, arXiv:2010.16061.
41. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
42. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874.
43. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.