Duration and Cost Variability of Construction Activities: An Empirical Study
P. Ballesteros-Pérez, Ph.D. 1; E. Sanz-Ablanedo, Ph.D. 2; R. Soetanto, Ph.D. 3;
Ma. C. González-Cruz, Ph.D. 4; G. D. Larsen, Ph.D. 5; and A. Cerezo-Narváez, Ph.D. 6
Abstract: The unique nature of construction projects can mean that construction activities often suffer from duration and cost variability.
Because this variability is unplanned, it can present a problem when attempting to complete a project on time and on budget. Various factors
causing this variability have been identified in the literature, but they predominantly refer to the nature and/or context of the whole project
rather than specific activities. In this paper, the order of magnitude of and correlation between activity duration and cost variability are analyzed
in 101 construction projects with over 5,000 activities. To do this, the first four moments (mean, standard deviation, skewness, and kurtosis) of
actual versus planned duration and cost (log) ratios are analyzed by project, phase of execution, and activity type. Results suggest that,
contrary to common wisdom, construction activities do not end late on average. Instead, the large variability in the activity duration is
the major factor causing significant project delays and cost overruns. The values of average activity duration and cost variability gathered
in this study will also serve as a reference for construction managers to improve future construction planning and project simulation studies
with more realistic data. DOI: 10.1061/(ASCE)CO.1943-7862.0001739. © 2019 American Society of Civil Engineers.
Author keywords: Scheduling; Activity variability; Merge event bias; Network topology; Project delays.
(1964), trying to find a universal distribution that fits all types of activities is a futile effort because each type of activity is unique. Furthermore, its context might also have a significant influence that is difficult, if not impossible, to parameterize mathematically.

Nonetheless, these difficulties should not be a deterrent to at least attempting to measure the average level of variability of construction activity durations and costs. As argued, this would be an extremely valuable input for future project duration and cost forecasting techniques, as well as providing powerful baseline information for enhancing project control and monitoring.

Hence, the present paper attempts to fill this research gap in the construction management literature by measuring the average level of activity duration and cost variability. It will also justify how and why, given this level of activity variability in common project networks, it is expected that most construction projects end late and go over budget. To achieve this, the actual/planned (log) ratios of many project and activity durations and costs will be analyzed. The correlation between activity durations and costs will also be studied. Finally, the most common network topologies (descriptors of what the project networks are like, that is, how activities are arranged and connected with each other) will be summarized, and the potential impact of activity variability on these networks will be described in detail.

The paper will be structured as follows. The “Background” section will provide an overview of the importance of the first four moments of the activity duration and costs impacting the final project duration and cost. This section will introduce the concept of merge event bias and describe how it may cause project delays and cost overruns depending on each project network topology. The “Materials and Methods” section will describe how a data set of 101 projects was classified according to different activity categories, and how their log actual versus planned duration and cost deviations were then analyzed activity by activity. The “Discussion” section will provide insights into what the numerical results mean and how they are connected to the project network topology in common construction projects. Finally, the “Conclusions” section will summarize the whole analysis, highlight the major contributions to the body of knowledge, state the study limitations, and propose future research continuations.

Background

There have been numerous studies analyzing delays and cost overruns in construction projects at project level (e.g., Hamzah et al. 2011; Keane and Caletka 2008; Mahamid et al. 2012; Ogunlana et al. 1996; Orangi et al. 2011; Senouci et al. 2016). Most studies have focused on either establishing the causes of delays and cost overruns and/or proposing some regression analyses to avoid slippages in the future. Generally, these studies have been aligned with a more reductionist perspective, seeking to emphasize a particular

the final project duration. In this regard, a recent study by Hajdu and Bokor (2014) concluded that the maximum project duration deviation when using alternative activity distributions was generally well below 10%. This finding resonated with observations from an earlier study on the limitations of PERT. MacCrimmon and Ryavec (1964) showed that, if triangular distributions for modeling activity durations had been chosen instead of beta distributions, the probabilistic project duration would have produced almost identical results.

The reason why the choice of a particular statistical distribution does not seem that relevant is that the third and fourth moments (skewness and kurtosis) are blurred very quickly in stochastic network analysis (SNA) (Hajdu and Bokor 2016), which is currently considered the most accurate approach to model project schedule networks (Ballesteros-Pérez 2017b). In SNA, activity durations and costs are modeled by statistical distributions (with or without correlation with each other). More precisely, distributions are summed when computing the total costs of activities or the total duration of activities arrayed in series. On the other hand, the maximum of distributions (instead of a sum) is calculated whenever one calculates the total duration of a set of activities placed in parallel. In either case, the third and fourth moments (skewness and kurtosis) have a minor influence on the resulting distribution of a path or project duration.

However, the first two moments (mean and variance, or alternatively, standard deviation) play a major role in the resulting distribution modeling the total project duration. When there is some correlation between durations and costs (virtually always in construction projects), they also have an indirect but still significant influence on the final project cost.

To sum up, when two or more distributions are convolved (summed for computing the project cost or the duration of activities in series), the resulting distribution, by the central limit theorem, quickly converges to a normal distribution. The mean and variance of this normal distribution correspond to the sum of means and variances, respectively, of the individual activity distributions. Therefore, the first two moments will mostly determine what the resulting distribution looks like.

When some activities are arranged in parallel and they all need to finish before the project can continue, the resulting distribution quickly converges to an extreme value distribution of maxima (normally a Fréchet or a Gumbel distribution) (Dodin and Sirvanci 1990). Again, the first two moments of the involved activity distributions will determine the location and scale of the resulting extreme value distribution. This phenomenon is commonly known as the merge event bias (Khamooshi and Cioffi 2013; Vanhoucke 2012), and it is indeed the major source of inaccuracy of all deterministic scheduling techniques.

Real construction project schedules (networks) generally involve many subsets of activities both arranged in parallel and in series. Hence, multiple convolutions (sums) and maxima of distributions need to be computed so that the final project duration and
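The sum-versus-maximum argument above can be illustrated with a minimal Monte Carlo sketch. It is not the authors' simulation: the lognormal activity durations, the mean of 10 and SD of 3 time units, and the network of five parallel paths with four activities in series each are all assumptions chosen only for illustration.

```python
import numpy as np

def simulate_network(n_parallel, n_series, mean=10.0, sd=3.0, n_runs=20000, seed=0):
    """Simulate a toy network: n_parallel paths, each a chain of n_series activities.

    All parameters are hypothetical; durations are drawn from a lognormal
    distribution whose mean and standard deviation match `mean` and `sd`.
    """
    rng = np.random.default_rng(seed)
    # Convert the requested mean/SD into the lognormal's underlying parameters.
    sigma2 = np.log(1.0 + (sd / mean) ** 2)
    mu = np.log(mean) - 0.5 * sigma2
    durations = rng.lognormal(mu, np.sqrt(sigma2), size=(n_runs, n_parallel, n_series))
    path_durations = durations.sum(axis=2)           # activities in series: sum
    project_durations = path_durations.max(axis=1)   # paths in parallel: maximum
    return path_durations, project_durations

paths, project = simulate_network(n_parallel=5, n_series=4)
deterministic = 4 * 10.0  # deterministic estimate: sum of the mean durations on a path

print(f"mean path duration:    {paths.mean():.2f}")
print(f"mean project duration: {project.mean():.2f}")
print(f"deterministic value:   {deterministic:.2f}")
```

Under these assumptions, every individual path finishes on time on average, yet the project (the maximum over the parallel paths) finishes later than the deterministic estimate on average, which is the merge event bias described in the text.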
order strength (Mastor 1970) and the complexity index (Bein et al. 1992). However, these only capture the project complexity and will not be used here.

Instead, this study will make use of four topology measures that describe the structure of an activity-on-the-node network, not just its complexity. These measures were initially proposed by Valadares Tavares et al. (1999) and later improved by Vanhoucke (2008). The four measures (also named indicators) used are: serial-parallel (SP) indicator, activity distribution (AD), length of arcs (LA) indicator, and topological float (TF), which will be explained in the following sections. All these indicators range between 0 and 1 and constitute simple measures describing to what extent the first two moments of the construction activities may condition the final duration and cost of a project.

Materials and Methods

In this section, the characteristics of the projects and activity data sets analyzed are described. Details of how the activity and project data were filtered and categorized under multiple levels of analysis are also presented. Next, the first four moments of activity durations and costs are reported and commented upon separately. Finally, the correlations between activity durations and costs are reported along with their statistical significance.

Projects and Activities Data Set

This research used two different project data sets. The first (and main) one is analyzed at both activity and project levels. The second data set contains project-level information (planned and actual project durations and costs) and will be used for illustrative purposes in the “Discussion” section.

In order to obtain representative values of the first four moments of the activity durations and costs, a large number of activities is necessary. In the first data set, 101 construction projects are analyzed, initially encompassing 5,697 activities. The analysis of all these activities can be found in the Supplemental Data.

Projects are classified in four types: building, civil engineering, industrial, or services. Building projects are mostly aimed at constructing a building or parts of a building. Civil engineering refers to infrastructure construction in general. Industrial projects refer to installations and/or electromechanical equipment. Services refer to projects with a significant operational and/or production component.

The 101-project data set was retrieved from a real projects data set originally developed by Batselier and Vanhoucke (2015) and Vanhoucke et al. (2016). Although the exact locations of those projects were not disclosed in most cases (due to a confidentiality clause with the information donors), it is known that most of them occurred in Belgium, the Netherlands, Italy, the US, and Azerbaijan.

and specific project information can also be found as individual project cards at OR-AS.be (2018).

Analysis Outline

This analysis focuses first on the activity-level deviations of durations and costs. Project-level data will also be analyzed subsequently, but from a point of view complementary to the activity analyses. The activity duration and cost deviations are calculated for each activity i in the first data set according to the following two expressions, respectively:

\[ \text{Activity duration deviation of activity } i = \log_{10}\left(\frac{\text{Actual duration of activity } i}{\text{Planned duration of activity } i}\right) \qquad (1) \]

\[ \text{Activity cost deviation of activity } i = \log_{10}\left(\frac{\text{Actual cost of activity } i}{\text{Planned cost of activity } i}\right) \qquad (2) \]

It is worth emphasizing that both of these ratios are expressed in logarithmic scale. This is important because ratios of variables that are always positive (e.g., durations and costs) are not symmetrical with respect to the value 1. The scale distortion of these ratios (they range between 0 and 1 when the denominator is bigger than the numerator, but between 1 and +∞ when the numerator is bigger than the denominator) creates an artificial positive skewness in the data distribution that can only be removed by taking the log ratios beforehand. Additionally, in log scale, the variable variances are additive, rather than multiplicative.

Therefore, this study will take the logarithm of every ratio before analyzing their activity duration and cost moments. Logarithms with base 10 were used because their orders of magnitude are a little more familiar, but any other base would have been possible.

Lastly, ratios in natural scale from 0 to 1 correspond to values from −∞ to 0 in any log scale, whereas ratios in natural scale from 1 to +∞ correspond to the (0, +∞) range. Both ranges also have a symmetrical correspondence with each other in log scale (e.g., ratios 1/2 and 2 in natural scale have the same values with opposite signs in log scale, that is, −0.301 and 0.301, respectively), which makes the interpretation of variability results easier.

Bearing this in mind, the next step consists of describing how the activities were grouped to analyze their ratios and produce robust results. The progressive classification levels can be found in Tables 2–5.

In every table, three levels of activity classifications are presented. Each level consists of three types of activities:
• Planned and performed (P&P): these activities correspond to activities that were initially planned and were also finally executed
C2013-02 Sewage plant Hove Civil engineering 1,236,603.66 1,146,444.38 403 408 175 12 38 0 62
C2013-03 Brussels Finance Tower Building 15,440,865.89 16,338,027.20 425 426 55 3 82 0 87
C2013-04 Kitchen Tower Anderlecht Building 2,113,684.00 2,512,524.00 333 453 244 47 59 0 63
C2013-05 PET packaging Service 874,554.28 874,554.28 521 632 28 14 69 0 80
C2013-06 Government office building Building 19,429,810.51 21,546,846.18 352 344 275 10 36 0 34
C2013-07 Family residence Building 180,476.47 175,030.65 170 174 46 40 44 3 25
C2013-08 Timber house Building 501,029.51 576,624.05 216 235 41 29 42 0 47
C2013-09 Urban development project Civil engineering 1,537,398.51 1,696,971.79 291 360 71 34 51 6 16
C2013-10 Town square Civil engineering 11,421,890.36 15,218,926.38 786 785 186 18 36 0 62
C2013-11 Recreation complex Building 5,480,518.91 5,451,028.00 359 277 159 27 44 0 32
C2013-12 Young cattle barn Building 818,439.99 879,853.17 115 188 27 64 77 6 54
C2013-13 Office finished works (1) Building 1,118,496.59 955,929.22 236 217 11 20 49 33 6
C2013-14 Office finished works (2) Building 85,847.89 75,468.30 80 88 9 62 80 66 47
C2013-15 Office finished works (3) Building 341,468.11 308,343.78 171 115 17 25 43 21 35
C2013-16 Office finished works (4) Building 248,203.92 198,567.00 196 108 7 33 62 0 75
C2013-17 Office finished works (5) Building 244,205.40 203,605.97 161 107 23 36 38 20 32
C2014-01 Mixed-use building Building 38,697,822.73 39,777,643.30 474 448 41 50 38 3 49
C2014-02 Playing cards Industrial 191,492.70 190,266.50 124 146 21 81 94 0 14
C2014-03 Organizational development Service 43,170.15 83,712.15 229 260 112 9 31 0 36
C2014-04 Compression Station Zelzate Industrial 62,385,597.58 65,526,930.04 522 844 24 95 100 0 100
C2014-05 Apartment building (1) Building 532,410.29 591,410.53 228 274 25 58 71 35 18
C2014-06 Apartment building (2) Building 3,486,375.47 3,599,114.11 547 611 29 57 75 46 15
C2014-07 Apartment building (3) Building 1,102,536.78 1,289,696.78 353 404 25 58 71 35 18
C2014-08 Apartment building (4) Building 1,992,222.09 2,380,299.86 233 275 39 44 29 11 14
C2015-01 Young cattle barn (2) Building 612,769.44 646,473.65 131 210 27 57 73 0 46
C2015-02 Railway station (1) Civil engineering 1,121,316.94 967,988.79 417 501 216 8 66 1 80
C2015-03 Industrial complex (1) Building 2,244,090.74 1,868,796.28 257 278 135 16 43 0 58
C2015-04 Apartment building (5) Building 2,750,938.00 2,590,796.73 160 205 56 27 37 0 57
C2015-06 Family residence (2) Building 143,673.20 186,107.00 260 290 184 18 0 30 38
C2015-07 Industrial complex (2) Building 5,999,600.00 5,414,544.00 297 313 138 27 38 0 49
C2015-08 Garden center Building 467,297.21 461,900.17 191 186 186 14 52 0 79
C2015-09 Railway station (2) Civil engineering 1,457,424.00 2,145,682.26 354 569 340 4 48 0 75
C2015-10 Tax return system (1) Service 18,990.00 8,010.00 85 85 15 10 82 23 21
C2015-11 Staff authorized system Service 14,400.00 9,105.00 55 55 7 25 66 0 52
C2015-12 Premium payment system Service 132,570.00 58,410.00 184 184 35 19 63 9 61
C2015-13 Broker Access Convection System Service 12,735.00 9,990.00 117 117 16 19 60 7 51
C2015-14 Superior Pensions Database Service 34,260.00 18,285.00 124 124 17 17 55 3 50
C2015-15 FACTA System Service 11,700.00 7,035.00 57 57 13 22 57 8 18
C2015-16 Generic document output system Service 64,620.00 64,125.00 270 270 22 10 61 12 26
C2015-17 Insurance bundling system Service 281,430.00 281,070.00 208 236 86 6 77 8 41
C2015-18 Tax return system (2) Service 39,450.00 25,380.00 128 128 15 10 66 16 11
C2015-19 Receipt number system Service 43,800.00 37,530.00 182 182 20 21 46 8 31
C2015-20 Policy numbering system Service 12,645.00 11,100.00 171 161 6 20 62 20 13
C2015-21 Investment product (1) Service 4,020.00 3,240.00 37 37 12 18 35 2 36
C2015-22 Risk profile questionnaire Service 29,880.00 17,400.00 151 151 22 16 70 9 40
C2015-23 Investment product (2) Industrial 46,920.00 32,805.00 122 120 33 17 53 5 39
C2015-24 CRM system Service 44,130.00 36,870.00 233 233 21 7 59 7 29
C2015-25 Beer tasting Service 1,210.00 1,780.00 14 14 18 16 40 21 19
C2015-26 Debt collection system Service 458,112.37 512,546.15 148 154 214 9 43 0 61
C2015-27 Railway station Antwerp Building 22,703.52 25,313.12 68 81 18 23 40 −2 54
C2015-28 Website tennis Vlaanderen Service 219,275.00 382,475.00 201 212 20 15 54 0 67
C2015-29 Fire station Building 1,874,496.82 1,887,087.25 284 298 204 48 34 0 41
C2015-30 Social apartments Ypres (1) Building 440,940.89 440,940.89 244 254 40 25 51 −1 76
C2015-31 Social apts Ypres (2) Building 1,310,723.46 1,282,185.98 271 364 29 32 49 23 43
C2015-32 Social apts Ypres (3) Building 2,509,031.42 2,509,031.42 358 265 48 38 63 3 59
C2015-33 IJzertoren Memorial Square Civil engineering 214,417.71 224,789.67 50 94 12 63 57 0 14
in the projects analyzed. These are the most frequent and the only ones that are considered in the analysis.
• Unplanned but performed (UBP): these activities correspond to activities that were not initially planned but that were deemed necessary and had to be eventually carried out. These activities were removed from the analysis because their ratios converged to +∞ (because the planned values in the denominators equal 0), and because most of the time they come from planning mistakes or omissions.
• Planned but not performed (PBNP): these activities correspond to activities that were initially planned, but that were not executed in the end. These activity ratios would equal zero in natural scale but their logarithmic values would converge to −∞. They also represent bad estimates of the planned schedule like UBP activities; hence, they were also removed from the analysis.

Table 2. Total number of activities analyzed: Level 0 (all activities from all projects)
Planned and performed    Unplanned but performed    Planned but not performed
5,289                    279                        129
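The log ratios of Eqs. (1) and (2) and the three activity types listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the small list of (planned, actual) values are hypothetical and do not come from the data set.

```python
import math

def log_ratio(actual, planned):
    """Eq. (1)/(2): log10(actual / planned); defined only when both values are positive."""
    return math.log10(actual / planned)

def classify(planned, actual):
    """Assign an activity to one of the three types described in the text."""
    if planned > 0 and actual > 0:
        return "P&P"    # planned and performed: kept in the analysis
    if planned == 0 and actual > 0:
        return "UBP"    # unplanned but performed: ratio would tend to +infinity
    if planned > 0 and actual == 0:
        return "PBNP"   # planned but not performed: log ratio would tend to -infinity
    return "undefined"

# Hypothetical (planned, actual) durations, for illustration only.
activities = [(10, 20), (20, 10), (0, 5), (8, 0), (12, 12)]

labels = [classify(p, a) for p, a in activities]
kept = [log_ratio(a, p) for p, a in activities if classify(p, a) == "P&P"]

print(labels)                        # one label per activity
print([round(v, 3) for v in kept])   # doubling and halving give +0.301 and -0.301
```

The doubled and halved activities yield log ratios of equal magnitude and opposite sign, which is the symmetry that motivates working in log scale.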
Concerning activity grouping, four levels of analysis (0–3) were considered:
• Level 0 (Table 2) contains all activities analyzed from all projects. This allows drawing general average conclusions without paying attention to the proportions or types of those activities.
• Level 1 (Table 3). Here, activities are classified under the same four types of projects stated in Table 1 (building, civil engineering, industrial, and services). As expected, this level allows analyzing how the activity duration and cost deviations differ by (generic) types of projects. Some group average and dispersion results of activity durations and costs are also included for reference in the right columns.
• Level 2 (Table 4). Within the previous four project type categories, activities are further classified into three standard phases of every project life cycle according to the project management body of knowledge (PMBoK): planning, execution, and closure (Project Management Institute 2017). Classifying activities into these three categories is straightforward with the activity descriptions available in almost all projects. The fourth phase considered by the PMBoK (monitoring and control) is not relevant for this analysis and therefore not considered.
• Level 3 (Table 5). For the execution phase of building and civil engineering projects only, activities are further classified into five generic groups, called here activity types (auxiliary works, substructure, superstructure, specialized works, and facilities). These are also common and relatively straightforward groups of activities in most construction projects. A more detailed description of the scope of each group is given by Chudley and Greeno (2016).

Level 3 allowed classifying activities into one last level right above the nature of the activity itself. Activities in this level were classified mostly thanks to the descriptions of the project summary activities (that were indeed not used for anything else in the analysis). Finally, as highlighted at the beginning of Level 3’s description, only activities from the execution phase of building and civil engineering projects were used. This is due to the number of execution activities in industrial projects being considered too low. It also reflects that execution activities belonging to services projects, despite being higher in number, were found to be too heterogeneous. The latter made it hard to classify these activities within similar self-contained categories (services projects are indeed much more varied regarding the nature of their activities).

Activity Duration Results

The first four moments (average, standard deviation, skewness, and kurtosis) of the activities' log ratios were analyzed according to the four levels described in Tables 2–5. Table 6 presents the results for the activity duration log ratios [log10 (actual/planned)].

For each case and level analyzed, five numerical values are displayed: n (sample size, that is, the number of activities used to calculate the four moments), and the four moment values (in logarithmic scale). However, due to the major relevance of the first two moments (average and standard deviation), these two have also been included in natural scale within parentheses right below their respective logarithmic values. Values in natural scale are expected to help the reader to better grasp the order of magnitude of these moments. With this information, Table 6 is self-explanatory.

The readings and details in this table are numerous, so attention is given here to the most relevant findings.

Concerning averages, it is striking to observe how most values remain very close to 0 (in log values) or 1 (in natural values). Some exceptions may be services projects and the planning-phase activities (Level 2) from building and civil engineering projects. Yet, in the latter, average ratio values remain close to 5% (in log values) or 11% (in natural values). Overall, as these log ratios are so close to zero, this suggests that construction activities do not end late (on average). This may be an unexpected finding because the easiest explanation for projects ending late would be that their activities ended late on average. This result seems to suggest the problem lies somewhere else.

Concerning the standard deviation (SD) values, results are very different. SD, by definition, can only be positive, but it is quite clear that, unlike the averages, SDs are not close to zero. Instead, with a few exceptions, SD values are almost always above
Level  Project type       Phase      Activity type      n      Average (natural)  SD (natural)   Skewness  Kurtosis
0      -                  -          -                  5,289  0.010 (1.023)      0.19 (1.56)    0.91      9.90
1      Building           -          -                  2,894  0.004 (1.009)      0.15 (1.43)    −0.36     9.88
2      Building           Planning   -                  49     0.035 (1.083)      0.21 (1.62)    1.51      8.13
2      Building           Execution  -                  2,810  0.003 (1.007)      0.15 (1.42)    −0.46     9.92
3      Building           Execution  Auxiliary works    139    0.017 (1.040)      0.16 (1.46)    0.36      5.54
3      Building           Execution  Substructure       171    0.035 (1.083)      0.14 (1.39)    1.89      8.38
3      Building           Execution  Superstructure     654    −0.018 (0.960)     0.16 (1.45)    −0.98     9.60
3      Building           Execution  Specialized works  1,272  0.004 (1.010)      0.15 (1.42)    −0.87     10.02
3      Building           Execution  Facilities         574    0.011 (1.026)      0.14 (1.39)    0.53      11.71
2      Building           Closure    -                  35     0.022 (1.052)      0.16 (1.45)    1.33      3.91
1      Civil engineering  -          -                  1,092  −0.008 (0.982)     0.20 (1.58)    0.53      9.61
2      Civil engineering  Planning   -                  38     0.052 (1.126)      0.18 (1.53)    2.38      12.88
2      Civil engineering  Execution  -                  1,034  −0.010 (0.977)     0.20 (1.58)    0.49      9.36
3      Civil engineering  Execution  Auxiliary works    207    0.013 (1.030)      0.13 (1.36)    1.75      9.43
3      Civil engineering  Execution  Substructure       229    −0.005 (0.990)     0.18 (1.53)    −0.39     8.57
3      Civil engineering  Execution  Superstructure     257    −0.030 (0.934)     0.20 (1.57)    0.90      10.73
3      Civil engineering  Execution  Specialized works  264    −0.012 (0.972)     0.24 (1.72)    0.28      5.81
3      Civil engineering  Execution  Facilities         77     −0.018 (0.959)     0.26 (1.82)    1.09      10.65
2      Civil engineering  Closure    -                  20     −0.011 (0.975)     0.05 (1.12)    −4.47     20.00
Note: Level 3 (activity type) is not reported for the planning and closure phases (insufficient data sample). Values in parentheses are in natural scale.
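For reference, the four moments reported per group, together with the natural-scale (antilog) values shown in parentheses, could be computed along the following lines. This is a sketch under assumptions: the paper does not state which estimators were used, so the sample SD (ddof = 1) and the moment-based, non-excess kurtosis (equal to 3 for a normal distribution) are choices made here, and the input data are synthetic.

```python
import numpy as np

def four_moments(log_ratios):
    """Return n, average, SD, skewness, and (non-excess) kurtosis of log10 ratios.

    Estimator choices (sample SD, moment-based skewness/kurtosis) are assumptions;
    natural-scale values are obtained as 10**average and 10**SD.
    """
    x = np.asarray(log_ratios, dtype=float)
    average = x.mean()
    sd = x.std(ddof=1)
    z = (x - average) / x.std(ddof=0)   # standardized values for higher moments
    return {
        "n": int(x.size),
        "average": average,
        "sd": sd,
        "skewness": float(np.mean(z ** 3)),
        "kurtosis": float(np.mean(z ** 4)),   # a normal distribution gives about 3
        "average_natural": 10.0 ** average,
        "sd_natural": 10.0 ** sd,
    }

# Synthetic, normally distributed log ratios (illustration only, not the real data).
rng = np.random.default_rng(1)
synthetic = rng.normal(loc=0.010, scale=0.19, size=5289)
moments = four_moments(synthetic)

for name, value in moments.items():
    print(f"{name:16s} {value:.3f}")
```

With normally distributed input the kurtosis stays near 3, so values close to 10, as reported for most groups, indicate a far more peaked and heavy-tailed distribution than the normal.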
and, combined with averages also close to zero, one can conclude that there is approximately the same probability of finding early activities as tardy activities.

Concerning kurtosis, the picture is very different. Values are generally well above 3, which would describe the kurtosis corresponding to the normal distribution. This result means that log ratio duration values resemble a peaked distribution with heavy tails. In other words, the majority of the actual durations are not close to their planned values. As stated previously, many other readings may be extracted from Table 6. However, for the sake of clarity, only the most relevant high-level interpretations are presented.

Activity Cost Results

Table 7 presents the first four moments of the activity actual versus planned cost log ratios. Parentheses contain the antilogarithmic (natural scale) values of the first two moments as well. Table 7 values differ substantially from those found in Table 6.

Concerning average values, most of them are clearly positive and generally above 0.01 (in log values) or alternatively above 3% (in natural scale). A clear exception may be industrial projects, whose average is negative. This may be because industrial projects are frequently composed of electromechanical equipment whose procurement prices are relatively easier to estimate accurately ex ante than those of other types of projects. Additionally, civil engineering and services projects are among the ones whose activities tend to suffer from more cost overruns. This may be due to civil engineering projects being (generally) less standard than buildings, whose average log ratios remain closer to 0. On the other hand, services projects, as indicated in Table 6, suffered from more delays on average than other types of projects. Because these types of projects are frequently more labor-intensive, it seems logical that those extra durations are correlated with these extra costs.

Concerning SD, variability is even more evident than in the case of duration log ratios. On Level 0, one can appreciate how the average activity SD reaches 0.25 (78% of variability in natural scale). On Level 1, no project type has a variability below 0.16 (46% of variability in the case of building projects), and two of them (civil engineering and services) remain above 0.30 (>100% of variability). SDs on Levels 2 and 3 offer similar readings but with a wider range of values.

Concerning skewness, cost log ratios are more varied than their duration counterparts. In general, when average values are negative, the skewness values are also predominantly negative. Similarly, when the average costs are positive, the cost distribution is also positively skewed.

Concerning kurtosis, values are also much higher than their duration-ratio counterparts. This would be indicative again that most activity actual costs substantially differ from their planned values (a high proportion of the actual costs tend to be substantially different from their planned costs).

cost variation. For this aim, all activities were grouped under the very same levels previously described, and linear correlations were calculated between the duration log ratios and the cost log ratios. A summary of this analysis is presented in Table 8. Spearman’s rho and Kendall’s tau nonlinear (rank) correlations were also tested. However, they only very marginally improved the linear correlation results and were considered not worth including because they barely departed from the linear case given in Table 8.

Table 8 is divided into two major blocks. The upper block is devoted to activity-level correlations. The lower block is reserved for project-level correlations. For each correlation, the table specifies how many data points were used (column labeled n), Pearson’s correlation coefficient (R), and the coefficient of determination (R²), along with the gradient (slope column) and intercept of the linear regression lines. Statistically highly significant correlations have been identified separately for R² tests (with Snedecor’s F distribution) and slope tests (with Student’s t distribution). Significant statistical correlations have also been indicated.

In the case of activity-level correlations, almost all correlation values are significant. This means that values of R² are very unlikely to have happened by chance. This is not the case for project-level correlations, where, apart from the Level 0 of analysis (all 101 projects grouped together), R² values have not been found to be statistically significant. This means one cannot count on the reliability of project-level duration-cost correlations, and hence they will be ignored moving forward.

Correlations at the activity level do offer very interesting results. R and R² evidence weak to moderate correlations (R² ranging between 0.10 and 0.62), but the slopes of such correlations are rather close to 0.50 in some levels and almost all of them are significant.

More precisely, when there is no differentiation among activities (Level 0), the slope is as high as 0.704. This means that a 100% activity duration extension (in log scale) would cause a 70.4% cost increment on that activity. This is quite a high gradient.

Differentiating by project type (Level 1), the slopes become more informative. Building and civil engineering projects show a gradient close to 0.5, that is, every 100% of duration increment is likely to cause a 50% cost increment for that activity. The other two types of projects have no statistically significant slopes; however, it seems clear that industrial projects (probably due to the higher component of electromechanical equipment in the project budget) have lower slopes. On the contrary, services projects, being more labor-intensive, have higher slopes.

Results by project phase (Level 2) seem more homogeneous. However, only the execution activities’ slope is statistically significant. This level of correlation seems to replicate the results previously provided for Level 0.

Results at Level 3 are again not that heterogeneous, and they all are statistically significant. However, there is nothing remarkable that has not been highlighted previously.
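The quantities described for Table 8 (n, Pearson’s R, R², slope, and intercept of the regression of cost log ratios on duration log ratios) can be obtained with a short sketch such as the one below. It is not the authors' procedure: the function name is hypothetical, the data are synthetic with an assumed true slope of 0.5, and the F and t significance tests mentioned in the text are deliberately left out.

```python
import numpy as np

def duration_cost_regression(duration_log, cost_log):
    """Least-squares line cost_log = slope * duration_log + intercept, plus R and R2."""
    x = np.asarray(duration_log, dtype=float)
    y = np.asarray(cost_log, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 (linear) fit
    r = np.corrcoef(x, y)[0, 1]              # Pearson's correlation coefficient
    return {"n": int(x.size), "R": float(r), "R2": float(r ** 2),
            "slope": float(slope), "intercept": float(intercept)}

# Synthetic log ratios (assumed values, for illustration only).
rng = np.random.default_rng(2)
duration_log = rng.normal(0.0, 0.19, size=5000)
cost_log = 0.5 * duration_log + rng.normal(0.0, 0.15, size=5000)

result = duration_cost_regression(duration_log, cost_log)
for name, value in result.items():
    print(f"{name:10s} {value:.3f}")
```

In this synthetic case the fitted slope recovers the assumed gradient of about 0.5 even though R² is only moderate, which mirrors the pattern reported for the activity-level correlations: a weak to moderate fit can coexist with a well-determined slope when n is large.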
Project type        Phase       Activity type      n      Average           SD            Skewness   Kurtosis
(all activities)                                   5289    0.031 (1.074)    0.25 (1.78)     2.49      15.56
Building                                           2894    0.015 (1.035)    0.16 (1.46)     2.02      25.27
Building            Planning    (all)                49   −0.002 (0.996)    0.19 (1.56)    −1.66      10.45
Building            Planning    Insufficient data sample
Building            Execution   (all)              2810    0.015 (1.035)    0.16 (1.46)     2.12      25.91
Building            Execution   Auxiliary works     139    0.027 (1.065)    0.11 (1.29)     2.73      14.82
Building            Execution   Substructure        171    0.014 (1.034)    0.10 (1.26)    −0.21       8.37
Building            Execution   Superstructure      654    0.010 (1.023)    0.10 (1.26)     0.01      10.14
Building            Execution   Specialist works   1272    0.014 (1.034)    0.21 (1.62)     2.12      18.98
Building            Execution   Facilities          574    0.020 (1.046)    0.12 (1.33)     0.53      13.85
Building            Closure     (all)                35    0.041 (1.098)    0.16 (1.43)     2.01       6.60
Building            Closure     Insufficient data sample
Civil engineering                                  1092    0.057 (1.139)    0.30 (2.01)     1.78       6.28
Civil engineering   Planning    (all)                38    0.322 (2.099)    0.43 (2.71)     0.99      −0.75
Civil engineering   Planning    Insufficient data sample
Civil engineering   Execution   (all)              1034    0.048 (1.116)    0.30 (1.98)     1.77       6.87
Civil engineering   Execution   Auxiliary works     207    0.059 (1.147)    0.32 (2.08)     2.89      11.31
Civil engineering   Execution   Substructure        229    0.057 (1.140)    0.30 (1.98)     0.63       2.26
Civil engineering   Execution   Superstructure      257    0.057 (1.141)    0.32 (2.11)     1.40       3.11
Civil engineering   Execution   Specialist works    264    0.016 (1.038)    0.24 (1.74)     1.93      13.70
Civil engineering   Execution   Facilities           77    0.067 (1.166)    0.31 (2.04)     2.03       7.66
Civil engineering   Closure     (all)                20    0.011 (1.026)    0.01 (1.02)    −0.95      −1.24
Civil engineering   Closure     Insufficient data sample
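The parenthesized figures in the table above appear to be the log values converted back to plain actual/planned ratios. Assuming base-10 logarithms (an assumption that reproduces the printed figures to within rounding), the conversion can be sketched as follows. The function name and the "All activities" label are hypothetical.

```python
def to_ratio(log10_value):
    """Convert a base-10 log ratio back to a plain actual/planned ratio.

    Base 10 is an assumption inferred from the printed figures,
    not something stated in this section.
    """
    return 10.0 ** log10_value


# (label, average log ratio, SD of log ratios) copied from the table above.
rows = [
    ("All activities", 0.031, 0.25),
    ("Building", 0.015, 0.16),
    ("Civil engineering", 0.057, 0.30),
]
for label, avg, sd in rows:
    print(f"{label}: average ratio {to_ratio(avg):.3f}, SD factor {to_ratio(sd):.2f}")
```

Under this reading, an average cost log ratio of 0.031 corresponds to a ratio of about 1.074, that is, actual costs roughly 7% above plan.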
Fig. 1. Duration and cost overrun probability distribution of 746 road construction projects from the Florida Department of Transportation.
5,289 activities, plus another set with 746 projects, have been used.
The first contribution of this study is providing construction managers with a first, yet rather complete, set of
actual-versus-planned average activity durations and costs deviations with application in multiple contexts (project
types, execution phases, and types of activity). From now on, a construction manager will be able to more realistically
(thus accurately) anticipate how likely and how much the activities in the project schedule will vary, that is, last or
cost something different. This might potentially improve the quality and robustness of all construction schedules, for
example allowing them to feed more advanced (nondeterministic) scheduling and simulation tools with more representative
data. These techniques generally need a substantial amount of information from previous similar projects, which is
rarely available. With the set of moments provided here, these techniques will be able to resort to average values for
their activity durations and cost distribution parameters depending on the type and/or execution phase of the project.
These distributions will also be able to assume nonindependence between the stochastically generated activity durations
and costs values (thanks to the set of duration-cost correlation values also published in this study). This is expected
to enhance future construction project monitoring and control as well as actual project duration and cost forecasting
accuracy.
The analysis developed has also provided some interesting insights from its numerical perspective. One of the most
relevant is that it has been shown that construction activities do not end late on average. Instead, their high level
of variability (around 60% of its average duration) is the key factor eventually causing project-level delays. Such
high levels of activity variability exacerbate the merge event bias, a phenomenon by which whenever two or more
schedule paths converge into a single one, the average completion times exceed the maximum average path durations.
Actual activity costs, on the other hand, do tend to be higher than planned (around 7%). This cannot be the result of
price adjustments or inflation because hardly any project lasted longer than a year. Instead, the major project-level
cost overruns are expected to occur as a consequence of delayed start of activities located nearer the end of the
project. It has been demonstrated how most duration-cost correlation factors range within 0.40 and 0.70. The latter
would mean that activities that cannot start until their predecessors have finished would start to incur costs before
their actual execution.
Many other interpretations can arise from the numerical results of the four moments describing activity duration and
cost variability that refer to specific types of projects, phases of execution, or activity types that have not been
recounted here. Tables 6 and 7 give more information for such a purpose.
A limitation of this study is mostly connected to the composition and sample size of the construction projects
analyzed. A total of 101 projects have been used here with a varied composition. However, this sample size could have
been bigger. It must be clarified, however, that accessing actual duration and cost information is
ities add to the total project variability (beyond the activity duration and cost variability analyzed here). In the
present analysis, however, there were only 279 UBP + 129 PBU = 408 activities out of the initial 5,697 (7% in total).
Hence, although the authors believe the influence of UBP and PBU activities needs to be duly investigated, the present
analysis (with 93% of the activities) can still be considered representative enough to draw valid conclusions.
Additionally, it is also expected that some degree of cancellation will occur among those 7% of activities (because
frequently, new activities replace others that are not eventually performed).
In the same vein, there are many potential future research continuations following this research. Again, this study
might be extended to analyze other types of projects and/or other more specific types of activities (maybe at
trade-level: concrete, steel, asphalt, earthworks, for example). The network topologies for other types of projects may
also be studied to anticipate to what extent current levels of activity variability might impact their final schedules.
The statistical distribution of activity (duration and cost) variability may also be analyzed. This was not possible at
the general activity level, as discussed in this paper, but it should be possible for activities at their trade level.
A last conclusion derived from this research is that activity duration variability is the actual foe in project
monitoring and control. This may not sound new to Lean Construction researchers and practitioners. However, this
research has provided compelling empirical evidence suggesting that activity variability really must be taken more
seriously. There is a need to develop more techniques that can effectively handle/restrain this variability. Value
stream mapping and last planner have been some attempts to address this problem, but more are needed. This will open
the door to new and more effective approaches for tackling the widespread phenomenon of construction projects ending
late.
Data Availability Statement
All data generated or analyzed during the study are included in the published article or Supplemental Data.
Acknowledgments
The first author acknowledges the Spanish Ministry of Science, Innovation and Universities for his Ramon y Cajal
contract (RYC-2017-22222) cofunded by the European Social Fund. This work was also supported by the second author’s
“Estancias de movilidad en el extranjero José Castillejo para jóvenes doctores, 2017 (Grant Ref. CAS17/00488)” and the
fourth author’s “Estancias de profesores e investigadores senior en centros extranjeros Salvador de Madariaga 2018
(Grant Ref. PRX18/00381),” both from the Spanish Ministry of Science, Innovation and Universities. The first