Multiple-Parameter Coupling Metrics for Layered Component-Based Software
Abstract
Coupling represents the degree of interdependence between two software components.
Understanding software dependency is directly related to improving software
understandability, maintainability, and reusability. In this paper, we analyze the difference
between component coupling and component dependency, introduce a two-parameter
component coupling metric and a three-parameter component dependency metric. An
important parameter in both these metrics is coupling distance, which represents the
relevance of two coupled components. These metrics are applicable to layered
component-based software. These metrics can be used to represent the dependencies
induced by all types of software coupling. We show how to determine coupling and
dependency of all scales of software components using these metrics. These metrics are
then applied to Apache HTTP, an open-source web server. The study shows that coupling
distance is related to the number of modifications of a component, which is an important
indicator of component fault rate, stability and subsequently, component complexity.
1 Introduction
Dependencies between software components are determined not only by the type of
coupling between the components, but also by the relevance of the two components.
Although the idea of interaction locality (increasing the coupling of relevant components
and decreasing the coupling of irrelevant components) is widespread and longstanding, it
has not been formalized and thoroughly studied. In this paper, we consider the relevance
(signified by the coupling distance measure) between two components as a factor that
affects the dependencies between them and propose two multiple-parameter coupling
metrics for layered component-based software systems.
The remainder of the paper is organized as follows: Sect. 2 reviews software coupling,
interaction locality, and coupling metrics. Section 3 describes the representation of
component dependency. We describe layered component structure in Sect. 4. Section 5
presents our coupling and dependency metrics. In Sect. 6, we show how to determine
component dependency for various kinds of coupling. Section 7 presents our application
studies on Apache HTTP. The conclusions, threats to validity, and future work appear in
Sect. 8.
Table 1 lists the definitions of coupling between two modules, the smallest scale
components. Usually, coupling between modules of two large-scale software components
is also used to represent the large scale component coupling (Bruegge and Dutoit 2004).
For example, if component C1 contains module A and module B , component C2 contains
module E and module F , and module A is parameter coupled to module E , we can say
component C1 is parameter coupled to component C2 .
It has been observed that most of the complex systems in the world, from physical
systems such as atoms and stellar galaxies to social systems such as organizations and
governments, are modular and hierarchically structured. A large system may consist of
subsystems, which in turn consist of subsystems, and so on, through several multiresolutional
layers. The interactions between subsystems tend to decrease as we go upward in the
hierarchy. This is called interaction locality (Simon 1969). Generally speaking,
interaction locality can minimize the energy for the system to operate and accordingly
stabilize the system. In software systems, interaction locality is expressed via a widely
accepted design principle: increasing the coupling of relevant components and
decreasing the coupling of irrelevant components.
Interaction locality should not be used in isolation. Instead, it should be used
together with two other design principles, modularity and hierarchy (Yu and Ramaswamy
2007). Design modularity and hierarchy mean the decomposition of the software system
into different layers of components in order to separate concerns and reduce system
complexity. Interaction locality is then applied to assign interactions between these
components. Consider an ideal system that consists of components C1 and C2 , which in
turn contain modules, m1 through m4 . Figure 1a depicts the modular and hierarchical
structure of the system. Figure 1b depicts the interaction locality: high interactions exist
between relevant (lower level) modules and low interactions exist between irrelevant
(higher level) components.
Fig. 1 An ideal system with (a) hierarchical structure; and (b) interaction locality (Yu and
Ramaswamy 2007)
On the one hand, because different types of coupling have different effects on software
complexity, we can use the definitions of coupling in Table 1 to compare the degrees of
dependency between software components. Considerable research has been done in this
area to derive software dependency metrics, including (Briand et al. 1999; Chidamber
and Kemerer 1994; Basili et al. 1996; Card and Glass 1990). In these studies, software
dependency and complexity metrics are proposed and validated for both structured
software and object-oriented software. These metrics consider different types of
interactions between classes/modules, methods/functions, and attributes/variables.
On the other hand, the interaction locality design principle has also been widely accepted.
For example, Basili et al. (1996) validated the speculation made by Chidamber and
Kemerer (1994) that deep inheritance is more of a complication than shallow inheritance.
Lüer et al. (2001) proposed to increase component distance (reduce component
interactions) to increase component evolvability. Yu and Ramaswamy (2007) presented a
method to verify modularity, hierarchy, and interaction locality of a software design.
However, to the best of our knowledge, interaction locality has not been formalized and
generally used in the derivation of software metrics.
In our previous work (Yu 2007), we extended the concept of coupling and defined
changes made to have a direct effect on the behavior of (the word “direct”
means that the dependency is not via some third component). Component is called
With this notation, the dependency of a component can be represented with all its
dependent components. Here we utilize two notations to represent the dependency of one
component. The first is a graphical representation. This notation was first introduced in
While there have been several definitions of software components (Brown 1997; Leavens
and Sitaraman 2000), in this paper, we consider a component from a logical perspective
and define it as an integral logical constitute (Mei et al. 2001). According to this
definition, all artifacts (classes, programs, packages, and so on) can be considered as
components. In a software system, there are two types of components: primitive
component and compound component. A primitive component is defined as the smallest
manageable unit (class in object-oriented software and module in structured software). A
compound component is composed of primitive components and/or other compound
components. Therefore, a software system can be represented by a component tree: the
leaf nodes are primitive components and the internal nodes are compound components,
with the primitive components at height 1. The height of a compound component can be
recursively defined as one plus the maximum height of its descendent components.
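The recursive height definition can be illustrated with a minimal Python sketch. The nested-dictionary encoding, the function name `height`, and the component names are assumptions made for illustration; they are not taken from the paper's figures.

```python
# A component tree as nested dicts: a primitive component maps to an empty
# dict, a compound component maps to its child components.
# All names below are hypothetical.
system = {
    "CC3": {
        "CC1": {"PC1": {}, "PC2": {}},
        "CC2": {"PC3": {}},
        "PC4": {},
    }
}


def height(children):
    """Height of a component, given the dict of its child components.

    A primitive component (no children) has height 1; a compound component
    has height one plus the maximum height of its children.
    """
    if not children:
        return 1
    return 1 + max(height(grandchildren) for grandchildren in children.values())


heights = {
    "PC1": height(system["CC3"]["CC1"]["PC1"]),
    "CC1": height(system["CC3"]["CC1"]),
    "CC3": height(system["CC3"]),
}
print(heights)
```

Taking the maximum over the children is enough here, because no descendant can be higher than the child that contains it.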
As mentioned in Sect. 2, the couplings defined in Table 1 reflect the interactions of two
software components. However, they do not reflect the structure of the product and
cannot accurately represent the dependency of compound components. As in Fig. 5, PC2 is
dependent on PC1, PC3, and PC4. Now suppose that these couplings are of the same
type (say, parameter coupling). If we consider the dependency of PC2 itself, there is no
difference among these couplings. However, if we consider the dependency of CC1, the
couplings between PC2 and PC1 and between PC2 and PC3 must be handled differently.
In a properly designed software system, related modules are composed into the same
component. The coupling between PC2 and PC1 does not affect the dependency of CC1 ,
but the coupling between PC2 and PC3 and between PC2 and PC4 may affect the
dependency of CC1 . Similarly, the couplings between PC2 and PC3 and PC2 and PC4
have different effects on the dependency of CC6 .
Therefore, traditional coupling definitions that consider only the type of dependency
between primitive components are insufficient to describe dependencies between
compound components. To consider the dependency between compound components, we
present a metric C (t, d) to measure the coupling between two components, where C
stands for coupling, with t representing the coupling type and d the coupling distance.
Thus, this metric has two parameters: coupling type and coupling distance. While
coupling type is determined by the nature of interactions between two software
components as defined in Table 1, coupling distance is determined by the relative
location of the two components in the component tree. Hence associated with any type of
coupling between two components, there is a corresponding coupling type and a coupling
distance. In the following subsections, we discuss how to represent component coupling
and component dependency by the coupling distance parameter.
The coupling distance d between two components is the height of the lowest common
ancestor of the two components in the component tree.
For example, in Fig. 5, supposing that all the coupling types are parameter coupling, the
lowest common ancestors for primitive components PC2 and PC1 , PC2 and PC3 , and
PC2 and PC4 , are CC1 , CC6 , and CC8 , respectively. The couplings between these
primitive components can be represented as C (parameter coupling, 2), C (parameter
coupling, 3), and C (parameter coupling, 4) respectively. Note that there may exist more
than one coupling between two modules. In this case, each of them is represented by the
corresponding C (t, d).
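The example can be reproduced with a short Python sketch. Fig. 5 is not available here, so the `PARENT` table below is a hypothetical tree chosen only to be consistent with the lowest common ancestors stated in the text (CC1, CC6, and CC8); the placement of PC3 and PC4 under CC2 and CC4, and all function names, are assumptions.

```python
from collections import defaultdict

# Hypothetical component tree (child -> parent); the root has parent None.
PARENT = {
    "PC1": "CC1", "PC2": "CC1", "PC3": "CC2", "PC4": "CC4",
    "CC1": "CC6", "CC2": "CC6", "CC4": "CC7",
    "CC6": "CC8", "CC7": "CC8", "CC8": None,
}

# Invert the parent table so that heights can be computed top-down.
CHILDREN = defaultdict(list)
for _child, _parent in PARENT.items():
    if _parent is not None:
        CHILDREN[_parent].append(_child)


def height(node):
    """Primitive components have height 1; compounds are 1 + max child height."""
    kids = CHILDREN.get(node, [])
    return 1 if not kids else 1 + max(height(k) for k in kids)


def ancestors(node):
    """Return [node, parent, grandparent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def lowest_common_ancestor(a, b):
    """Walk up from b and return the first node that is also above (or is) a."""
    above_a = set(ancestors(a))
    for node in ancestors(b):
        if node in above_a:
            return node
    raise ValueError("components are not in the same tree")


def coupling(coupling_type, a, b):
    """Return the pair (t, d) for a coupling of the given type between a and b."""
    return (coupling_type, height(lowest_common_ancestor(a, b)))


for other in ("PC1", "PC3", "PC4"):
    print("PC2", other, coupling("parameter", "PC2", other))
```

When two modules are coupled in more than one way, `coupling` would simply be called once per coupling type.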
5.4 Discussion
C (t, d), the two-parameter representation of component coupling, has one advantage over
the traditional one-parameter representation: the second parameter, coupling distance,
represents the relevance of two coupled components. Usually, in software design,
relevant functions (methods) are grouped into one module (class), relevant modules
(classes) are grouped into one package, and so on. Therefore, larger coupling distances
are less favorable than smaller ones, because a larger value normally indicates
coupling between two relatively unrelated components. With respect to program
comprehension, coupling between related components is easier to understand. With
respect to maintenance, changes to a software component may affect other components
through component coupling; a smaller coupling distance is preferable because the
adverse effects are then localized within a smaller-scale component, which is easier
to manage.
A distance-1 coupling implies the coupling is within a component and it does not affect
the independence of the component, which makes this component highly independent of
other components. A distance-2 coupling indicates that the coupling is between
components that have the same parent (one-height-up) component and hence can be more
relevant than other larger distance couplings.
Therefore, coupling distance, together with the coupling type specified in Table 1,
forms a valuable two-parameter coupling metric, C (t, d). This metric can be used not
only to compare the degree of dependency brought about by different types of coupling,
but also to compare the degree of relevance of couplings of the same type. For
example, considering component CC1 in Fig. 5, coupling between CC1 and CC2 is viewed
more favorably than coupling between CC1 and CC3, even though they have the same
coupling type (parameter coupling), because they have different coupling distances.
6 Determination of dependencies
It is clear from the above discussions that software dependency is largely induced by the
presence of software coupling. It is easy to automatically determine parameter coupling
and inheritance coupling. Parameter coupling is induced via function calls or message
passing. For example, if module m1 invokes a function (method) implemented in module
m2 , we say m1 is parameter coupled to (dependent on) m2 . Dependencies induced by
inheritance coupling can be identified by language-specific keywords or semantics. For
example, Java uses the keyword extends to represent class inheritance. If module m1
inherits from module m2 , we say m1 is inheritance coupled to (dependent on) m2 .
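As a rough illustration of how these two kinds of dependency might be extracted automatically, the Python sketch below derives parameter-coupling pairs from a hand-written call list and inheritance-coupling pairs from class definitions. The tables `DEFINED_IN` and `CALLS`, the classes, and the helper names are hypothetical; a real tool would obtain this information from a parser or a cross-reference index.

```python
# Hypothetical input: the module that implements each function, and a list of
# (calling module, called function) records.
DEFINED_IN = {"parse": "m2", "log": "m3", "main": "m1"}
CALLS = [("m1", "parse"), ("m1", "log"), ("m2", "log"), ("m1", "main")]


def parameter_dependencies(calls, defined_in):
    """Return pairs (dependent, depended_on) induced by cross-module calls."""
    pairs = set()
    for caller_module, function in calls:
        callee_module = defined_in[function]
        # Calls that stay inside one module do not couple two modules.
        if callee_module != caller_module:
            pairs.add((caller_module, callee_module))
    return pairs


class Base:
    """Hypothetical parent class."""


class Derived(Base):
    """Hypothetical child class; it inherits from Base."""


def inheritance_dependencies(classes):
    """Return pairs (subclass, superclass) read from each class's direct bases."""
    pairs = set()
    for cls in classes:
        for base in cls.__bases__:
            if base is not object:  # every Python class implicitly extends object
                pairs.add((cls.__name__, base.__name__))
    return pairs


print(sorted(parameter_dependencies(CALLS, DEFINED_IN)))
print(sorted(inheritance_dependencies([Base, Derived])))
```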
In contrast, dependencies induced by common coupling and external coupling are more
complicated. Common coupling between two modules is identified with the definition
and use of a global variable: a definition of a variable x is a statement that assigns a value
to x, such as x = 5; the use of a variable x is a statement that utilizes the value of x, such
as if (x > 6) return. Because definitions can affect uses but uses cannot affect definitions,
dependencies between components induced by global variables are induced by the
definition–use relationship (Yu et al. 2004). For example, if module m1 uses a global
variable that is defined in module m2 , we say m1 is common coupled to (dependent on)
m2 .
External coupling between two modules is identified with the write and read operations
to the same external medium, including file, database, and so on. A write operation is to
change the content of the external medium and a read operation is to utilize the content of
the external medium. Because write operations might affect modules that read the same
external medium but read operations cannot affect modules that read/write to the same
external medium, dependencies between components induced by external medium are
induced by the write–read relationship. For example, if module m1 reads a file and
utilizes the content that is written by module m2 , we say m1 is external coupled to
(dependent on) m2 .
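Common coupling and external coupling share the same direction rule: the module that consumes a value (uses a global variable, reads a medium) depends on the module that produces it (defines the variable, writes the medium). A minimal Python sketch of this rule follows; the record format, the function name, and the sample data are assumptions made for illustration.

```python
def producer_consumer_dependencies(producers, consumers):
    """Return pairs (dependent, depended_on).

    producers: iterable of (module, resource) records, where the module defines
               a global variable or writes an external medium.
    consumers: iterable of (module, resource) records, where the module uses
               the variable or reads the medium.
    A consumer depends on every other module that produces the same resource.
    """
    producers_of = {}
    for module, resource in producers:
        producers_of.setdefault(resource, set()).add(module)

    pairs = set()
    for module, resource in consumers:
        for producer in producers_of.get(resource, ()):
            if producer != module:  # a module does not depend on itself
                pairs.add((module, producer))
    return pairs


# Common coupling: m2 defines global variable x, m1 and m2 both use it.
definitions = [("m2", "x")]
uses = [("m1", "x"), ("m2", "x")]
common = producer_consumer_dependencies(definitions, uses)

# External coupling: m2 writes a (hypothetical) file, m1 reads it.
writes = [("m2", "access.log")]
reads = [("m1", "access.log")]
external = producer_consumer_dependencies(writes, reads)

print(common, external)
```

The rule is deliberately one-directional: a module that only uses or only reads never appears as the depended-on side.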
Considering the design quality of modules, 93% (99/107) of its primitive components are
well designed from the viewpoint of common coupling: changes to other components will
not affect any of them via global variables; reuse of any of these 99 components does not
need to consider their dependencies on other components via global variables.
In order to further validate this assumption, in this section we perform an empirical study
on Apache HTTP to investigate the relationship between component dependency
(represented with coupling distance) and the external properties of software products. The
empirical studies were performed on the primitive components of modules in Apache
HTTP version 2.2.
The coupling metrics presented in this paper are two-dimensional and contain both
coupling type and coupling distance. To avoid the cross-effects of coupling type on the
results, we studied parameter coupling and common coupling separately.
First, we define two evaluation metrics, D parameter (parameter coupling distance) and D
common (common coupling distance). D parameter of a component equals the sum of the
parameter coupling distances of all its dependent components, and D common equals the
sum of the common coupling distances of all its dependent components. Writing $d_i$ for
the coupling distance (of the given coupling type) to the $i$-th of the $n$ dependent
components, both sums have the form $D = \sum_{i=1}^{n} d_i$. The D parameter and D
common values of all 107 primitive components of modules in Apache HTTP
are calculated based on the inspection of source code of version 2.2 using lxr.
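The summation can be sketched in a few lines of Python. The component names, the `COUPLINGS` table, and the function name `distance_sum` are hypothetical; the distances are made-up values, not measurements from Apache HTTP.

```python
# Hypothetical data: for each component, one (coupling type, coupling distance)
# record per dependent component.
COUPLINGS = {
    "mod_a": [("parameter", 2), ("parameter", 3), ("common", 4)],
    "mod_b": [("common", 2)],
}


def distance_sum(component, coupling_type):
    """Sum the coupling distances of one coupling type for a component."""
    return sum(
        distance
        for kind, distance in COUPLINGS.get(component, [])
        if kind == coupling_type
    )


d_parameter = distance_sum("mod_a", "parameter")
d_common = distance_sum("mod_a", "common")
print(d_parameter, d_common)
```

A component with no coupling of the requested type gets a sum of zero, which keeps it in the data set for the correlation test.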
Second, we count the M value, which is the number of times these dependency-inducing
components have been modified, based on the change history of Apache HTTP. For this
measurement, the CVS log of Apache HTTP is used; it records a complete revision history
of all the components, is available online, and supports easy data extraction. We used a
self-written Perl program to obtain the change record information for each of the 107
primitive components and to count the number of times it was modified from its first
version to the current version 2.2.
Finally, we test the correlation between D parameter and M and between D common and M. We
expect to find that a component with a larger dependency value also has a larger number of
modifications; therefore, we test the following null hypotheses.
To test these hypotheses, we need to calculate the correlation coefficient value that
indicates the strength of the relationship between the two variables: independent variable,
component dependency value (D parameter or D common), and dependent variable, the number
of modifications (M) made to the component, in the software revision history. Several
different correlation coefficients have been put forward, including Pearson’s correlation
coefficient and Spearman’s rank correlation coefficient (Nolan 1994). For Pearson’s
correlation coefficient to be valid, two variables should be normally distributed.
However, in this case, it is unlikely that either of these two variables has a normal
distribution. Therefore, we use Spearman’s rank correlation coefficient. If the rank
correlation coefficient proves to be statistically significant at the 0.05 level, we will reject
the null hypothesis.
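For readers unfamiliar with the statistic, Spearman's coefficient is Pearson's coefficient computed on the ranks of the data, with tied values sharing the mean of their ranks. The Python sketch below shows only the coefficient; it does not compute the significance level, for which a statistics package would normally be used. The function names and the sample values are assumptions, not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical dependency values and modification counts for five components.
dependency = [2, 5, 3, 9, 4]
modifications = [1, 4, 3, 10, 2]
rho = spearman(dependency, modifications)
print(round(rho, 3))
```

Because only the ranks enter the computation, the coefficient is insensitive to the skewed distributions that rule out Pearson's coefficient on the raw values.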
The results of the hypothesis tests are in Table 11. The scatter plots showing the
relationship between parameter/common coupling distance and the number of
modifications are in Figs. 8 and 9. Figure 8 shows the measurements and Fig. 9 shows the
ranks of the measurements. Dashed linear trendlines are displayed in Fig. 9. In both tests,
the correlations are significant at the 0.01 level (two-tailed). Therefore, we reject the null
hypotheses and conclude that there is a significant positive rank correlation between the
dependency value (D parameter or D common) of a component and the number of modifications
(M) made to this component.
Table 11 The results of hypothesis tests
Hypothesis   Number of pairs of data   Correlation coefficient   Significance
H01          107                       0.402                     0.01
H02          107                       0.299                     0.01
Fig. 8 The scatter plot of the number of modifications of a component versus (a)
parameter coupling distance; and (b) common coupling distance
Fig. 9 The scatter plot of rank of number of modifications of a component versus (a) rank
of parameter coupling distance; and (b) rank of common coupling distance
The number of modifications made to a component is related to the quality and the
complexity of the component. A component may be modified for various reasons. For
instance, an error found in one component and an improvement requirement on the
functionality of the component could result in a direct modification to the component.
Moreover, because components are interdependent, changes made to other components
could indirectly require modifications to this component. Therefore, we can assume that
the number of direct modifications represents the quality of the component, such as
stability and fault density; the number of indirect modifications represents the complexity of
the component. Note that a component is said to be complex when it is interrelated with
many other components; changes made to other components require corresponding
changes on this component.
The fault density and complexity of a component are directly related to its dependency.
Similar conclusions have been achieved in earlier work (Kafura and Henry 1981; Selby
and Basili 1991; Troy and Zweben 1981). In these studies, the relationship between
coupling type and software quality was established. Our empirical study further reveals
the relationship between coupling distance and software quality measures: larger coupling
distances have more detrimental effects than smaller coupling distances on
software quality, including understandability, maintainability and reusability.
8 Conclusions, threats to validity, and future research
In this paper, we proposed a coupling metric and a dependency metric for component-
based software. In both metrics, a new and potentially important parameter, coupling
distance, which measures the relevance between two coupled components, is used. If a
software system can be represented as a layered component tree structure, the coupling
distance can be determined easily from the heights of the two components in the tree. As
a case study, we evaluated the dependency of Apache version 2.2 based on parameter
coupling and common coupling. A validation study found a significant correlation
between coupling distance and the number of modifications made to a component, an
indicator of component quality.
There are several threats to the validity of our study. One threat to internal validity is the
accuracy of data. To reduce this threat, we use both open-source and self-written tools to
extract coupling data and modification data in order to avoid manual counting mistakes.
Another internal threat is that we only investigated the parameter coupling distance and
common coupling distance. Due to the limitation of Apache HTTP, we did not investigate
the inheritance coupling distance and external coupling distance. Therefore, to reduce this
threat, we plan to study other software products, including object-oriented software to
validate the relationship between external/inheritance coupling distance and software
qualities. The third internal threat comes from the measurement: coupling data is obtained
from one specific version of Apache HTTP (version 2.2) while the modification data of a
component is obtained from all versions of Apache HTTP. To reduce this internal threat,
more coupling data on different versions of Apache HTTP should be obtained and
examined against the modification data.
One construct threat to validity is the construction of the tree structure of Apache HTTP.
Currently, we use the package structure to represent component structure, which might
not be representative of the system architecture. Another construct threat to validity is
that our coupling analysis is only based on static analysis and we did not consider
dynamic run-time coupling/dependencies. In static analysis, we only considered acyclic
dependencies, i.e., sets of dependencies with no recursive references. During run-time,
recursive, or cyclic dependencies could exist between software components. The external
threat to validity is that the study performed on Apache HTTP may not be representative of
other component-based software products. To reduce these threats, more studies with
dynamic analysis should be performed on other software systems.
Due to the observed importance of coupling distance, our studies have the following
implications for software design metrics, which also capture our future research
directions:
1. Object-Oriented Design: the measurement of class coupling could be refined by
integrating coupling distance. For example, the CBO metric presented by Chidamber
and Kemerer (1994) only considers the number of objects
coupled to a specified object. In fact, different objects may have different relevancies
to a specified object and by applying the coupling distance parameter, the CBO metric
could be refined and revalidated.
2. Structured Design: the measurement of architecture design could also be refined. For
example, Card and Glass (1990) defined the structural complexity of a
specified module as the square of the fan-out of a module. Fan-out is the number of
modules that are directly invoked by this specified module. In this paper, we show that
different modules may have different relevancies to a specified module. Therefore,
new metrics for structural complexity, data complexity, and system complexity could
be derived if the coupling distance parameter is introduced within these measurements.
Acknowledgements This work was based, in part, upon research supported by the
National Science Foundation (CNS-0619069, EPS-0701890 and OISE 0650939), Acxiom
Corporation (# 281539) and NASA EPSCoR Arkansas Space Grant Consortium (#
UALR 16804). Any opinions, findings, and conclusions or recommendations expressed
in this material are those of the author(s) and do not necessarily reflect the views of the
funding agencies. The authors would like to thank Professor Stephen R. Schach of
Vanderbilt University for his many suggestions. The authors would also like to thank the
anonymous reviewers for their valuable comments and suggestions which greatly
improved the earlier version of this paper.
References
Abdurazik, A. (2007). Coupling-based analysis of object-oriented software, Ph.D.
Dissertation, George Mason University. Available at:
http://www.ise.gmu.edu/~ofut/rsrch/aynur-dissertation.pdf.
Banker, R. D., Datar, S. M., Kemerer, C. F., & Zweig, D. (1993). Software complexity and
maintenance costs. Communications of the ACM, 36(11), 81–94.
Basili, V. R., Briand, L. C., & Melo, W. L. (1996). A validation of object-oriented design
metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10), 751–
761.
Biggerstaff, T. J., & Perlis, A. J. (1989). Software reusability: Concepts and models (Vol.
1). New York, NY: ACM Press.
Briand, L. C., Daly, J. W., & Wüst, J. K. (1999). A unified framework for coupling
measurement in object-oriented systems. IEEE Transactions on Software Engineering,
25(1), 91–121.
Briand, L. C., Morasca, S., & Basili, V. R. (1994). Defining and Validating High-Level
Design Metrics, Computer Science Technical Report Series, Vol. CS-TR-3301, University
of Maryland at College Park, College Park, MD.
Bruegge, B., & Dutoit, A. H. (2004). Object-oriented software engineering using UML,
patterns, and Java. Upper Saddle River, NJ: Pearson Prentice Hall.
Card, D. N., & Glass, R. L. (1990). Measuring software design quality. Upper Saddle
River, NJ: Prentice-Hall.
Chidamber, S., & Kemerer, C. (1994). A metrics suite for object oriented design. IEEE
Transactions on Software Engineering, 20(6), 476–493.
Dandashi, F. (2002). Software engineering: theory, application and practice: A method for
assessing the reusability of object-oriented code using a validated set of automated
measurements. In Proceedings of the 2002 ACM Symposium on Applied Computing, pp.
997–1003.
Frakes, W. B., & Succi, G. (2001). An industrial study of reuse, quality, and productivity.
Journal of Systems and Software, 57(2), 99–106.
Gibson, V. R., & Senn, J. A. (1989). System structure and software maintenance
performance. Communications of the ACM, 32(3), 347–358.
Harrison, R., Counsell, S., & Nithi, R. (2000). Experimental assessment of the effect of
inheritance on the maintainability of object-oriented systems. Journal of Systems and
Software, 52(2–3), 173–179.
Hassoun, Y., Johnson, R., & Counsell, S. (2004). A dynamic runtime coupling metric for
meta-level architectures. In Proceedings of the Eighth Euromicro Working Conference on
Software Maintenance and Reengineering (CSMR’04), pp. 339–346.
Kafura, D., & Henry, S. (1981). Software quality metrics based on interconnectivity.
Journal of Systems and Software, 2(2), 121–131.
Leavens, G., & Sitaraman, M. (2000). Foundations of component-based systems.
Cambridge, UK: Cambridge University Press.
Lim, W. (1994). Effects of reuse on quality, productivity, and economics. IEEE Software,
11(5), 23–30.
Lüer, C., Rosenblum, D. S., & van der Hoek, A. (2001). The evolution of software
evolvability. In Proceedings of the 4th International Workshop on Principles of Software
Evolution, Vienna, Austria, September 2001, pp. 134–137.
Mei, H., Zhang, L., & Yang, F. (2001). A software configuration management model for
supporting component-based software development. ACM SIGSOFT, 26(2), 53–58.
Offutt, J., Harrold, M. J., & Kolte, P. (1993). A software metric system for module
coupling. Journal of Systems and Software, 20(3), 295–308.
Page-Jones, M. (1980). The practical guide to structured systems design. New York:
Yourdon Press.
Price, M. W., & Demurjian, S. A. (1997). Analyzing and measuring reusability in object-
oriented design. In Proceedings of the 12th ACM SIGPLAN Conference on Object-
Oriented Programming, Systems, Languages, and Applications, pp. 22–33.
Selby, R. W., & Basili, V. R. (1991). Analyzing error-prone system structure. IEEE
Transactions on Software Engineering, 17(2), 141–152.
Stevens, W. P., Myers, G. J., & Constantine, L. L. (1974). Structured design. IBM Systems
Journal, 13(2), 115–139.
Troy, D. A., & Zweben, S. H. (1981). Measuring the quality of structured design. Journal
of Systems and Software, 2(2), 113–120.
Yu, L. (2007). Understanding component co-evolution with a study on Linux. Empirical
Software Engineering, 12(2), 123–141.
Yu, L., & Ramaswamy, S. (2007). Verifying design modularity, hierarchy, and interaction
locality using data clustering techniques. In Proceedings of the 45th ACM Southeast
Conference, Winston-Salem, NC, March 2007, pp. 419–424.
Yu, L., Schach, S. R., Chen, K., & Offutt, J. (2004). Categorization of common coupling
and its application to the maintainability of the Linux Kernel. IEEE Transactions on
Software Engineering, 30(10), 694–706.