Automating Extract Class Refactoring
DOI 10.1007/s10664-013-9256-x
Abstract During software evolution the internal structure of the system undergoes
continuous modifications. These changes push the source code away from its original design, often reducing its quality, including class cohesion. In this paper
we propose a method for automating the Extract Class refactoring. The proposed
approach analyzes (structural and semantic) relationships between the methods in
a class to identify chains of strongly related methods. The identified method chains
are used to define new classes with higher cohesion than the original class, while
preserving the overall coupling between the new classes and the classes interacting
with the original class. The proposed approach has been first assessed in an artificial
scenario in order to calibrate the parameters of the approach. The data was also used
A. Marcus
SEVERE Group, Department of Computer Science, Wayne State University,
5057 Woodward Ave, Suite 14101.1, Detroit, MI 48202, USA
e-mail: amarcus@[Link]
URL: [Link]
R. Oliveto ( )
Department of Bioscience and Territory, University of Molise, C. da Fonte Lappone,
86090, Pesche IS, Italy
e-mail: [Link]@[Link]
URL: [Link]
1618 Empir Software Eng (2014) 19:1617–1664
to compare the new approach with previous work. The approach was then empirically evaluated on real Blobs from existing open source systems in order to assess how good and useful software engineers consider the proposed refactoring solutions and
how well the proposed refactorings approximate refactorings done by the original
developers. We found that the new approach outperforms a previously proposed ap-
proach and that developers find the proposed solutions useful in guiding refactorings.
1 Introduction
1. In the first study we asked 50 Master’s students to rate the refactoring solutions
suggested by the proposed approach on existing Blobs identified in two open-
source systems (Khomh et al. 2009). In this study we also evaluated the impact
of the refactoring operations proposed by our approach on the cohesion and
coupling of the object systems.
2. In the second study we identified and selected 11 classes in different versions
of open source systems that actually underwent extract class refactoring by
their original developers. Then, we asked 15 Master’s students to refactor these
classes and compared both the refactorings proposed by our approach and the
refactorings performed by the students with the refactorings performed by the
original developers.
The results show that the refactoring solutions proposed by our approach (i)
strongly increase the cohesion of the refactored classes without leading to significant
increases in terms of coupling; (ii) are considered useful by developers performing
extract class refactoring; and (iii) are able to approximate manually performed refac-
torings at 91 %, on average. In addition, we also compare the proposed extract class
refactoring method with a previous approach we proposed in Bavota et al. (2011),
which uses the same graph representation of the class to be refactored but a different
algorithm based on Max Flow-Min Cut. We will refer to our previous approach as
the Max Flow-Min Cut approach. The results clearly indicate that the new approach
outperforms the previous one and in the paper we discuss the reasons for that. The
experimental material and the raw data are available online for replication purposes
(Bavota et al. 2012).
The rest of the paper is organized as follows. Section 2 discusses the related
literature, while Section 3 presents the proposed approach. The empirical assessment
of the configuration parameters of our approach is presented in Section 4. Sections 5
and 6 report the two empirical studies, respectively. Finally, Section 7 concludes the
paper.
2 Related Work
A lot of effort has been devoted to the definition of automatic and semi-automatic
approaches for software refactoring. The recent increasing interest in this field by
the software engineering community has led to the organization of international events focused on refactoring, such as the 4th Workshop on Refactoring Tools (WRT 2011) at ICSE 2011. The approaches proposed in the literature can be roughly classified into two categories: (i) approaches that identify source code components which may require refactoring and (ii) approaches that (semi)automatically
perform refactoring operations. In the latter category there are approaches support-
ing move method refactoring (Seng et al. 2006; Tsantalis and Chatzigeorgiou 2009;
Oliveto et al. 2011), extract method refactoring (Maruyama and Shima 1999; Abadi
et al. 2009), refactoring focused on improving class hierarchy (Casais 1992; Moore
1996), combinations of different refactoring operations1 (O’Keeffe and O’Cinneide
2006; Bodhuin et al. 2007), and extract class refactoring (Fokaefs et al. 2009; Bavota
et al. 2011). The latter are the most closely related to our approach, and thus we will focus on them in our discussion of related work.
It is worth noting that many of the most commonly used refactoring operations proposed in the literature have been integrated into modern IDEs, such as Eclipse2 and IntelliJ IDEA.3
Fokaefs et al. (2009) use a clustering algorithm to perform Extract Class refactoring.
Their approach analyzes the structural dependencies existing between the entities
of a class to be refactored, i.e., attributes and methods. Using this information, they
compute the entity set for each attribute (i.e., the set of methods using it), and for each
method (i.e., all the methods that are invoked by a method and all the attributes that
are accessed by it) of the class. The Jaccard distance between all pairs of entity sets of the class is computed in order to cluster together cohesive groups of entities
3 [Link]
modules (or classes). In particular, structural information (e.g., Fokaefs et al. 2009;
Christl et al. 2007; Sartipi and Kontogiannis 2001; Wiggerts 1997; Anquetil et al.
1999), semantic information (e.g., Kuhn et al. 2007), or a combination of semantic
and structural measures (Maletic and Marcus 2001) have been proposed to cluster
software components in order to support program comprehension or software re-
modularisation. Maletic and Marcus (2001) proposed the combination of semantic
and structural measures to cluster software components during program compre-
hension. It is worth noting that while (as discussed above) hierarchical clustering
algorithms need a threshold to cut the identified dendrogram, partitioning clustering
algorithms need to know the number of clusters to build and thus the number of
classes to extract in case of Extract Class refactoring. Our approach overcomes these
problems by automatically defining the optimal number of classes to be extracted.
The approach proposed in this paper, like most of the refactoring approaches
described above, can be applied only if a source code component to be refactored
has been identified (in our case, a Blob class). Several approaches presented in the
literature have focused on the identification of source code components
that need refactoring. Such approaches are complementary to the one presented in
this paper. While many of these approaches have been proposed in the literature
(Simon et al. 2001; Tahvildari and Kontogiannis 2003; Du Bois et al. 2004; Marinescu
2004; Atkinson and King 2005; Trifu and Marinescu 2005; Joshi and Joshi 2009;
Khomh et al. 2009; Moha et al. 2010), we will focus our discussion on those able
to identify extract class refactoring opportunities in software systems.
Simon et al. (2001) provide a metric-based visualization tool to support the
software engineer in the identification of source code components that need refac-
toring. In particular, their approach is able to identify four kinds of refactoring
opportunities: move method, move attribute, extract class, and inline class. In Simon
et al. (2001) only structural metrics are used in the analysis of the source code.
Marinescu (2004) proposes a mechanism called “detection strategies” for formu-
lating metrics-based rules that capture deviations from good design principles and
heuristics. The detection strategies are formulated in different steps. Firstly, the
symptoms that characterize a particular bad smell should be defined (e.g., in the case of a Blob: high complexity, low cohesion, and access to “foreign” data). Second, a proper
set of metrics measuring these symptoms should be identified (e.g., Weighted Method
Count (WMC) for high complexity, Tight Class Cohesion (TCC) for class cohesion,
and Access to Foreign Data (ATFD) for measuring the access to external attributes
of a class). Having this information the next step is to define thresholds to classify
the class as affected (or not) by the defined symptoms. For example, establishing for
which values of TCC a class should be identified as a “low cohesive class”. Finally,
AND/OR operators should be used to correlate the symptoms, leading to the final
rule to detect the smells and thus, refactoring opportunities. The evaluation con-
ducted on two software systems shows how using customized “detection strategies”
it is possible to identify nine bad smells with an average accuracy of 70 %.
Trifu and Marinescu (2005) present an approach to support the decision making
process in object oriented refactoring. In particular, they exploit correlation between
structural anomalies, i.e., different types of code smells that often occur together, and
other structural and semantic information to build a pattern-like mapping of design
problems to the adequate treatments.
Joshi and Joshi (2009) present a method for identifying less cohesive classes in
a software system. Their approach is also able to pick out which class members
contribute to the lack of cohesion of the identified classes. This information can be
used to find candidates for refactoring, e.g., extract class, move method.
Khomh et al. (2009) propose an approach based on Bayesian Belief Networks
(BBNs) to specify design smells and detect them in programs. In this work the
authors focus the attention on the detection of Blob classes and thus, of Extract Class
refactoring opportunities. In particular, given a class C as input, the output of the
BBN is a probability that C is a Blob class. The evaluation is performed on two open
source systems by measuring precision and recall of the model with manually located
smells.
Moha et al. (2010) introduced DETEX, a method for the specification and detec-
tion of code and design smells. DETEX uses a Domain-Specific Language (DSL)
for specifying smells using high-level abstractions. Four design smells are identified
by DETEX, namely Blob class, Swiss Army Knife, Functional Decomposition, and
Spaghetti Code. The results achieved in the reported evaluation show that DETEX
is able to reach a recall of 100 % and a precision greater than 50 % in the detection
of the four above mentioned bad smells.
3 The Proposed Approach
The proposed approach is able to extract two or more classes from a given class
with several responsibilities (e.g., a Blob (Brown et al. 1998)). The extracted classes
have higher cohesion than the original class and attempt to encapsulate related
responsibilities. Generally, a class with a high number of responsibilities exhibits low
cohesion. Cohesion has been defined by Stevens et al. (1974) as “the degree to which
the elements of a module belong together” and in the case of classes, it measures how
strongly related the responsibilities implemented by a class are (Chidamber et al. 1994).
Class cohesion is affected by several factors (e.g., attribute references, method
calls, semantic content, etc.) and our approach exploits all these factors to split a class
with low cohesion into a set of classes with higher cohesion. However, while splitting
a class into two or more classes increases the cohesion of the extracted classes, this
might happen at the expense of class coupling. For this reason, our approach exploits
similarity measures between methods on which cohesion and coupling metrics are
based. In this way, the increase of cohesion should mitigate the increase of coupling.
The approach takes as input a class previously identified by the software engineer
(or automatically) as a candidate for refactoring. Figure 1 shows the Extract Class
Refactoring process. The top path of the process is similar to the Max Flow-Min
Cut approach (Bavota et al. 2011): the candidate class is parsed to build a method-
by-method matrix, an n × n matrix where n is the number of methods in the class to be refactored. A generic entry ci,j of the method-by-method matrix represents the likelihood that methods mi and mj should be in the same class. This step of
the refactoring process is described in more detail in Section 3.1.
Using the information in the method-by-method matrix, the second part (bottom
path) of the refactoring process, shown in Fig. 1, extracts the new classes from the
input Blob. In particular, a filtering step is used to remove spurious links and to
split the initial graph represented in the method-by-method matrix into disconnected
subgraphs. Then, we identify the chains of connected methods belonging to the
different subgraphs. Each computed chain represents a class to be extracted from the
original class. However, some of these chains could have a very short length (trivial
chains). To avoid the extraction of classes with a very low number of methods, we
merge each trivial chain with the most coupled non-trivial chain to obtain the final
set of classes to be extracted from the original class. In Sections 3.2 and 3.3 we explain
in detail these two steps of our algorithm, while in Section 3.4 we present an example
of the application of our approach.
calculated as the ratio between the number of referenced instance variables shared by methods mi and mj and the total number of instance variables referenced by the two methods:

$$SSM(m_i, m_j) = \begin{cases} \dfrac{|I_i \cap I_j|}{|I_i \cup I_j|} & \text{if } |I_i \cup I_j| \neq 0;\\[4pt] 0 & \text{otherwise} \end{cases}$$

where $I_i$ and $I_j$ are the sets of instance variables referenced by $m_i$ and $m_j$, respectively.
SSM has values in [0, 1]; the higher the number of instance variables the two methods
share, the higher the likelihood that the two methods should be in the same class.
ClassCoh is defined as the ratio of the sum of the similarities between all pairs of
methods to the total number of possible pairs of methods.
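For illustration, the SSM and ClassCoh computations above can be sketched as follows (a minimal Python sketch; representing each method simply by the set of instance-variable names it references is our simplification):

```python
def ssm(vars_i, vars_j):
    """SSM: Jaccard overlap between the sets of instance variables
    referenced by two methods; 0 when neither references any variable."""
    union = vars_i | vars_j
    if not union:
        return 0.0
    return len(vars_i & vars_j) / len(union)

def class_cohesion(methods):
    """ClassCoh: average pairwise similarity over all possible method pairs."""
    n = len(methods)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(ssm(methods[i], methods[j]) for i, j in pairs) / len(pairs)

# Two methods sharing one of three referenced instance variables:
print(ssm({"name", "email"}, {"email", "password"}))  # 1/3
```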
CDM is another structural measure that takes into account the calls performed by the methods (Bavota et al. 2011). In particular, let $calls(m_i, m_j)$ be the number of calls performed by method $m_i$ to $m_j$ and $calls_{in}(m_j)$ be the total number of incoming calls to $m_j$. $CDM_{i \to j}$ is defined as:

$$CDM_{i \to j} = \begin{cases} \dfrac{calls(m_i, m_j)}{calls_{in}(m_j)} & \text{if } calls_{in}(m_j) \neq 0;\\[4pt] 0 & \text{otherwise} \end{cases}$$
CDMi→j values are in [0, 1]. If CDMi→j = 1, it means that mj is called only by mi. Thus, mi and mj should be in the same class to reduce coupling between classes. Otherwise, if CDMi→j = 0, it means that mi never calls mj. In such a case, moving the two methods into different classes does not increase the coupling. To
ensure that CDM represents a commutative measure (like the other two measures)
the overall CDM of mi and mj is computed as follows:

$$CDM(m_i, m_j) = \max\left(CDM_{i \to j},\; CDM_{j \to i}\right)$$
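A sketch of the CDM computation, under the assumption that per-method call counts are available from static analysis (the dictionary layout used here is our own):

```python
def cdm_directed(calls, incoming, mi, mj):
    """CDM_{i->j}: fraction of all incoming calls to mj that originate in mi."""
    if incoming.get(mj, 0) == 0:
        return 0.0
    return calls.get((mi, mj), 0) / incoming[mj]

def cdm(calls, incoming, mi, mj):
    """Commutative CDM: the stronger of the two call directions."""
    return max(cdm_directed(calls, incoming, mi, mj),
               cdm_directed(calls, incoming, mj, mi))

# mj receives 4 calls in total, 3 of them from mi:
calls = {("mi", "mj"): 3}
incoming = {"mj": 4}
print(cdm(calls, incoming, "mi", "mj"))  # 0.75
```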
CSM, the third measure, captures the semantic similarity between two methods; it is computed as the cosine similarity between the term vectors of the two methods:

$$CSM(m_i, m_j) = \frac{\vec{m_i} \cdot \vec{m_j}}{\|\vec{m_i}\|\,\|\vec{m_j}\|}$$

where $\vec{m_i}$ and $\vec{m_j}$ are the vectors corresponding to the methods $m_i$ and $m_j$, respectively, and $\|\vec{x}\|$ represents the Euclidean norm of the vector $\vec{x}$ (Baeza-Yates and Ribeiro-Neto 1999). Thus, the higher the value of CSM, the higher the similarity
between two methods. In short, the measure captures relationships between the
comments, identifiers, and other text present in the methods, based on word usages
in the entire code. It is clear that CSM depends on the consistency of naming
conventions used in the source code as well as on the comments contained in it.
All the used similarity measures have values in [0, 1]. Thus, we compute the
likelihood that methods mi and mj should be in the same class as:

$$c_{i,j} = w_{SSM} \cdot SSM(m_i, m_j) + w_{CDM} \cdot CDM(m_i, m_j) + w_{CSM} \cdot CSM(m_i, m_j)$$

where $w_{SSM} + w_{CDM} + w_{CSM} = 1$ and their values express the confidence (i.e., weight) in each measure.
It is worth noting that our choice of measures is not arbitrary; rather, it is based on the results of previous work (Bavota et al. 2011), where we have
shown that these measures are orthogonal, they capture different aspects of coupling
between methods, and are suitable for automating extract class refactoring.
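Putting the three measures together, the method-by-method matrix can be assembled as a weighted sum (a sketch; ssm_m, cdm_m, and csm_m stand for precomputed n × n matrices of the three measures):

```python
def method_by_method(ssm_m, cdm_m, csm_m, w_ssm, w_cdm, w_csm):
    """Combine three [0, 1] similarity matrices (lists of lists) into the
    likelihood matrix c; the three weights must sum to one."""
    assert abs(w_ssm + w_cdm + w_csm - 1.0) < 1e-9
    n = len(ssm_m)
    return [[w_ssm * ssm_m[i][j] + w_cdm * cdm_m[i][j] + w_csm * csm_m[i][j]
             for j in range(n)] for i in range(n)]

ssm_m = [[1.0, 0.5], [0.5, 1.0]]
cdm_m = [[1.0, 0.2], [0.2, 1.0]]
csm_m = [[1.0, 0.8], [0.8, 1.0]]
c = method_by_method(ssm_m, cdm_m, csm_m, 0.3, 0.3, 0.4)
print(round(c[0][1], 2))  # 0.53
```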
The aim of this step is to remove spurious (but light) structural and/or semantic relationships between methods from the graph represented by the method-by-method matrix (Koschke et al. 2006). Indeed, due to the use of the semantic similarity between methods (which is very unlikely to be exactly zero), the initial graph representation would in general be a complete graph (i.e., it contains all possible edges), or at least
a connected graph. We split the graph representing the class to be refactored into
disconnected subgraphs, containing strongly related methods. We filter the method-
by-method matrix, based on a threshold, minCoupling. All similarity values less than
the minCoupling threshold are converted to zero:
$$c_{i,j} = \begin{cases} c_{i,j} & \text{if } c_{i,j} > minCoupling;\\ 0 & \text{otherwise} \end{cases}$$
There are many ways to define a threshold aimed at removing spurious relation-
ships between methods. A simple classification allows identifying two different kinds
of thresholds:
– constant threshold: the value of the threshold is fixed a priori, e.g., minCoupling =
0.1. This kind of threshold is simple to implement, but in general it is very difficult
to choose a priori a constant value to prune spurious relationships. Indeed, the
values in the method-by-method matrix depend on the Blob chosen to be refac-
tored. In fact, there may be cases where the matrix contains a lot of high values.
In this case, if the fixed threshold is high, it will probably remove the noise from
the matrix, e.g., spurious relationships between the methods of the class. Oth-
erwise, almost all the values will be left in the matrix. On the other hand, there
may be cases where the matrix contains a large number of very low values. In this
case, a high constant threshold will remove almost all the values from the matrix.
– variable threshold: the value of the threshold is selected taking into account
the characteristics of the given input. For example, minCoupling can be set as
the median of the values present in the method-by-method matrix. This kind of
threshold should resolve the problems deriving from the use of a constant threshold and should ensure a more stable filtering performance across different inputs.
Choosing the best threshold in this case is also far from trivial.
We experimented with both constant and variable thresholds to empirically define
a heuristic for selecting the best threshold and we found that a variable threshold is
the best option in our application (see Section 4 for details).
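The variable-threshold filtering can be sketched as follows (a sketch using the median of the non-zero matrix values as minCoupling, the variable setting that worked best in our calibration):

```python
from statistics import median

def filter_matrix(c):
    """Set minCoupling to the median of the non-zero similarity values
    and zero out every entry not above the threshold."""
    values = [v for row in c for v in row if v > 0]
    if not values:
        return [row[:] for row in c]
    min_coupling = median(values)
    return [[v if v > min_coupling else 0.0 for v in row] for row in c]

c = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.3],
     [0.1, 0.3, 0.0]]
print(filter_matrix(c))  # only the 0.9 edge survives
```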
After filtering the method-by-method matrix and splitting the graph into discon-
nected subgraphs, we identify the chains of connected methods belonging to the
different subgraphs. These chains represent the new classes to be extracted from the
original class.
The set of computed chains (i.e., extracted classes) may include chains with a very
short length. To avoid the extraction of classes with a very low number of methods,
we use a length threshold minLength to identify trivial chains, i.e., chains with a
length less than minLength. In our approach we decided to set minLength = 3 since
it is unusual that a class extracted from a Blob and implementing a well-defined
set of responsibilities contains fewer than three methods. This minimum length can
be easily changed by the user, if needed. Then, we compute the (structural and
semantic) coupling between trivial and non-trivial chains and merge each trivial chain
with the non-trivial chain it is most coupled with. The coupling between chains is
calculated using the same measures used to calculate the coupling between methods.
Specifically, the coupling between chains Ci and Cj is computed as the average coupling between all possible pairs of methods from Ci and Cj:

$$Coupling(C_i, C_j) = \frac{1}{|C_i| \times |C_j|} \sum_{m_i \in C_i,\, m_j \in C_j} c_{i,j}$$
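The chain identification and trivial-chain merging steps can be sketched as follows (a sketch; chains are the connected components of the filtered graph, and trivial chains are merged using the average pairwise coupling defined above — whether that coupling is computed on the filtered or the unfiltered matrix is not specified here, so the sketch's use of the unfiltered matrix is our assumption):

```python
def connected_chains(c):
    """Chains = connected components of the graph induced by the filtered
    method-by-method matrix c (a non-zero entry is an edge)."""
    n, seen, chains = len(c), set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            m = stack.pop()
            comp.append(m)
            for other in range(n):
                if other not in seen and c[m][other] > 0:
                    seen.add(other)
                    stack.append(other)
        chains.append(sorted(comp))
    return chains

def chain_coupling(c, chain_a, chain_b):
    """Average likelihood over all pairs of methods from the two chains."""
    total = sum(c[i][j] for i in chain_a for j in chain_b)
    return total / (len(chain_a) * len(chain_b))

def merge_trivial_chains(c, chains, min_length=3):
    """Merge each trivial chain into the non-trivial chain it is most
    coupled with; c is the unfiltered matrix (our assumption)."""
    non_trivial = [ch for ch in chains if len(ch) >= min_length]
    trivial = [ch for ch in chains if len(ch) < min_length]
    if not non_trivial:
        return chains
    for ch in trivial:
        best = max(non_trivial, key=lambda nt: chain_coupling(c, ch, nt))
        best.extend(ch)
    return [sorted(ch) for ch in non_trivial]
```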
4 If a private field needs to be shared by two or more of the extracted classes, the implementation of the needed getter and/or setter methods is left to the developer.
Figure 4 shows how the proposed approach works to extract, from the UserManagement class, three new classes having better defined responsibilities than the original class. The first part of the figure shows the graph that can be obtained from
the method-by-method matrix (note that the edges weighted with 0.0 are omitted),
while the second part of the figure shows the connected components obtained after
the matrix filtering. In this example we arbitrarily set minCoupling = 0.2. Thus, all the
edges having weight lower than 0.2 (that represent spurious relationships between
methods) are removed from the graph. The extracted components correspond to
the preliminary method chains. The third part of the figure shows the refinement
of the method chains. In particular, a trivial chain composed of only one method
(checkUser) is merged with the most coupled non-trivial chain (i.e., C1). In the end, the
approach suggests splitting the original class into three new classes.
4 Parameter Calibration
The proposed approach has several configuration parameters: the weights of the
similarity measures (wSSM, wCDM, and wCSM) and the threshold used to prune out
the spurious relationships between methods (minCoupling). While an assessment of
these parameters has been made previously (Bavota et al. 2011), we cannot just use
previous results, as the extract class refactoring algorithms are different and likely
the values of these parameters have a different impact on the new algorithms. For
this reason, in this section we conduct an empirical assessment of our approach with
the goal of defining and validating a heuristic to identify an optimal setting for these
parameters.
The context of our study is represented by five open source software systems,
namely ArgoUML 0.16, Eclipse 3.2, GanttProject 1.10.2, JHotDraw 6.0, and Xerces
2.7.0. ArgoUML (1,071 classes and 97 KLOC) is a UML modeling CASE tool
with reverse engineering and code generation capabilities. Eclipse (23,462 classes
and 1,710 KLOC) is a multi-language integrated development environment with an
extensible architecture through plug-ins. GanttProject (273 classes and 28 KLOC) is
a cross-platform desktop tool for project scheduling and management. JHotDraw
(275 classes and 29 KLOC) is a Java GUI framework for structured drawing
editors. Xerces (589 classes and 240 KLOC) is a family of packages for parsing and
manipulating XML files. It implements a number of standard APIs for XML parsing,
including DOM, SAX, and SAX2. Three of these systems, namely ArgoUML,
JHotDraw, and Eclipse, have also been used to assess the parameters of the Max
Flow-Min Cut approach (Bavota et al. 2011).
Fig. 5 Box plots of quality metrics for the systems used in the case study
of a class to methods in other classes, and therefore the dependency of local methods
on methods implemented by other classes. Higher MPC values indicate higher
coupling.
The analysis of these metrics shows that the overall quality of the object systems, in
terms of low coupling and high cohesion, is comparable across the systems. Although we do
not have a formal quality model, this claim is supported by the comparable quality of
the object systems with JHotDraw, which has been developed as a “design exercise”
and its design relies heavily on the proper use of well-known design patterns.
5 It is worth noting that while the general experimental design is the same, the Max Flow-Min Cut
approach (Bavota et al. 2011) was evaluated on artificial Blobs created by merging only two classes, as it is only able to split a Blob into two classes.
By construction, the merged classes have a worse cohesion than the original classes
(see the online Appendix (Bavota et al. 2012) for the details). Note also that the
randomly selected classes are merged only if their cohesion is higher than the average
class cohesion in the system. The choice of this threshold was guided by the analysis
of the box plots reported in Fig. 5. As we can see, most of the classes of the object systems have good cohesion, but there is a small set of outliers with very low cohesion. By considering the average cohesion as a threshold we exclude from our set of
classes these outliers, ensuring that the quality of the selected classes is rather good.
Once the mutated system is obtained, the proposed approach is applied to each
artificial Blob with the goal of reconstructing the original classes previously merged.
This is why it was important to select classes with high cohesion, because we can con-
sider them as the “golden standard”. Hence, to evaluate the results, the refactored
classes are compared with the original classes aiming at identifying the total number
of methods correctly and incorrectly moved in the split classes. To measure the
accuracy of the refactoring solutions we computed the MoJo eFfectiveness Measure
(MoJoFM) (Wen and Tzerpos 2004) between the original classes and those extracted
by our approach. The MoJoFM is a normalized variant of the MoJo distance and it
is computed as follows:

$$MoJoFM(A, B) = 1 - \frac{mno(A, B)}{\max(mno(\forall A, B))}$$

where $mno(A, B)$ is the minimum number of Move and Join operations needed to transform the partition $A$ into the partition $B$.
4.2 Analysis of the Results and Heuristics to Define the Configuration Parameters
Tables 1 and 2 report the best results—as measured with MoJoFM—achieved using
constant and variable thresholds respectively.6 The analysis of the results reveals that:
– the variable threshold generally provides better performance than the constant
threshold for the definition of minCoupling. We obtained comparable results
between constant and variable thresholds only on GanttProject. On the other
systems, the variable thresholds provide an average improvement in terms of
MoJoFM of about 0.06. This means that by using a variable threshold, our
approach is able to better reconstruct the original classes merged to create the
artificial Blobs. In addition, the best overall results are achieved on all the systems using as variable threshold the median (Q2) of the values of the matrix. In other words, the variable thresholds ensure a more stable filtering
performance across the different inputs, i.e., the different artificial Blobs to
be refactored. Regarding the constant thresholds, generally better results can
be achieved using a low value. As we can see in Table 1, none of the best
configurations results from using 0.4 as the constant threshold;
– the combination of structural and semantic measures considerably improves the
accuracy of our approach. As expected, the best results are achieved when all the
weights of the three cohesion metrics are greater than zero. This means that the
combination of structural and semantic measures is worthwhile, which confirms
the findings from previous work (Bavota et al. 2011).
– the optimal setting of the weights of the three cohesion measures is not stable across the object systems. The results highlight that the best configuration of weights changes considerably across the object systems. Unlike the Max Flow-Min
Cut approach, where in general the best performances were achieved giving a
high weight (greater than 0.6) to the semantic similarity measure, with this new
approach it is quite difficult to identify an optimal setting of the weights for the
three measures, which could be used for any system. This means that a different
heuristic is required to identify an optimal setting of the weights for different
systems.
To better understand how the parameters of the proposed approach affect our
results, we statistically analyzed the influence of the factors Weights and Threshold
on the reconstruction accuracy of our approach (MoJoFM) through interaction
plots.7 The interaction plots confirmed that generally the best performances can
be obtained using as threshold the median, i.e., Q2, of the non-zero values of the
method-by-method matrix and both structural and semantic measures. However, the
results also confirmed that the weights that produce optimal results are different
across the different object systems.
In consequence, we propose the use of Principal Component Analysis (PCA) on the method coupling data to identify a heuristic able to set up a customized configuration of the weights for different software systems, resulting in near-optimal
6 The complete results achieved with all possible combinations of parameters can be found in Bavota
et al. (2012).
7 The interested reader can find the interaction plots for all systems in our online appendix (Bavota
et al. 2012).
performances of our technique. We argue that PCA allows us to identify the different
dimensions that describe a phenomenon (in our case, the coupling between pairs of
methods) and obtain an indication of the importance of each dimension (captured
by one or more coupling measures) in the description of this phenomenon (i.e.,
the proportion of variance). Table 3 shows the results of the PCA on all the object
systems. As we can see, the semantic measure is identified by the PCA as the measure
that describes most of the coupling between pairs of methods. In particular, the
proportion of variance for the semantic similarity measure is higher than 0.6 for all
the object systems. Moreover, in general, both structural measures are important, as
they describe some of the relationships between pairs of methods. This confirms the
finding previously highlighted from the analysis of Tables 1 and 2.
As expected, the proportion of variance values are rather different across the
different systems, so our question was whether using the proportion of variance values to define the weights of the similarity measures provides MoJoFM results close to the optimal results shown in Table 2. Table 4 compares the results obtained
using the configuration parameters identified by the PCA proportion of variance
(PCA-based configuration) with the best results obtained in our experimentation (best configuration). As we can see, the difference between the reconstruction accuracy of the PCA-based configuration and the accuracy obtained using the best configuration is very small. Indeed, the difference in MoJoFM is never
higher than 0.04. We also executed the Wilcoxon test to compare the accuracy of the
two different configurations. The results on all object systems do not highlight any
statistically significant difference. This indicates that the PCA-based configuration
provides an accuracy similar to the best accuracy obtained by exercising all possible
parameter configurations. Given these findings we propose the following heuristics
to set the parameters of our approach in a real usage scenario:
– weights: perform a Principal Component Analysis on the values of the similarity measures computed on all the classes of the system. The value of the proportion of variance obtained for each measure will be used as the weight for the corresponding measure.
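The PCA step of this heuristic can be sketched with a standard eigen-decomposition (a sketch with numpy; how a proportion of variance is attributed to each individual measure is not detailed here, so attributing to each measure the variance of the principal component it loads most heavily on is our assumption):

```python
import numpy as np

def pca_weights(X):
    """X: rows = method pairs, columns = (SSM, CDM, CSM) values.
    Returns one weight per measure, summing to one. Each measure is
    credited with the proportion of variance of the principal
    component(s) it loads most heavily on (our reading)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize columns
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]               # components by variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    proportion = eigvals / eigvals.sum()
    weights = np.zeros(X.shape[1])
    for k in range(X.shape[1]):                     # dominant loading per component
        measure = np.argmax(np.abs(eigvecs[:, k]))
        weights[measure] += proportion[k]
    return weights / weights.sum()
```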
Using the artificial Blobs from the five open source systems, we also compared the
reconstruction accuracy of the proposed approach with the accuracy achieved by the
Max Flow-Min Cut approach, which is based on the same graph-based representation
of a class. Since the Max Flow-Min Cut approach is only able to split a Blob into two
classes, we performed this comparison only on the artificial Blobs created by merging
two classes.
[Fig. 7: Average MoJoFM per system when merging two classes, our approach vs. the
Max Flow-Min Cut approach (Eclipse: 0.90 vs. 0.71; GanttProject: 0.87 vs. 0.70;
JHotDraw: 0.90 vs. 0.73; Xerces: 0.83 vs. 0.64)]

Figure 7 reports the results achieved using, for each approach, its best configuration
of parameters. As we can see, the reconstruction accuracy of our new approach is
always better than the reconstruction accuracy obtained with the Max Flow-Min Cut
approach. In particular, the average difference of MoJoFM
is 16.8 %. Note that our new approach not only improves on the reconstruction
accuracy of the Max Flow-Min Cut approach, but it also automatically derives that
the artificial Blobs have to be split into two classes, whereas the Max Flow-Min Cut
approach splits the artificial Blobs into two classes by construction.
We statistically analyzed the performance of the two approaches using the Mann-
Whitney test (Conover 1998). We chose this test because we cannot assume normality
of the data, and the test makes no normality assumptions. In particular, we used the
test to assess the statistical significance of the difference between the reconstruction
accuracies provided by the two approaches. Results were considered statistically
significant at α = 0.05. Table 5 reports the achieved results. As we can see, the
reconstruction accuracy of our new approach is significantly higher than that
achieved by the Max Flow-Min Cut approach, for each system.
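Such a comparison can be replicated, for instance, with SciPy's `mannwhitneyu`; the per-Blob MoJoFM vectors below are made-up illustrative data, not the paper's measurements.

```python
from scipy.stats import mannwhitneyu

# Hypothetical per-Blob MoJoFM values for the two approaches
# (illustrative data only).
mojofm_new     = [0.85, 0.90, 0.88, 0.92, 0.87, 0.91, 0.89, 0.93]
mojofm_maxflow = [0.61, 0.65, 0.62, 0.63, 0.67, 0.64, 0.66, 0.60]

# One-sided test: is the new approach's accuracy significantly higher?
stat, p_value = mannwhitneyu(mojofm_new, mojofm_maxflow, alternative="greater")
significant = p_value < 0.05  # alpha level used in the paper
```

With every value of the first sample above every value of the second, the U statistic reaches its maximum (n1 x n2) and the one-sided p-value is far below 0.05.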
This result may seem surprising as the two approaches use the same structural
and semantic measures and the same graph-based representation of the class to split.
The difference is in the algorithm adopted to split the graph into sub-graphs. Thus,
to understand the reasons for the performance gap between the two approaches, it is
important to point out the differences between the two algorithms:
1. Both algorithms include a filtering step aimed at removing spurious connections
   between nodes. Indeed, due to the use of the semantic similarity between meth-
   ods (which is very unlikely to be exactly zero), the initial graph representation is
   in general a complete graph (i.e., it contains all possible edges). In this case, the
   Max Flow-Min Cut algorithm would always split a graph containing n nodes into
   two graphs containing n − 1 nodes and 1 node, respectively (Cormen et al. 2001).
   So, filtering and removing some edges is needed to avoid a trivial application
   of the Max Flow-Min Cut algorithm. However, filtering in this case does not
   disconnect the graph, as the Max Flow-Min Cut algorithm is the one splitting the
   graph.
   On the other hand, the filtering step of our new approach is much more aggressive,
   as it aims at splitting the graph into subgraphs representing loosely coupled
   components. This step is therefore key to our new extract class refactoring method.
   The other steps of the new method consist of (i) identifying chains of nodes
   belonging to the same subgraph (and thus methods belonging to the same class)
   and (ii) aggregating the small subgraphs (i.e., the trivial chains composed of fewer
   than three methods) with the most coupled non-trivial chain previously identified.
   This merging step is also very important to correct possible over-splitting
   introduced by the filtering step.
2. The Max Flow-Min Cut algorithm needs as input the source and sink nodes,
   which ideally represent two methods belonging to the two different classes to be
   extracted from the Blob. In our previous work (Bavota et al. 2011) the heuristic
   used to identify the source and sink nodes selects the two nodes in the graph
   connected by the edge with the lowest weight, i.e., the two least coupled methods
   (according to the structural and semantic similarity measures used) in the Blob
   class. Clearly, in some cases this heuristic does not work properly and selects
   two methods that should instead be in the same class. In this case the splitting
   performed by the algorithm is negatively affected, since it is guided by wrong
   initial assumptions. Our new technique does not suffer from similar problems, as
   the splitting is performed by the filtering step. In fact, the new technique helped
   reveal this previously unnoticed problem with the selection of the source and
   sink nodes.
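The filtering-and-chains procedure described in point 1 can be sketched as follows. The function name, the input format (a map from unordered method pairs to a combined structural+semantic weight), and the single global threshold are our own simplifying assumptions, not the paper's implementation.

```python
from collections import defaultdict

def split_into_chains(methods, sim, threshold, min_size=3):
    """Sketch of the chain-based splitting step (hypothetical names).
    sim maps frozenset({m1, m2}) -> combined similarity weight."""
    # Step 1: filtering -- drop edges below the threshold so that the
    # (otherwise near-complete) graph falls apart into loosely coupled parts.
    adj = defaultdict(set)
    for pair, weight in sim.items():
        if weight >= threshold:
            a, b = tuple(pair)
            adj[a].add(b)
            adj[b].add(a)
    # Step 2: chains = connected components of the filtered graph.
    chains, seen = [], set()
    for m in methods:
        if m in seen:
            continue
        component, stack = set(), [m]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        chains.append(component)
    # Step 3: merge each trivial chain (< min_size methods) into the
    # non-trivial chain it is most coupled with, using the unfiltered weights.
    non_trivial = [c for c in chains if len(c) >= min_size]
    trivial = [c for c in chains if len(c) < min_size]

    def coupling(c1, c2):
        return sum(sim.get(frozenset((x, y)), 0.0) for x in c1 for y in c2)

    for t in trivial:
        target = max(non_trivial, key=lambda c: coupling(t, c))
        target |= t
    return non_trivial
```

Each returned set of methods then becomes one extracted class; note that, unlike Max Flow-Min Cut, the number of classes is derived automatically from the number of non-trivial chains.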
One can argue that by iteratively applying an algorithm that splits a class in two, we
can obtain more than two new classes from the old one. So, we also tried to apply
the Max Flow-Min Cut approach iteratively to refactor the artificial Blobs created
by merging three classes (M1, M2, and M3) together. In this case, in the first it-
eration the Max Flow-Min Cut algorithm splits the artificial Blob into two classes
E1 and E2. For the iterative usage to be useful, one of the extracted classes (say E1)
should contain most of the methods of one of the original classes (say M1), while the
second extracted class (E2) should contain most of the methods of the other two
original classes (M2 and M3). The approach should then be re-applied to E2 in order
to extract M2 and M3, thus reconstructing the original classes. However, this rarely
happens: the methods of the three original classes are usually distributed more
evenly between E1 and E2 after the first iteration.
To verify this, we applied the Max Flow-Min Cut approach on the artificial Blob in
the first iteration and on both the extracted classes in the second iteration. Then, we
selected as refactoring solution the one achieving the highest reconstruction accuracy
(i.e., the highest MoJoFM) between the two generated. For example, suppose that E1
and E2 are the two classes extracted from the artificial Blob at the first iteration. In
the second iteration we apply the Max Flow-Min Cut approach on both E1 and E2
obtaining the classes E3 and E4 extracted from E1 and E5 and E6 extracted from
E2. We then compute the reconstruction accuracy of the following two sets of classes:
S1 = {E1, E5, E6} and S2 = {E2, E3, E4}. Supposing that the MoJoFM achieved by
S1 is 0.7 while the one achieved by S2 is 0.6, S1 is selected as the refactoring solution.
Figure 8 reports the achieved results. As we can see, the iterative application
of the Max Flow-Min Cut approach produces worse results than those achieved by
our new technique. The performance gap in this scenario is substantial: our approach
obtained a reconstruction accuracy, in terms of MoJoFM, 22 % higher on average
than the Max Flow-Min Cut approach.
The empirical results show a high reconstruction accuracy for the proposed approach.
A threat that could affect the validity of this result is that our approach was applied
on artificial Blobs, and thus reconstructing previously merged classes might be
trivial. However, in Section 4.3 we mitigated such a threat by showing that the Max
Flow-Min Cut approach is not able to reach the same reconstruction accuracy when
applied on the same set of artificial Blobs. To further mitigate this threat, we also
analyzed the coupling between the classes to be merged, in order to understand
whether there is a correlation with the reconstruction accuracy of our approach. If
two merged classes have no coupling between them, then the
outcome of splitting might be close to perfect.

[Fig. 8: Average MoJoFM per system when merging three classes, our approach vs.
the iterative Max Flow-Min Cut approach (Eclipse: 0.75 vs. 0.48; GanttProject: 0.75
vs. 0.53; JHotDraw: 0.77 vs. 0.54; Xerces: 0.68 vs. 0.52)]

On the other hand, if their coupling is high, it might be "translated" into similarity
between the members of the merged class,
affecting the results. We used the Conceptual Coupling Between Classes (CCBC)
(Poshyvanyk et al. 2009) and the information-flow-based coupling (ICP) (Lee et al.
1995) to measure the coupling between the merged classes. Then, we measured the
statistical correlation between the coupling of the merged classes and the splitting
accuracy by computing the Pearson product-moment correlation coefficient (PMCC)
(Cohen 1988). The results revealed no correlation on any of the object systems.
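The PMCC itself is straightforward to compute from its definition; the sketch below is generic and not tied to the paper's data (the inputs would be one coupling value, e.g., CCBC or ICP, and one MoJoFM value per artificial Blob).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient between two samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

A coefficient near 0 between the coupling of the merged classes and the MoJoFM of the split indicates no linear relationship, which is what "no correlation" means here.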
In the context of our study, the following research questions were formulated:
– RQ1: What is the impact of the refactoring suggested by our approach on class
  cohesion and coupling?
– RQ2: Does the proposed refactoring result in a better division of responsibili-
  ties, from a developer's point of view?
To answer our first research question (RQ1) we analyzed the changes in cohesion
and coupling in the object systems when applying the refactoring operations
suggested by our approach. We expected an increase in cohesion (desired effect),
due to splitting the responsibilities implemented in the Blobs into different classes.
However, we also expected an increase in coupling (side effect), since splitting a class
into several classes usually increases the total number of dependencies between
classes. For these reasons, coupling and cohesion should be measured together to
make a proper judgment on the complexity and quality of a system, since an improve-
ment in cohesion usually comes at the expense of an increase in coupling, and vice
versa (Stewart et al. 2006). To measure the cohesion of the analyzed classes we used
the LCOM and C3 metrics, while for coupling we used the MPC metric, since it
allows us to understand whether the interactions due to method calls between the
extracted classes increase after the refactoring operations suggested by our approach.
The cohesion and coupling metrics were measured before and after refactoring. To
this aim, we applied the refactoring operations suggested by our approach using the
extract class functionality of Eclipse. As a benchmark, we compared the results ob-
tained applying our new approach with those obtained using the Max Flow-Min Cut
approach.
With regard to research question RQ2, we analyzed the refactoring operations
proposed by our approach from the developers' point of view. To this aim, we
performed two experiments involving a total of 50 Master's students in Computer
Science from the University of Salerno.8 Before the experiment, the students at-
tended a two-hour seminar about the most common refactoring techniques, their
objectives, and their usefulness during the software lifecycle. During the semester in
which the experimentation was carried out, the students were attending courses on
Advanced Software Engineering, Advanced Databases, Programming Languages and
Compilers, and Advanced Computer Networks. As for their background, all students
had in their Bachelor curriculum at least one exam on Object-Oriented programming
(in Java) and one on software engineering. Students participated in the study
voluntarily and no selection process was performed (i.e., all students who volunteered
were accepted). Finally, during the experiment students were allowed to leave, but no
one did.
The first experiment involved 30 students, who evaluated three different refactor-
ing operations for each of the seven Blobs from the GanttProject system: (i) the
refactoring suggested by our approach, (ii) the refactoring suggested by the Max
Flow-Min Cut approach, and (iii) a random refactoring. The second option was used
to provide the students with an alternative refactoring solution that makes sense,
but that is likely worse than the refactoring solution suggested by our approach
(at least according to the results obtained in Section 4). The last option likely does
not make sense as a refactoring solution and was only considered to verify whether
participants took the assignment seriously (i.e., a sanity check). For each of the
proposed refactorings the students had to express their level of agreement with the
claim "The proposed refactoring results in a better division of responsibilities" by
assigning a score on a Likert scale (Oppenheim 1992): 1: Strongly disagree; 2:
Disagree; 3: Neutral; 4: Agree; 5: Fully agree. The students had 140 min to perform
the assigned task (on average 20 min for each Blob). The second experiment was
conducted using the same design, but involved 20 subjects and was performed on the
ten Blobs of the Xerces system, with a time limit of 200 min (the same 20 min
average for each Blob).
To answer the research question RQ2 , the results achieved in the two experiments
were analyzed through boxplots and statistical tests. For the statistical analysis, we
decided to use the Mann-Whitney test (Conover 1998) since we cannot assume
normality of the data. We collected the ranking for each of the three proposed
refactoring solutions. Then, for each pair of considered approaches (e.g., our new
approach vs. the random refactoring), we used the Mann-Whitney test to analyze the
statistical significance of the difference between the scores assigned by the students
to the refactoring solutions of the two approaches. Results were considered
statistically significant at α = 0.05.
Table 6 reports information about the Blobs object of our study before and after the
refactoring suggested by our approach, and in particular the LOC and the number of
methods.9
9 In the number of methods we do not count the constructors (for both pre- and post-refactoring)
nor any getter and setter methods that would be added after the refactoring. In this way the sum
of the methods of the extracted classes is equal to the number of methods of the Blob class.
10 The interested reader can find the results of the Max Flow-Min Cut approach for each Blob in our
online appendix (Bavota et al. 2012).
Table 6 Refactoring solutions proposed by our approach on the 17 Blobs object of our study
System        Class                      Split classes   Pre-refactoring      Post-refactoring
                                                         LOC      Methods     LOC      Methods
Xerces AbstractDOMParser 2 1,775 45 522 15
1,259 30
Xerces AbstractSAXParser 3 1,360 55 155 11
241 12
967 32
Xerces BaseMarkupSerializer 2 1,275 61 152 10
1,123 51
Xerces CoreDocumentImpl 3 1,497 119 82 11
79 14
1,350 94
Xerces DeferredDocumentImpl 2 1,612 76 1,061 34
630 42
Xerces DOMNormalizer 2 1,291 31 33 13
1,268 18
Xerces DOMParserImpl 2 820 17 454 7
431 10
Xerces DurationImpl 2 953 44 351 16
640 28
Xerces NonValidatingConfiguration 2 403 18 123 3
284 15
Xerces XIncludeHandler 4 1,331 111 372 32
440 26
169 17
524 36
GanttProject GanttOptions 3 513 68 438 51
72 11
37 6
GanttProject GanttProject 3 2,269 90 2,086 71
124 6
118 13
GanttProject GanttGraphicArea 2 2,160 43 2,025 32
197 11
GanttProject GanttTree 2 1,730 48 1,382 42
423 6
GanttProject GanttTaskPropertiesBean 2 1,685 27 1,164 21
524 6
GanttProject ResourceLoadGraphicArea 2 1,060 29 873 21
227 8
GanttProject TaskImpl 3 329 46 234 27
69 10
44 9
the MPC of the original Blob is 573, while the sum of the MPC of the extracted
classes is 588. Thus, the percentage increase in terms of MPC is only about 3 %. The
increase in coupling considering all the classes affected by the refactoring is also just
+4 % (from 1,650 to 1,724).
Table 10 reports the average MPCextracted and MPCaffected coupling values for
the two systems before refactoring, after refactoring with our approach, and after
refactoring with the Max Flow-Min Cut approach. As shown in Table 10, the
increase in coupling generated by our approach (+3.2 %) is only slightly higher
than that of the Max Flow-Min Cut approach (+2.3 %). Note that the (slightly)
better performance in terms of coupling ensured by the Max Flow-Min Cut approach
is an expected result, since it extracts a lower number of classes from the Blobs than
our approach (i.e., 34 vs. 41). This clearly results in fewer dependencies between the
extracted classes.

Table 8 Average cohesion: our approach vs. Max Flow-Min Cut approach
System     Pre-refactoring     Our approach       Max Flow-Min Cut approach
           LCOM      C3        LCOM      C3       LCOM      C3
Xerces     1,504     0.11      256       0.23     588       0.21
Gantt      1,033     0.16      242       0.31     310       0.27
Average    1,310     0.13      257       0.27     473       0.24
As for the coupling measured over all classes involved in the refactoring operations
(columns MPCaffected), the increase is also very small in percentage terms. In
particular, our approach increases the number of dependencies for the classes
involved in the refactoring, on average, from 1,233 to 1,260 (+2.2 %), while the Max
Flow-Min Cut approach (Bavota et al. 2011) increases it from 1,233 to 1,254 (+1.7 %).

Table 10 Average coupling: our approach vs. the Max Flow-Min Cut approach
System     Pre-refactoring             Our approach                Max Flow-Min Cut approach
           MPCextracted  MPCaffected   MPCextracted  MPCaffected   MPCextracted  MPCaffected
Xerces     371           1,054         382           1,080         381           1,076
Gantt      528           1,489         547           1,519         540           1,510
Average    436           1,233         450           1,260         446           1,254
In summary, both approaches are able to strongly increase class cohesion while
paying a small price in terms of increased coupling. Our approach achieves a higher
increase in cohesion than the Max Flow-Min Cut approach, but it also increases the
coupling between classes slightly more (an expected result).
11 A fine-grained analysis of the scores assigned by the students is reported in our online appendix
(Bavota et al. 2012).
suggested by the Max Flow-Min Cut approach obtains a statistically significantly
higher score than the random splitting in both experiments.
The quantitative data gathered from the subjects allow us to positively answer our
second research question (RQ2): the proposed refactoring results in a better division
of responsibilities from a developer's point of view. However, to gain deeper insight
into the scores provided by the students, we also analyzed some of the refactoring
operations proposed by our approach that generally received good (or bad) scores
from the students.
Operations Positively Evaluated by Students For the Xerces system, two refactoring
operations positively evaluated by almost all the students concern the AbstractSAX-
Parser and XIncludeHandler classes. We observed that one of the classes extracted
from the AbstractSAXParser class can be classified as an Entity class, since it contains
only a set of attributes and the corresponding getter and setter methods. The
refactoring of the XIncludeHandler class is particularly interesting for two reasons:
(i) the refactoring operation suggested by our novel approach achieved in this case
the highest possible average score, i.e., 5 (the refactoring operation suggested by the
Max Flow-Min Cut approach for the same Blob obtained an average score of 2.7),
and (ii) this is the case with the highest difference in the number of extracted classes
with respect to the Max Flow-Min Cut approach (4 vs. 2).
To better analyze this case, Fig. 11 reports the topic maps (Kuhn et al. 2007)
representing the main topics of (i) the original Blob, (ii) the classes extracted by our
approach, and (iii) the classes extracted using the Max Flow-Min Cut approach. The
topic map for a class C is computed by analyzing the term frequency in the methods
of C. In particular, for each term present in C (excluding Java keywords), we count
the number of methods that contain it. The five most frequent terms, i.e., the terms
present in the highest number of methods, are then used to construct the topic map
of C, which is therefore represented by a pentagon where each vertex represents
one of the main topics. Each vertex is connected to the center of the pentagon by
an axis representing the percentage of methods in the class that implement the
corresponding topic. The graphical representation of the main topics of C is then
obtained by tracing lines between the points on each of the five axes indicating the
percentage of methods belonging to C that implement the corresponding topic. The
methods in XIncludeHandler implement the XInclude handling of XML documents
according to the W3C recommendations. The XInclude functionality allows reusing
an XML document by including it in other XML documents. The main topics in the
class are reported on the right side of Fig. 11. As we can see, the most frequent terms
are: XML (the kind of document involved), DTD (Document Type Definition, a set
of declarations used to define the document type for markup languages like XML),
Include (the main responsibility of the class), Error (the management of the possible
errors derived from the XInclude operation), and Augmentations (the infoset augmen-
tation that can be used to modify an XML infoset during schema validation). Thus,
although the main responsibility of this class is the implementation of the XInclude
handling, it also implements some auxiliary (and poorly related) responsibilities. The
application of our approach to XIncludeHandler produced four new classes, each one
specialized in one particular responsibility: Class1 primarily deals with the Document
Type Definition, Class2 with the infoset augmentation of XML documents, Class3
with the implementation of the XInclude operation, and Class4 with the management
of the possible errors derived from the XInclude operation. Concerning the refactoring
proposed by the Max Flow-Min Cut approach (bottom part of Fig. 11), we can
observe that the two extracted classes still represent a mixture of different topics,
although the distribution of responsibilities is better than in the original Blob.
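The topic-map construction described above can be sketched as follows; the input format (one source string per method) and the reduced keyword list are our own simplifications.

```python
import re
from collections import Counter

# A small illustrative stop list; a real implementation would use the full
# set of Java keywords.
JAVA_KEYWORDS = {"public", "private", "protected", "void", "int", "return",
                 "new", "if", "else", "for", "while", "class", "static",
                 "final", "this"}

def topic_map(method_bodies, top_n=5):
    """method_bodies: one source string per method of the class (illustrative
    input format). Returns the top_n most widespread terms together with the
    fraction of methods containing each one -- the axes of the pentagon."""
    per_method_terms = [
        {t.lower() for t in re.findall(r"[A-Za-z]+", body)} - JAVA_KEYWORDS
        for body in method_bodies
    ]
    counts = Counter()  # term -> number of methods containing it
    for terms in per_method_terms:
        counts.update(terms)
    n = len(method_bodies)
    return [(term, count / n) for term, count in counts.most_common(top_n)]
```

Each returned fraction corresponds to the distance from the pentagon's center along one axis of the radar chart.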
In this section we discuss the threats that could affect the validity of the results of this
study.
12 To avoid bias in the experiment, none of the authors was involved in this evaluation.
experiments. These students were familiar with the two systems and also known to
the authors as extremely serious and reliable. Table 12 reports the answers provided
for each analyzed class. As we can see, they generally assigned high scores to the
analyzed refactoring operations. Moreover, the cases where the students negatively
evaluated the proposed refactoring operations are almost the same as those identified
in the two experiments, e.g., DeferredDocumentImpl and NonValidatingConfiguration.
We are quite confident that the results achieved in our studies reflect well the quality
of the refactoring solutions proposed by the approaches.
However, the threats related to this experimental design still remain, due to the
nature of the study. The user study presented in Section 6 provides a qualitative eval-
uation of the proposed approach, in part to overcome some of the threats discussed
here.
In the context of this study, the following research question has been formulated:
– RQ3: Are the refactoring solutions suggested by our approach useful for devel-
  opers when performing extract class refactoring?
To obtain the objects needed for our study, we mined six open source systems (i.e.,
Apache HSQLDB, ArgoUML, JEdit, JFreeChart, JHotDraw, Xerces) looking for
extract class refactoring operations performed during their history by the original
developers. We used Ref-Finder (Prete et al. 2010) to identify the refactoring
13 None of the 50 students involved in the user study reported in Section 5 was involved in this
experiment.
operations performed between two subsequent versions of the same system. Ref-
Finder is a tool able to identify 63 different types of refactoring, but unfortunately
not extract class. However, the latter can be identified by Ref-Finder as a set of
move method and move field operations from the original class to the new extracted
classes. We manually validated the sets of move method and move field refactorings
retrieved by Ref-Finder to identify extract class refactoring operations performed
by the original developers. In total, we identified eleven meaningful extract class
refactoring operations performed by the original developers, presented in Table 13.
To answer our research question, we provided each subject with the eleven classes to
refactor, together with the refactoring solution proposed by our approach. Then, since
for each of the eleven identified classes we have the original class as well as the new
classes extracted by the developers, we can answer RQ3 from both a qualitative and
a quantitative point of view.
As for the qualitative analysis, we asked the subjects the following questions:
Table 13 Extract class refactoring operations identified in the six analyzed systems
System Original class Extracted classes
Apache HSQLDB Database (41) Database (27)
SchemaManager (14)
Select (14) Select (7)
Result (7)
UserManager (13) UserManager (10)
GranteeManager (3)
ArgoUML FileGeneratorAdapter (9) FileGeneratorAdapter (3)
TempFileUtils (6)
Import (10) Import (7)
ImportCommon (3)
JEdit JEditTextArea (214) JEditTextArea (22)
SelectionManager (11)
TextArea (181)
JFreeChart JFreeChart (24) JFreeChart (16)
Plot (8)
NumberAxis (20) NumberAxis (16)
ValueAxis (4)
JHotDraw DefaultApplicationModel (14) DefaultApplicationModel (4)
AbstractApplicationModel (10)
Xerces XMLDTDValidator (69) XMLDTDValidator (38)
XMLDTDProcessor (31)
XMLSerializer (25) XMLSerializer (12)
DOMWriterImpl (13)
In parentheses, the number of methods in each class.
iii. Did you find the provided refactoring solution useful as a starting point to
     perform your refactoring? Why?
(b) if NO:
i. Why?
As for the quantitative analysis, we measured how much the refactoring produced
by the students (i) differed from the solution suggested by our approach and
(ii) approximated the refactoring performed by the original developers. We used
MoJoFM (Wen and Tzerpos 2004) to measure the similarity between the refactorings
performed by the students, those proposed by our approach, and those performed
by the original developers. Moreover, for each of the eleven classes object of our
study, we also measured how far the refactoring suggestion proposed by our
approach is from the refactoring performed by the original developers.
For each of the eleven classes object of our study, Table 14 shows the percentage of
subjects answering "YES" to the three YES/NO questions of our survey. For example,
13 out of the 15 students involved (87 %) would split the class Database, and 8 of
them (62 %) would split the class differently than the solution suggested by our ap-
proach. However, all these 13 subjects found the provided refactoring solution (i.e.,
the one proposed by our approach) a good starting point to perform the refactoring.
The analysis of Table 14 reveals interesting results. First of all, the subjects would
not always split the provided classes. In particular, there are two of the eleven classes
(i.e., Select and Import) for which the majority of the students did not feel that extract
class refactoring was needed. As explained in the design, we also asked subjects why
they would (or would not) split each class. Analyzing these answers we found that
the subjects judged the complexity of both classes acceptable and were not able to
identify different responsibilities implemented in them. Clearly, this result contrasts
with the choice made by the original developers. However, analyzing the refactoring
performed by the original developers on the classes Select and Import, it is clear that
their choice was not driven by the high complexity of those classes or by the high
number of responsibilities implemented in them, but rather by the desire to adhere
to a specific architectural style. In fact, these classes implement a rather low number
of methods (i.e., 14 in Select and 10 in Import), and the original developers performed
the extract class refactoring to split both classes into a Model class, responsible for
modeling an entity in the system, and a Controller class working on the Model.
These kinds of refactoring can also be identified by our approach, since methods
implementing a Model class generally share several instance variables, while
methods implementing a Controller class generally have a high number of method
calls among them, since they co-operate in the implementation of some functionality.
In fact, for the Import class our approach proposes exactly the same refactoring
performed by the original developers, and the six students who would split this class
accepted the suggested refactoring as is (and, clearly, found the suggestion useful).
Moreover, three of them were also able to motivate their choice by explaining that
“the class Import seems a merge between a Model and a Controller”, “it is possible to
extract a model class”, and “to improve its reusability a model class can be extracted”.
As for the Select class, the four subjects who would split it accepted the suggestion of
our approach as is, explaining its usefulness with the fact that "the extracted classes
looks strongly cohesive".
On the other hand, there are four classes that all subjects would split
(i.e., UserManager, JEditTextArea, JFreeChart, and XMLDTDValidator). For the
UserManager class, the subjects explained that this class is responsible for "more
than just managing the users" and that they could identify "two different responsibilities
implemented in it". Ten subjects (67 %) found the suggestion of our tool useful,
explaining that "it eases code comprehension" by "highlighting the main responsi-
bilities implemented in the class". Our approach splits each of these classes into two
new classes. It is interesting to note that 87 % of the subjects (13 out of 15) modified
the suggested refactoring solution, and all of them moved just one method from one
of the extracted classes to the other, obtaining exactly the refactoring performed by
the original developers. Concerning the 33 % of subjects who did not find the
suggestion of our approach useful for this class, most of them motivated this answer by
explaining that "the class was not really complex" and thus "its main responsibilities
can be identified without any suggestion". However, none of them complained about
the quality of the proposed refactoring.
Also interesting is the case of the JEditTextArea class, for which all subjects (i)
thought that a refactoring was necessary, (ii) would change the proposed
refactoring, and (iii) thought that the suggested refactoring was useful as a starting
point. As for the reasons to split this class, most of the subjects explained that "the
class is very complex", "intricate", and "seems to have low cohesion".
Our technique extracts three new classes from JEditTextArea. All subjects suggested
that two of these classes could be merged together, and four of them also extracted
a new class managing "the scrolling of a text area". However, all subjects found
the starting refactoring suggestion useful, commenting that "the proposed division of
responsibilities makes sense, but perhaps it is a bit excessive". This motivation explains
the fact that all of them merged together two of the three extracted classes. Looking
at the refactoring performed on this class by the original developers, we found that
JEditTextArea was actually split into three new classes. However, two of the classes
extracted by our approach (i.e., the two generally merged together by subjects) were
grouped into one single class by the original developers. Thus, the choice made by
the subjects of merging together two of the three extracted classes looks reasonable.
For other classes, like JFreeChart and XMLDTDValidator, all students accepted
the refactoring suggestion as is, commenting in most cases that "the extracted
responsibilities were meaningful" and "cohesive classes were extracted".
Finally, another interesting case concerns the NumberAxis class. Most of the
students (87 %) would split this class, since "the management of the axis values
should be extracted". Our approach suggested splitting this class into three new
classes. While 38 % of the students accepted this suggestion as is, the other 62 %
changed it by merging two of the three suggested classes. The students who changed
the refactoring proposed by our approach (with just a Join operation) were able to
replicate the refactoring performed by the original developers. Overall, all subjects
appreciated the refactoring suggestion, highlighting that it "eases the comprehension
of the main responsibilities implemented in a class".
Summarizing, except for the case of the UserManager class discussed above,
subjects always found the solutions suggested by our approach useful when
performing refactoring. Among the most frequent explanations we found:
1. it eases code comprehension;
2. it highlights the main responsibilities implemented in a class;
3. the extracted classes are cohesive.
Moreover, subjects stated that in some cases “without the refactoring suggestion it
would be too difficult to identify the main responsibilities of the classes”.
Concerning the quantitative data, Table 15 reports the average MoJoFM between
(i) the refactoring suggested by our approach and that performed by the original
developers, (ii) the refactoring performed by the subjects and the refactoring
suggested by our approach, and (iii) the refactoring performed by the subjects
(starting from the suggestions of our approach) and the refactoring performed by
the original developers. Table 15 also reports the number of Move/Join operations
needed to convert one refactoring into the other.

Table 15 MoJoFM between (i) the refactoring suggested by our approach and that performed by
the original developers, (ii) the refactoring performed by subjects and the refactoring proposed by
our approach, and (iii) the refactoring performed by subjects and that performed by the original
developers

Class                          Our approach to        Subjects to our        Subjects to
                               original dev.          approach               original dev.
                               MoJoFM  #Move/Join     Avg.     Avg.          Avg.     Avg.
                                                      MoJoFM   #Move/Join    MoJoFM   #Move/Join
Database (41)                  0.97    1               0.98     0.6           0.99     0.5
Select (14)                    0.83    2               1.00     0             0.83     2
UserManager (13)               0.93    1               0.94     0.9           0.98     0.3
FileGeneratorAdapter (9)       0.86    1               0.92     0.6           0.94     0.4
Import (10)                    1.00    0               1.00     0             1.00     0
JEditTextArea (214)            0.84    34              0.97     6             0.87     27
JFreeChart (24)                0.95    1               1.00     0             0.95     1
NumberAxis (20)                0.94    1               0.96     0.6           0.98     0.4
DefaultApplicationModel (14)   0.92    1               0.99     0.2           0.92     1
XMLDTDValidator (69)           0.88    8               1.00     0             0.88     8
XMLSerializer (25)             0.91    2               0.97     0.7           0.94     1.4
Average                        0.91    4.7             0.98     0.9           0.93     3.8

In parentheses: the number of methods in the class.
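As a reminder, MoJoFM (Wen and Tzerpos 2004) normalizes the MoJo distance mno(A, B), i.e., the minimum number of Move and Join operations needed to transform decomposition A into decomposition B, by its worst case over all possible starting decompositions (values in Table 15 are reported on a 0–1 scale rather than as percentages):

```latex
\mathrm{MoJoFM}(A,B) \;=\; 1 \;-\; \frac{\mathrm{mno}(A,B)}{\max_{\forall A'} \mathrm{mno}(A',B)}
```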
The first thing that stands out is that our approach approximates the refactorings
performed by the original developers extremely well, achieving an average MoJoFM
of 0.91. For example, for the Database class from the Apache HSQLDB system our
approach achieves a MoJoFM value of 0.97. This class was split into two new classes
by the original developers (see Table 6), one containing 27 methods and one
containing 13. Our approach splits the Database class into three classes. The first is
composed of the same 27 methods included in one of the two classes extracted by the
developers. The other two extracted classes contain the remaining 13 methods, eight
in one class and five in the other. Thus, by performing only one Join operation (i.e.,
merging the two smaller extracted classes) it is possible to obtain the refactoring
performed by the original developers.
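To make the Move/Join counting concrete, the sketch below converts a proposed decomposition into a reference one using a greedy plurality assignment. It is a rough illustration only, not the exact MoJo/MoJoFM algorithm of Wen and Tzerpos (2004), which computes an optimal cluster-to-group assignment; the numeric method identifiers are hypothetical stand-ins for the Database example.

```python
# Rough sketch of Move/Join counting between two class decompositions.
# NOT the exact Wen & Tzerpos MoJo algorithm (which uses an optimal
# cluster-to-group assignment); this greedy variant only illustrates how
# Move and Join operations convert one partition into another.
from collections import Counter

def greedy_moves_and_joins(proposed, reference):
    """proposed/reference: lists of sets of method identifiers."""
    # Map each method to the reference class containing it.
    ref_of = {m: i for i, grp in enumerate(reference) for m in grp}
    assignments = []  # reference group each proposed class is assigned to
    moves = 0
    for cls in proposed:
        tally = Counter(ref_of[m] for m in cls)
        target, kept = tally.most_common(1)[0]  # plurality reference group
        assignments.append(target)
        moves += len(cls) - kept  # methods that must be moved out
    # Proposed classes assigned to the same reference group get merged (Join).
    joins = len(assignments) - len(set(assignments))
    return moves, joins

# Database-like example: developers split 40 methods into two classes;
# the tool proposed three, one of which matches exactly.
dev = [set(range(27)), set(range(27, 40))]
tool = [set(range(27)), set(range(27, 35)), set(range(35, 40))]
print(greedy_moves_and_joins(tool, dev))  # (0, 1): a single Join suffices
```

Applied to this hypothetical Database-like split, the sketch reports zero Moves and one Join, matching the single Join operation described in the text.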
On average, 4.7 Move/Join operations are required to convert the refactoring
suggested by our approach into the refactoring performed by the original developers
(note that the median is 1). The only case requiring a rather high number of
Move/Join operations to convert the refactoring solution proposed by our approach
into the one performed by the developers is the JEditTextArea class, with 34
operations. However, it is worth noting that the original class was composed of 214
methods. Thus, in this case the 34 Move/Join operations represent a reasonably good
result, as demonstrated also by the high MoJoFM value achieved (0.84).
As a benchmark, we compared the performance of our approach with that
achieved by the Max Flow-Min Cut approach (Bavota et al. 2011). The gap in
performance between the two approaches is very large: 0.91 for our approach versus
0.62 for the Max Flow-Min Cut approach. An interesting case is the JEditTextArea
class, since it is the only one split by the original developers into three new classes.
In this case our approach achieves a MoJoFM of 0.84, against the 0.74 achieved by
the Max Flow-Min Cut approach. We also iteratively applied the Max Flow-Min Cut
approach, as explained in Section 4.4. In this case the MoJoFM even decreases to
0.71, again demonstrating the unsuitability of applying the Max Flow-Min Cut
approach iteratively.
The second important result of our study is that the refactorings suggested by our
approach were only slightly modified by the 15 subjects. In fact, the average MoJoFM
value is 0.98, and the average number of Move/Join operations required to convert
our approach’s suggestion into the refactoring performed by the subjects is less than
one (0.9). Moreover, starting from the suggestions of our approach, the subjects were
able to more closely approximate the refactoring performed by the original
developers, achieving an average MoJoFM value of 0.93. On average, only 3.8
Move/Join operations are needed to convert their refactorings into the refactorings
performed by the original developers.
This result highlights how the refactoring solutions suggested by our approach
represent a very good starting point for developers interested in performing extract
class refactoring operations. In fact, students with no knowledge of the object systems
were able to comprehend the classes and perform refactoring operations very close
to those performed by the original developers, who obviously have a much deeper
knowledge of these systems.
Here we discuss the main threats that could affect the validity of our results from this
study.
7 Conclusion
The results of our studies indicate that the refactoring solutions proposed by
our approach (i) strongly increase the cohesion of the refactored classes without
leading to a significant increase in coupling, (ii) are useful to developers performing
extract class refactoring, and (iii) approximate well the refactorings manually
performed by the original developers of open source systems. Moreover, the new
approach outperforms a previously proposed technique (Bavota et al. 2011) for
extract class refactoring.
Acknowledgements We would like to thank all the students who participated in our studies. We
would also like to thank the anonymous reviewers for their careful reading of our manuscript and
high-quality feedback. Their detailed comments have helped us to substantially revise, extend, and
improve the original version of this paper. Andrian Marcus was supported in part by grants from the
US National Science Foundation (CCF-0845706 and CCF-1017263).
References
Abadi A, Ettinger R, Feldman YA (2009) Fine slicing for advanced method extraction. In: 3rd
workshop on refactoring tools
Abdeen H, Ducasse S, Sahraoui HA, Alloui I (2009) Automatic package coupling and cycle min-
imization. In: Proceedings of the 16th working conference on reverse engineering. IEEE CS
Press, Lille, pp 103–112
Anquetil N, Fourrier C, Lethbridge TC (1999) Experiments with clustering as a software remodular-
ization method. In: Proceedings of the 6th working conference on reverse engineering. IEEE CS
Press, Atlanta, GA, pp 235–255
Arisholm E, Sjoberg D (2004) Evaluating the effect of a delegated versus centralized control style
on the maintainability of object-oriented software. IEEE Trans Softw Eng 30(8):521–534
Atkinson DC, King T (2005) Lightweight detection of program refactorings. In: Proceedings of the
12th Asia-Pacific software engineering conference. IEEE CS Press, Taipei, pp 663–670
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley
Basili VR, Briand L, Melo WL (1995) A validation of object-oriented design metrics as quality
indicators. IEEE Trans Softw Eng 22(10):751–761
Bavota G, De Lucia A, Oliveto R (2011) Identifying extract class refactoring opportunities using
structural and semantic cohesion measures. J Syst Softw 84:397–414
Bavota G, De Lucia A, Marcus A, Oliveto R (2010) A two-step technique for extract class refactoring.
In: Proceedings of 25th IEEE international conference on automated software engineering,
pp 151–154
Bavota G, De Lucia A, Marcus A, Oliveto R (2012) Automating extract class refactoring: an
improved approach and its evaluation. Online appendix [Link]
[Link]
Binkley AB, Schach SR (1998) Validation of the coupling dependency metric as a predictor of run-
time failures and maintenance measures. In: Proceedings of the 20th international conference on
software engineering. Kyoto, Japan, pp 452–455
Bodhuin T, Canfora G, Troiano L (2007) SORMASA: a tool for suggesting model refactoring actions
by metrics-led genetic algorithm. In: Proceedings of 1st workshop on refactoring tools. Berlin,
Germany, pp 23–24
Briand LC, Wuest J, Lounis H (1999a) Using coupling measurement for impact analysis in object-
oriented systems. In: Proceedings of the 15th IEEE international conference on software main-
tenance. IEEE Press, Oxford, pp 475–482
Briand LC, Wüst J, Ikonomovski SV, Lounis H (1999b) Investigating quality factors in object-
oriented designs: an industrial case study. In: Proceedings of the 21st international conference
on software engineering. ACM Press, Los Angeles, CA, pp 345–354
Brown WJ, Malveau RC, Brown WH, McCormick III HW, Mowbray TJ (1998) AntiPatterns:
refactoring software, architectures, and projects in crisis, 1st edn. John Wiley and Sons
Canfora G, Cimitile A, De Lucia A, Di Lucca GA (2001) Decomposing legacy systems into objects:
an eclectic approach. Inf Softw Technol 43(6):401–412
Casais E (1992) An incremental class reorganization approach. In: Proceedings of the 6th European
conference on object-oriented programming. Utrecht, the Netherlands, pp 114–132
Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Trans Softw
Eng 20(6):476–493
Christl A, Koschke R, Storey MA (2007) Automated clustering to support the reflexion method. Inf
Softw Technol 49(3):255–274
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum
Associates
Conover WJ (1998) Practical nonparametric statistics, 3rd edn. Wiley
Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to algorithms, 2nd edn, chap 26
(maximum flow). MIT Press and McGraw-Hill
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent
semantic analysis. J Am Soc Inf Sci 41(6):391–407
van Deursen A, Kuipers T (1999) Identifying objects using cluster and concept analysis. In: Proceed-
ings of the 21st international conference on software engineering. ACM Press, Los Angeles, CA,
pp 246–255
Du Bois B, Demeyer S, Verelst J (2004) Refactoring—improving coupling and cohesion of existing
code. In: Proceedings of 11th working conference on reverse engineering. IEEE CS Press, Delft,
pp 144–151
Fokaefs M, Tsantalis N, Chatzigeorgiou A, Sander J (2009) Decomposing object-oriented class
modules using an agglomerative clustering technique. In: Proceedings of the 25th international
conference on software maintenance. Edmonton, Canada, pp 93–101
Fowler M (1999) Refactoring: improving the design of existing code. Addison-Wesley
Girard JF, Koschke R (2000) A comparison of abstract data types and objects recovery techniques.
Sci Comput Program 36(2–3):149–181
Gui G, Scott PD (2006) Coupling and cohesion measures for evaluation of component reusability.
In: Proceedings of the 5th international workshop on mining software repositories. ACM Press,
Shanghai, pp 18–21
Gyimóthy T, Ferenc R, Siket I (2005) Empirical validation of object-oriented metrics on open source
software for fault prediction. IEEE Trans Softw Eng 31(10):897–910
Joshi P, Joshi RK (2009) Concept analysis for class cohesion. In: Proceedings of the 13th European
conference on software maintenance and reengineering. Kaiserslautern, Germany, pp 237–240
Khomh F, Vaucher S, Guéhéneuc YG, Sahraoui H (2009) A bayesian approach for the detection of
code and design smells. In: Proceedings of the 9th international conference on quality software.
IEEE CS Press, Hong Kong, pp 305–314
Koschke R, Canfora G, Czeranski J (2006) Revisiting the delta ic approach to component recovery.
Sci Comput Program 60(2):171–188
Kuhn A, Ducasse S, Gîrba T (2007) Semantic clustering: identifying topics in source code. Inf Softw
Technol 49(3):230–243
Lee Y, Liang B, Wu S, Wang F (1995) Measuring the coupling and cohesion of an object-oriented
program based on information flow. In: Proceedings of the international conference on software
quality. Maribor, Slovenia, pp 81–90
Li W, Henry S (1993) Maintenance metrics for the object oriented paradigm. In: Proceedings of the
first international software metrics symposium, pp 52–60
Liu Y, Poshyvanyk D, Ferenc R, Gyimóthy T, Chrisochoides N (2009) Modelling class cohesion as
mixtures of latent topics. In: Proceedings of the 25th IEEE international conference on software
maintenance. IEEE Press, Edmonton, pp 233–242
Maletic JI, Marcus A (2001) Supporting program comprehension using semantic and structural
information. In: Proceedings of the 23rd international conference on software engineering. IEEE
CS Press, Toronto, ON, pp 103–112
Marcus A, Poshyvanyk D, Ferenc R (2008) Using the conceptual cohesion of classes for fault
prediction in object-oriented systems. IEEE Trans Softw Eng 34(2):287–300
Marinescu R (2004) Detection strategies: metrics-based rules for detecting design flaws. In: Pro-
ceedings of the 20th IEEE international conference on software maintenance. IEEE Computer
Society, Washington, DC, pp 350–359
Maruyama K, Shima K (1999) Automatic method refactoring using weighted dependence graphs.
In: Proceedings of 21st international conference on software engineering. ACM Press,
Los Alamitos, CA, pp 236–245
Mens T, Tourwe T (2004) A survey of software refactoring. IEEE Trans Softw Eng 30(2):126–139
Moha N, Gueheneuc YG, Duchien L, Le Meur AF (2010) Decor: a method for the specification and
detection of code and design smells. IEEE Trans Softw Eng 36(1):20–36
Moore I (1996) Automatic inheritance hierarchy restructuring and method refactoring. In: Proceed-
ings of 11th ACM SIGPLAN conference on object-oriented programming, systems, languages,
and applications. ACM Press, San Jose, CA, pp 235–250
O’Keeffe M, O’Cinneide M (2006) Search-based software maintenance. In: Proceedings of 10th
European conference on software maintenance and reengineering. IEEE CS Press, Bari, pp 249–
260
Olbrich S, Cruzes DS, Basili V, Zazworka N (2009) The evolution and impact of code smells: a case
study of two open source systems. In: Proceedings of the 2009 3rd international symposium on
empirical software engineering and measurement, ESEM ’09, pp 390–400
Oliveto R, Gethers M, Bavota G, Poshyvanyk D, De Lucia A (2011) Identifying method friendships to
remove the feature envy bad smell (nier track). In: 33rd IEEE/ACM international conference
on software engineering—NIER Track. ACM Press, Hawaii, USA, pp 820–823
Oppenheim AN (1992) Questionnaire design, interviewing and attitude measurement. Pinter
Publishers
Poshyvanyk D, Marcus A, Ferenc R, Gyimóthy T (2009) Using information retrieval based coupling
measures for impact analysis. Empir Software Eng 14(1):5–32
Praditwong K, Harman M, Yao X (2011) Software module clustering as a multi-objective search
problem. IEEE Trans Softw Eng 37(2):264–282
Prete K, Rachatasumrit N, Sudan N, Kim M (2010) Template-based reconstruction of complex
refactorings. In: 26th IEEE international conference on software maintenance (ICSM 2010).
IEEE Computer Society, Timisoara, 12–18 September 2010, pp 1–10
Sartipi K, Kontogiannis K (2001) Component clustering based on maximal association. In: Proceed-
ings of the 8th working conference on reverse engineering. Stuttgart, Germany, pp 103–114
Seng O, Bauer M, Biehl M, Pache G (2005) Search-based improvement of subsystem decompo-
sitions. In: Proceedings of the genetic and evolutionary computation conference. ACM Press,
Washington, DC, pp 1045–1051
Seng O, Stammel J, Burkhart D (2006) Search-based determination of refactorings for improving
the class structure of object-oriented systems. In: Proceedings of the genetic and evolutionary
computation conference. Seattle, Washington, USA, pp 1909–1916
Simon F, Steinbr F, Lewerentz C (2001) Metrics based refactoring. In: Proceedings of the 5th
European conference on software maintenance and reengineering. IEEE CS Press, Lisbon,
pp 30–38
Stevens W, Myers G, Constantine L (1974) Structured design. IBM Syst J 13(2):115–139
Stewart KJ, Darcy DP, Daniel SL (2006) Opportunities and challenges applying functional data
analysis to the study of open source software evolution. Stat Sci 21(2):167–178
Tahvildari L, Kontogiannis K (2003) A metric-based approach to enhance design quality through
meta-pattern transformation. In: Proceedings of the 7th European conference on software main-
tenance and reengineering. Benevento, Italy, pp 183–192
Tonella P (2001) Concept analysis for module restructuring. IEEE Trans Softw Eng 27(4):351–363
Trifu A, Marinescu R (2005) Diagnosing design problems in object oriented systems. In: Proceedings
of the 12th working conference on reverse engineering. IEEE Press, Pittsburgh, PA, pp 155–164
Tsantalis N, Chatzigeorgiou A (2009) Identification of move method refactoring opportunities. IEEE
Trans Softw Eng 35(3):347–367
Wen Z, Tzerpos V (2004) An effectiveness measure for software clustering algorithms. In: Proceed-
ings of the 12th IEEE international workshop on program comprehension, IWPC ’04. IEEE
Computer Society, pp 194–203
Wiggerts TA (1997) Using clustering algorithms in legacy systems remodularization. In: Proceedings
of the 4th working conference on reverse engineering. IEEE CS Press, Amsterdam, pp 33–43
WRT (2011) 2011 International Workshop on Refactoring Tools. [Link]
Accessed 22 April 2013
Gabriele Bavota received the Laurea degree in Computer Science (cum laude) from the University of
Salerno (Italy) in 2009 and the PhD in Computer Science from the University of Salerno in 2013.
He is currently a research fellow at the Department of Engineering of the University of Sannio and
a member of the Software Engineering Lab at the University of Salerno. His research interests
include refactoring and re-modularization, software maintenance and evolution, and empirical
software engineering. He serves and has served on the organizing and program committees of
international conferences in the field of software engineering. He is a member of the IEEE.
Andrea De Lucia received the laurea degree in computer science from the University of Salerno,
Italy, in 1991, the MSc degree in computer science from the University of Durham, UK, in 1996,
and the PhD degree in electronic engineering and computer science from the University of Naples
“Federico II”, Italy, in 1996. He is a full professor of software engineering and the Director of the
International Summer School on Software Engineering at the University of Salerno. Previously,
he was with the Department of Engineering and the Research Centre on Software Technology
(RCOST) at the University of Sannio. His research interests include software maintenance, program
comprehension, reverse engineering, reengineering, migration, global software engineering, software
configuration management, workflow management, document management, empirical software
engineering, visual languages, web engineering, and e-learning. He has published more than 150
papers on these topics in international journals, books, and conference proceedings and has edited
books and journal special issues. He serves on the editorial board of Journal of Software: Evolution
and Process and other international journals and on the organizing and program committees of
several international conferences in the field of software engineering. Prof. De Lucia is a senior
member of the IEEE and the IEEE Computer Society. He was also at-large member of the executive
committee of the IEEE Technical Council on Software Engineering (TCSE) and committee member
of the IEEE Real World Engineering Project (RWEP) Program.
Andrian Marcus is Associate Professor and Director of the Undergraduate Program in the
Department of Computer Science at Wayne State University (Detroit, MI). He obtained his PhD in
Computer Science from Kent State University in 2003. His current research interests are in software
engineering, with focus on using information retrieval and text mining techniques for software
analysis to support comprehension during software evolution. He served on the Steering Committee
of the IEEE International Conference on Software Maintenance (ICSM) between 2005–2008 and
2011–2014, and on the Steering Committee of the IEEE International Workshop on Visualizing
Software for Understanding and Analysis (VISSOFT) between 2005–2009. He serves on the editorial
boards of Empirical Software Engineering and the Journal of Software: Evolution and Process. He has
also served as an organizing or program committee member for many conferences related to his area
of research.
Rocco Oliveto is an Assistant Professor in the Department of Bioscience and Territory at the
University of Molise (Italy). He is the Director of the Laboratory of Informatics and Computational
Science of the University of Molise. He received the PhD in Computer Science from the University of
Salerno (Italy) in
2008. From 2008 to 2010 he was a research fellow at the Department of Mathematics and Informatics
of the University of Salerno. From 2005 to 2010 he was also an adjunct professor at the Faculty of
Science of the University of Molise (Italy). In 2011 he joined the STAT Department of the University
of Molise. His
research interests include traceability management, information retrieval, software maintenance and
evolution, search-based software engineering, and empirical software engineering. He has published
more than 50 papers on these topics in international journals, books, and conference proceedings. He
serves and has served as organizing and program committee member of international conferences in
the field of software engineering. In particular, he was the program co-chair of TEFSE 2009, the
Traceability Challenge Chair of TEFSE 2011, the Industrial Track Chair of WCRE 2011, the Tool
Demo Co-chair of ICSM 2011, the program co-chair of WCRE 2012, and he will be the program
co-chair of WCRE 2013. Dr. Oliveto is a member of the IEEE Computer Society, ACM, and the
IEEE-CS Awards and Recognition Committee.