
UNIT-I

Software Effort Estimation:


Introduction, estimations, problems with over- and under-estimation, the basis for software estimating, software effort estimation techniques, bottom-up estimating, the top-down approach and parametric models, expert judgement, estimating by analogy, Albrecht function point analysis, Function Points Mark II, COSMIC Full Function Points, COCOMO II: a parametric productivity model.

 INTRODUCTION
A successful project is one delivered 'on time, within budget and with the required quality'. This implies that targets are set which the project manager then tries to meet. This assumes that the targets are reasonable: no account is taken of the possibility of project managers achieving record levels of productivity from their teams, but still not meeting a deadline because the initial estimates were incorrect. Realistic estimates are therefore crucial.
A project manager like Amanda has to produce estimates of effort, which affect costs, and of activity durations, which affect the delivery time. These two are not the same: where two testers work on the same task for the same five days, the effort is ten days while the elapsed time is five.
Some of the difficulties of estimating arise from the complexity and invisibility of software. Also, the intensely human activities which make up system development cannot be treated in a purely mechanistic way. Other difficulties include:

 Subjective nature of estimating
 Political implications
 Changing technology
 Lack of homogeneity of project experience
Subjective nature of estimating
For example, some research shows that people tend to underestimate the difficulty of small
tasks and over-estimate that of large ones.
Political implications
Different groups within an organization have different objectives. To avoid these 'political' influences, one suggestion is that estimates be produced by a specialist estimating group, independent of the users and the project team.

Changing technology
Where technologies change rapidly, it is difficult to use the experience of previous projects on
new ones.
Lack of homogeneity of project experience
Even where technologies have not changed, knowledge about typical task durations may not
be easily transferred from one project to another because of other differences between projects.
 ESTIMATIONS
Estimates are carried out at various stages of a software project for a variety of reasons.
Where are estimates done?

 Strategic Planning
 Feasibility Study
 System Specification
 Evaluation of suppliers' proposals
 Project Planning

Strategic Planning:
Project portfolio management involves estimating the costs and benefits of new applications in order
to allocate priorities. Such estimates may also influence the scale of development staff recruitment.
Feasibility Study:
This confirms that the benefits of the potential system will justify the costs.
System Specification
Most system development methodologies usefully distinguish between the definition of the users' requirements and the design which shows how those requirements are to be fulfilled. The effort needed to implement different design proposals will need to be estimated. Estimates at the design stage will also confirm that the feasibility study is still valid.
Evaluation of suppliers' proposals
In the case of the IOE annual maintenance contracts subsystem, for example, IOE might consider putting the development out to tender. Potential contractors would scrutinize the system specification and produce estimates as the basis of their bids. Amanda might still produce her own estimates so that IOE could question a proposal which seems too low, in order to ensure that the proposer has properly understood the requirements. The cost of bids could also be compared with that of in-house development.
Project Planning
As the planning and implementation of the project becomes more detailed, more estimates of smaller
work components will be made. These will confirm earlier broad-brush estimates, and will support
more detailed planning, especially staff allocations.

 PROBLEMS WITH OVER- AND UNDER-ESTIMATES


A project leader such as Amanda will need to be aware that an over-estimate may cause the project to take longer than it otherwise would. This can be explained by the application of two 'laws':
 Parkinson's Law
 Brooks' Law
Parkinson’s Law:
'Work expands to fill the time available', that is, given an easy target staff will work less hard.
Brooks' Law:
The effort of implementing a project will go up disproportionately with the number of staff assigned
to the project. As the project team grows in size, so will the effort that has to go into management,
coordination and communication. This has given rise, in extreme cases, to the notion of Brooks’ Law:
‘Putting more people on a late job makes it later’. If there is an over estimate of the effort required,
this could lead to more staff being allocated than needed and managerial overheads being increased.
Some have suggested that while the under-estimated project might not be completed on time or to
cost, it might be implemented in a shorter time than a project with a more generous estimate.
An estimate is not really a prediction, it is a management goal. Barry Boehm has suggested that if a software development cost is within 20% of the estimated cost for the job, then a good manager can turn the estimate into a self-fulfilling prophecy. A project leader like Amanda will work hard to make the actual performance conform to the estimate.

 THE BASIS FOR SOFTWARE ESTIMATING


 The need for historical data
 Parameters to be estimated
 Measure of work
The need for historical data:
 Most estimating methods need information about past projects.
 However, care is needed when applying past performance to new projects because of possible differences in factors such as programming languages and the experience of the staff.
 If past project data is lacking, externally maintained datasets of project performance
data can be accessed.
 One well-known international database is that maintained by the International
Software Benchmarking Standards Group(ISBSG), which currently contains data
from 4800 projects.
Parameters to be estimated:
 The project manager needs to estimate two project parameters for carrying out
project planning,
 These two parameters are effort and duration.
 Duration is usually measured in months.
 Work-month(wm) is a popular unit for effort measurement.
 The term person-month(pm) is also frequently used to mean the same as work-
month.
 One person-month is the effort an individual can typically put in a month.
 The person-month estimate implicitly takes into account the productivity losses that
normally occur due to time lost in holidays, weekly offs, coffee breaks, etc.,
 Person-month(pm) is considered to be an appropriate unit for measuring effort
compared to person-days or person-years because developers are typically assigned
to a project for a certain number of months.

Measure of work:
 Measure of work involved in completing a project is also called the size of the
project.
 Work itself can be characterized by the cost of accomplishing the project and the time over which it is to be completed.
 Direct calculation of cost or time is difficult at the early stages of planning.
 The time taken to write the software may vary according to the competence or experience of the software developers, who might not even have been identified at this stage.
 Implementation time may also vary depending on the extent to which
CASE(computer aided software engineering) tools are used during development.
 It is therefore a standard practice to first estimate the project size; and by using it,
the effort and the time taken to develop the software can be computed.
 Thus, we can consider project size as an independent variable and the effort or time
required to develop the software as dependent variable.
 Two metrics are popularly being used to measure size
I. Source Lines of Code(SLOC)
II. Function Point(FP)
 The SLOC measure suffers from various disadvantages, which are to a great extent corrected in the FP measure.
 However, the SLOC measure is intuitively simpler, so it is still widely used.
 It is important, however, to be aware of the major shortcomings of the SLOC measure:
I. No precise Definition
II. Difficult to estimate at start of a project
III. Only a Code measure
IV. Programmer-dependent
V. Does not consider code complexity
No precise Definition: SLOC is a very imprecise measure. Unfortunately, researchers have not been consistent on points such as whether comment lines or data declarations should be included.
Difficult to estimate at start of a project: From the project manager's perspective, the biggest shortcoming of the SLOC metric is that it is very difficult to estimate during the project planning stage, and can be accurately computed only after the development of the software is complete. The SLOC count can only be guessed at the beginning of a project, often leading to grossly inaccurate estimations.
Only a Code Measure: SLOC is a measure of coding activity alone. A good problem size measure should consider the effort required for carrying out all the life cycle activities, not just coding.
Programmer-Dependent: SLOC gives a numerical value to the problem size that can vary widely with the coding style of individual programmers. This aspect alone renders any LOC-based size and effort estimations inaccurate.
Does not consider code complexity: Two software components with the same KLOC will not necessarily take the same time to write, even if done by the same programmer in the same environment, because one component might be more complex than the other. Attempts have therefore been made to find objective measures of complexity, but complexity depends to a large extent on the subjective judgement of the estimator.
 SOFTWARE EFFORT ESTIMATION TECHNIQUES:
Barry Boehm, in his classic work on software effort models, identified the main ways of deriving
estimates of software development effort as:

 Algorithmic models: these use 'effort drivers' representing characteristics of the target system and the implementation environment to predict effort.
 Expert Judgement: Based on the advice of knowledgeable staff.
 Analogy: Where a similar, completed, project is identified and its actual effort is used
as the basis of the estimate.
 Parkinson: where the staff effort available to do a project becomes the 'estimate'.
 Price to Win: Where the ‘estimate’ is a figure that seems sufficiently low to win a
contract.
 Top-Down: Where an overall estimate for the whole project is broken down into the
effort required for component tasks.
 Bottom-up: where component tasks are identified and sized and these individual estimates are aggregated.

Clearly, the ‘Parkinson’ method is not really an effort prediction method, but a method of setting the
scope of a project.
Similarly, ‘price to win’ is a way of identifying a price and not a prediction.
Although Boehm rejects them as prediction techniques, they have value as management techniques: 'design to cost', for example, is a perfectly acceptable engineering practice.

 BOTTOM-UP ESTIMATING
With the bottom-up approach the estimator breaks the project into its component tasks. With a large project, the process of breaking it down into tasks is iterative: each task is decomposed into its component subtasks and these in turn could be further analysed. It is suggested that this is repeated until the tasks obtained are ones that an individual could do in a week or two. Although this top-down analysis is an essential precursor to bottom-up estimating, it is really a separate process, that of producing a work breakdown structure (WBS).
The bottom-up part comes in adding up the calculated effort for each activity to get an overall estimate. The bottom-up approach is best at the later, more detailed, stages of project planning. Whenever a project is completely novel or there is no historical data available, the estimator would be forced to use the bottom-up approach.
A Procedural Code-Oriented Approach:
The bottom-up approach described above works at the level of activities. In software development a major activity is writing code. The following describes how a bottom-up approach can be used at the level of software components:
1. Envisage the number and type of software modules in the final system
2. Estimate the SLOC of each identified module
3. Estimate the work content, taking into account complexity and technical difficulty
4. Calculate the work-days effort

Envisage the number and type of software modules in the final system:
 Most information systems, for example, are built from a small set of system operations, e.g.,
Insert, Amend, Update, Display, Delete, Print.
 The same principle should equally apply to embedded systems, albeit with a different set of primitive functions.
Estimate the SLOC of each identified module:
 One way to judge the number of instructions likely to be in a program is to draw up a program structure diagram and visualize how many instructions would be needed to implement each procedure. The estimator may also look at existing programs which have a similar functional description.
Estimate the work content, taking into account complexity and technical difficulty:
 The practice is to multiply the SLOC estimate by a factor for complexity and technical
difficulty.
 This factor will depend largely on the subjective judgement of the estimator.
 For example, the requirement to meet particular highly constrained performance targets can
greatly increase programming effort.
Calculate the work-days effort:
 Historical data can be used to provide ratios to convert weighted SLOC to effort.
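The following is a minimal Python sketch of the four-step procedural code-oriented approach above. All module names, SLOC estimates, complexity factors and the productivity ratio are hypothetical; in practice they would come from the estimator's judgement and historical project data.

# Illustrative sketch of the procedural code-oriented bottom-up approach.
modules = {
    # name: (estimated SLOC, complexity/technical-difficulty factor) - assumed values
    "insert_order":  (400, 1.0),
    "amend_order":   (350, 1.2),   # judged slightly more complex than a plain insert
    "print_invoice": (250, 0.8),
}

DAYS_PER_WEIGHTED_KLOC = 40        # assumed ratio taken from past projects

total_weighted_sloc = sum(sloc * factor for sloc, factor in modules.values())
effort_days = (total_weighted_sloc / 1000) * DAYS_PER_WEIGHTED_KLOC

print(f"Weighted SLOC: {total_weighted_sloc:.0f}")
print(f"Estimated effort: {effort_days:.0f} work-days")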

 THE TOP-DOWN APPROACH AND PARAMETRIC MODELS


The top-down approach is normally associated with parametric (or algorithmic ) models.
These may be explained using the analogy of estimating the cost of rebuilding a house. This is of practical concern to house owners, who need insurance cover to rebuild their property if it is destroyed. Unless the house owner is in the building trade, he or she is unlikely to be able to calculate the number of bricklayer-hours, carpenter-hours, electrician-hours and so on required.
Insurance companies, however, produce convenient tables where the house-owner can find
estimates of rebuilding costs based on such parameters as the number of storeys and the floor
space of a house. This is a simple parametric model.

Project effort relates mainly to variables associated with characteristics of the final system. A
parametric model will normally have one or more formulae in the form

Effort = (system size) × (productivity rate)


For example, system size might be in the form ‘ thousands of lines of code ‘ (KLOC) and
have the specific value of 3 KLOC while the productivity rate was 40 days per KLOC. These
values will often be matters of judgement.

A model to forecast software development effort therefore has two key components.


 The first is a method of assessing the amount of work needed.
 The second assesses the rate of work at which the task can be done.
For example, Amanda at IOE may estimate that the first software module to be constructed is 2 KLOC. She may then judge that if Kate undertook the development of the code, with her expertise she could work at a rate of 40 days per KLOC and complete the work in 2 × 40 days, i.e. 80 days, while Ken, who is less experienced, would need 55 days per KLOC and take 2 × 55, i.e. 110 days, to complete the task. In this case KLOC is a size driver indicating the amount of work to be done, while developer experience is a productivity driver influencing the productivity or work rate.
Given the effort expended on past projects (in work-days, for instance) and the system sizes in KLOC, it should be possible to work out a productivity rate as:

Productivity = effort/size
A more sophisticated way of doing this would be to use the statistical technique of least squares regression to derive an equation of the form:
Effort = constant1 + (size × constant2)
Some parametric models, such as that implied by function points, are focused on system or task size, while others, such as COCOMO, are more concerned with productivity factors. Having calculated the overall effort required, the problem is then to allocate proportions of that effort to the various activities within the project.
The top-down and bottom-up approaches are not mutually exclusive. Project managers will probably
try to get a number of different estimates from different people using different methods. Some parts of
an overall estimate could be derived using a top-down approach while other parts could be calculated
using a bottom-up method.
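As a sketch of how the least squares regression mentioned above could be applied, the following Python example fits Effort = constant1 + (size × constant2) to made-up past project data and then predicts the effort for a hypothetical 3 KLOC system; none of the figures are real.

import statistics

# Hypothetical past project data: (size in KLOC, effort in work-days).
past_projects = [(2.0, 85), (4.5, 190), (1.0, 45), (3.0, 130), (6.0, 250)]
sizes = [s for s, _ in past_projects]
efforts = [e for _, e in past_projects]

# Least squares regression: effort = constant1 + size * constant2
mean_size, mean_effort = statistics.mean(sizes), statistics.mean(efforts)
constant2 = (sum((s - mean_size) * (e - mean_effort) for s, e in past_projects)
             / sum((s - mean_size) ** 2 for s in sizes))
constant1 = mean_effort - constant2 * mean_size

new_size_kloc = 3.0
predicted_effort = constant1 + new_size_kloc * constant2
print(f"effort = {constant1:.1f} + size * {constant2:.1f}")
print(f"Predicted effort for {new_size_kloc} KLOC: {predicted_effort:.0f} work-days")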

 EXPERT JUDGEMENT
 This is asking for an estimate of task effort from someone who is knowledgeable about either
the application or the development environment.
 This method is often used when estimating the effort needed to change an existing piece of software.
 The estimator would have to examine the existing code in order to judge the proportion of
code affected and from that derive an estimate.
 Someone already familiar with the software would be in the best position.
 Some have suggested that expert judgement is simply a matter of guessing, but our own research has shown that experts tend to use a combination of an informal analogy approach, where similar projects from the past are identified, supplemented by bottom-up estimating.
 There are many cases where the opinions of more than one expert may need to be combined.

 ESTIMATING BY ANALOGY
This is also called case-based reasoning. The estimator identifies completed projects (source cases) with similar characteristics to the new project (the target case). The effort recorded for the matching source case is then used as a base estimate for the target. The estimator then identifies differences between the target and the source and adjusts the base estimate to produce an estimate for the new project.
This can be a good approach where you have information about some previous projects but not enough to draw generalized conclusions about what might be useful drivers or typical productivity rates.
A problem is identifying the similarities and differences between applications where you have a
large number of past projects to analyse. One attempt to automate this selection process is the
ANGEL software tool. This identifies the source case that is nearest the target by measuring the
Euclidean distance between cases. The Euclidean distance is calculated as:
Distance = square root of ((target_parameter1 − source_parameter1)² + … + (target_parametern − source_parametern)²)
The above explanation is simply to give an idea of how Euclidean distance may be calculated.
The ANGEL package uses rather more sophisticated algorithms based on this principle.
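To make the selection step concrete, here is a minimal Python sketch that computes the Euclidean distance defined above between a target project and a few hypothetical source cases and picks the nearest one; the parameters, project names and effort figures are invented, and the real ANGEL tool uses more sophisticated algorithms.

import math

def euclidean_distance(target: dict, source: dict) -> float:
    """Distance over the parameters the two cases have in common."""
    return math.sqrt(sum((target[p] - source[p]) ** 2 for p in target if p in source))

# Hypothetical projects characterized by two parameters: numbers of inputs and outputs.
target = {"inputs": 7, "outputs": 15}
source_cases = {
    "Project A": {"inputs": 8,  "outputs": 17, "effort_days": 83},
    "Project B": {"inputs": 5,  "outputs": 10, "effort_days": 60},
    "Project C": {"inputs": 20, "outputs": 35, "effort_days": 190},
}

nearest = min(source_cases, key=lambda name: euclidean_distance(target, source_cases[name]))
print(f"Nearest source case: {nearest}, base estimate "
      f"{source_cases[nearest]['effort_days']} work-days")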

 ALBRECHT FUNCTION POINT ANALYSIS


This is a top-down method that was devised by Allan Albrecht when he worked for IBM. Albrecht was investigating programming productivity and needed to quantify the functional size of programs independently of their programming languages. He developed the idea of function points (FPs).
The basis of function point analysis is that information systems comprise five major
components, or ‘external user types’ in Albrecht’s terminology, that are of benefit to the users.
 External input types are input transactions which update internal computer files.
 External output types are transactions where data is output to the user: Typically these would
be printed reports, as screen displays would tend to come under external inquiry types.
 External inquiry types (note the US spelling of inquiry) are transactions initiated by the user which provide information but do not update the internal files. The user inputs some information that directs the system to the details required.
 Logical internal file types are the standing files used by the system. The term ‘file’ does not
sit easily with modern information systems. It refers to a group of data items that is usually
accessed together. It may be made up of one or more record types. For example, a purchase
order file may be made up of a record type PURCHASE-ORDER plus a second which is
repeated for each item ordered on the purchase order -PURCHASE-ORDER-ITEM. In
structured system analysis , a logical internal file would equate to a datastore, while record
types would equate to relational tables or entity types.
 External interface file types allow for output and input that may pass to and from other computer applications. An example would be the transmission of accounting data from an order processing system, on a magnetic or electronic medium, to be passed to the Bankers Automated Clearing System (BACS). Files shared between applications would also be counted here.
The analyst identifies each instance of each external user type in the application. Each component is then classified as having either high, average or low complexity. The counts of each external user type in each complexity band are multiplied by the weights specified in Table 1 to get FP scores, which are summed to obtain an overall FP count indicating the information processing size.

External user type               Multiplier
                                 Low    Average    High
External input type               3        4         6
External output type              4        5         7
External inquiry type             3        4         6
Logical internal file type        7       10        15
External interface file type      5        7        10
Table 1: Albrecht complexity multipliers
With FPs as originally defined by Albrecht, the question of whether an external user type was of high, low or average complexity was intuitive. The International FP User Group (IFPUG) has now promulgated rules on how this is assessed. For example, in the case of logical internal files and external interface files, the boundaries shown in Table 2 are used to decide the complexity level. Similar tables exist for external inputs and outputs.

                          Number of data types
Number of record types    <20        20-50      >50
1                         Low        Low        Average
2 to 5                    Low        Average    High
>5                        Average    High       High
Table 2: IFPUG file type complexity
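A minimal Python sketch of an Albrecht-style count using Tables 1 and 2 above: file complexity is looked up from the record-type and data-type counts, and each component count is multiplied by the appropriate weight. The application's component counts are hypothetical.

# Table 1: Albrecht complexity multipliers (low, average, high).
WEIGHTS = {
    "external input":          {"low": 3, "average": 4,  "high": 6},
    "external output":         {"low": 4, "average": 5,  "high": 7},
    "external inquiry":        {"low": 3, "average": 4,  "high": 6},
    "logical internal file":   {"low": 7, "average": 10, "high": 15},
    "external interface file": {"low": 5, "average": 7,  "high": 10},
}

def file_complexity(record_types: int, data_types: int) -> str:
    """Table 2: complexity of a logical internal or external interface file."""
    if record_types == 1:
        return "average" if data_types > 50 else "low"
    if record_types <= 5:
        return "low" if data_types < 20 else ("average" if data_types <= 50 else "high")
    return "average" if data_types < 20 else "high"

# Hypothetical application: (external user type, complexity band, count of instances).
components = [
    ("external input", "average", 4),
    ("external output", "low", 2),
    ("logical internal file", file_complexity(record_types=2, data_types=30), 3),
]

fp_count = sum(WEIGHTS[kind][level] * count for kind, level, count in components)
print(f"Unadjusted FP count: {fp_count}")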
Function point analysis recognizes that the effort required to implement a computer-based information
system relates not just to the number and complexity of the features provided but also to the
operational environment.
Fourteen factors have been identified which can influence the degree of difficulty associated with implementing a system. The list that Albrecht produced related particularly to the concerns of information system developers in the late 1970s and early 1980s. Some technology which was then new and relatively threatening is now well established.
 FUNCTION POINTS MARK II
 The ‘Mark II’ label implies an improvement and replacement of the Albrecht method.
 The Albrecht(IFPUG) method, however has had many refinements made to it and
FPA Mark II remains a minority method used mainly in the United Kingdom.
 With Albrecht, the information processing size is initially measured in unadjusted
function points(UFPs) to which a technical complexity adjustment can then be
applied to Technical Complexity Adjustment(TCA).
 The assumption is that an information system comprises transactions which have the basic structure shown in Figure 1.
Figure 1: Model of a transaction (input from the user is processed, reading from or writing to a data store, and output is returned to the user).
For each transaction the UFPs are calculated as:
UFP = Wi × (number of input data element types) + We × (number of entity types referenced) + Wo × (number of output data element types)
Wi, We and Wo are weightings derived by asking developers the proportions of effort spent in previous projects developing the code dealing respectively with inputs, accessing and modifying stored data, and processing outputs.
The proportions of effort are then normalized into ratios, or weightings, which add up to 2.5. This process for calculating weightings is time consuming, and most FP counters use the industry averages, which are currently 0.58 for Wi, 1.66 for We and 0.26 for Wo.
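A minimal Python sketch of the Mark II UFP calculation above, using the industry average weightings quoted in the text; the transaction counts are hypothetical.

# Industry average weightings quoted above.
W_I, W_E, W_O = 0.58, 1.66, 0.26

def mark2_ufp(n_input_types: int, n_entities_referenced: int, n_output_types: int) -> float:
    """Unadjusted function points for one transaction (Mark II)."""
    return W_I * n_input_types + W_E * n_entities_referenced + W_O * n_output_types

# Hypothetical transactions: (input data element types, entities referenced, output data element types).
transactions = [(10, 2, 6), (4, 1, 12), (7, 3, 3)]
total_ufp = sum(mark2_ufp(*t) for t in transactions)
print(f"System size: {total_ufp:.1f} UFPs")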
Mark II FPs follow the Albrecht method in recognizing that one system delivering the
same functionality as another may be more difficult to implement(but also more
valuable to the users) because of additional technical requirements. For example, the
incorporation of additional security measures would increase the amount of effort to
deliver the system. The identification of further factors to suit local circumstances is
encouraged.

 COSMIC FULL FUNCTION POINTS


 While approaches like that of IFPUG are suitable for information systems, they are
not helpful when it comes to sizing real-time or embedded applications.
 This has resulted in the development of another version of function points- the
COSMIC FFP method.
 COSMIC deals with this by decomposing the system architecture into a hierarchy of
software layers.
 The software component to be sized can receive requests for services from the layers above and can request services from the layers below.
 At the same time there could be separate software components at the same level that
engage in peer-to-peer communication.
 This identifies the boundary of the software component to be assessed and thus the
points at which it receives inputs and transmits outputs.
 Inputs and Outputs are aggregated into data groups, where each group brings together
data items that relate to the same object of interest.
 Data groups can be moved about in four ways:
 entries (E): which are effected by subprocesses that move the data group into the software component in question from a 'user' outside its boundary; this could be from another layer or from another separate software component in the same layer via peer-to-peer communication;
 exits (X): which are effected by subprocesses that move the data group from the software component to a 'user' outside its boundary;
 reads (R): which are data movements that move data groups from persistent storage (such as a database) into the software component;
 writes (W): which are data movements that transfer data groups from the software component into persistent storage.

 The overall FFP count is derived by simply adding up the counts for each of the four
types of data movement. The resulting units are Cfsu(COSMIC functional size units).
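A minimal Python sketch of a COSMIC-style count as described above: each functional process is described by its counts of entries, exits, reads and writes, and the counts are simply summed to give Cfsu. The process names and counts are hypothetical.

# Hypothetical functional processes with their data-movement counts:
# entries (E), exits (X), reads (R), writes (W). One Cfsu per data movement.
processes = {
    "read_sensor": {"E": 1, "X": 1, "R": 0, "W": 1},
    "raise_alarm": {"E": 1, "X": 2, "R": 1, "W": 0},
    "log_event":   {"E": 1, "X": 0, "R": 0, "W": 1},
}

total_cfsu = sum(sum(movements.values()) for movements in processes.values())
print(f"Functional size: {total_cfsu} Cfsu")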
 COCOMO II : A PARAMETRIC PRODUCTIVITY MODEL
Boehm’s COCOMO (Constructive Cost Model) is often referred to in the literature on software
project management, particularly in connection with software estimating. The term COCOMO
really refers to a group of models.
Boehm originally based his models in the late 1970s on a study of 63 projects. Of these only seven were business systems, and so the models could be used with applications other than information systems. The basic model was built around the equation:
effort = c × (size)^k
Where effort was measured in pm, or the number of 'person-months' consisting of units of 152 working hours, size was measured in kdsi, thousands of delivered source code instructions, and c and k were constants. The values of c and k depended on whether the system could be classified as being developed in one of three modes:

 Organic mode:
This would typically be the case when relatively small teams developed software in a
highly familiar in-house environment and when the system being developed was
small and the interface requirements were flexible.
 Embedded mode:
This meant that the product being developed had to operate within very tight
constraints and changes to the system were very costly.
 Semi-detached mode:
This combined elements of the organic and the embedded modes or had
characteristics that came between the two.
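A minimal Python sketch of the basic COCOMO equation effort = c × (size)^k. The mode constants used here (organic 2.4/1.05, semi-detached 3.0/1.12, embedded 3.6/1.20) are the values commonly quoted for Boehm's original model; treat them as illustrative rather than definitive.

# Commonly quoted constants for Boehm's basic COCOMO model
# (effort in person-months, size in kdsi). Illustrative values.
MODES = {
    "organic":       (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (3.6, 1.20),
}

def basic_cocomo_effort(size_kdsi: float, mode: str) -> float:
    c, k = MODES[mode]
    return c * size_kdsi ** k

for mode in MODES:
    print(f"{mode:>13}: {basic_cocomo_effort(32, mode):.0f} person-months for 32 kdsi")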
COCOMO II has been designed to accommodate the need for estimates at different points in a project by having models for three different stages:

 Application composition:
The external features of the system that the user will experience are designed.
Prototyping will typically be employed to do this. With small applications that can be
built using high-productivity application-building tools, development can stop at this
point.
 Early design:
The fundamental software structures are designed. With larger, more demanding
systems, where, for example, there will be large volumes of transactions and
performance is important, careful attention will need to be paid to the architecture to
be adopted.
 Post Architecture:
The software structures undergo final construction, modification and tuning to create
a system that will perform as required.
To estimate the effort for application composition, the counting of object points is recommended by
the developers of COCOMO II.
At the early design stage, FPs are recommended as the way of gauging a basic system size. An FP
count may be converted to an LOC equivalent by multiplying the FPs by a factor for the programming
language that is to be used.
The following model can then be used to calculate an estimate of person-months:
pm = A × (size)^sf × em1 × em2 × … × emn
where pm is the effort in person-months, A is a constant, size is measured in kdsi and sf is the exponent scale factor.
The scale factor is derived thus:
sf = B + 0.01 × Σ (exponent driver ratings)
where B is a constant and the sum is taken over the ratings of the five exponent drivers described below. The fact that these factors are used to calculate an exponent implies that a lack of these qualities increases the effort required disproportionately more on larger projects.

 Precedentedness(PREC):
This quality is the degree to which there are precedents or similar past cases for the
current project. The greater the novelty of the new system , the more uncertainty there
is and the higher the value given to the exponent driver.

 Development Flexibility(FLEX):
This reflects the number of different ways there are of meeting the requirements. The
less flexibility there is, the higher the value of the exponent driver.
 Architecture/Risk Resolution (RESL):
This reflects the degree of uncertainty about the requirements. If they are liable to
change then a high value would be given to this exponent driver.
 Team Cohesion(TEAM):
This reflects the degree to which there is a large dispersed team as opposed to there
being a small, tightly knit team.
 Process Maturity (PMAT):
The chapter on software quality explains the process maturity model. The more structured and organized the way the software is produced, the lower the uncertainty and the lower the rating will be for this exponent driver.
In the COCOMO II model the effort multipliers (em) adjust the estimate to take account of productivity factors, but do not involve economies or diseconomies of scale. Separate sets of multipliers are defined for the early design stage and for the post-architecture stage.
At a later stage of the project, detailed design of the application will have been completed. There will be a clearer idea of application size in terms of lines of code, and the factors influencing productivity will be better known. A revised estimate of effort can be produced based on the broader range of effort modifiers available at the post-architecture stage. The method of calculation is the same as for early design.
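To pull the COCOMO II pieces above together, here is a minimal Python sketch of pm = A × (size)^sf × em1 × … × emn with sf = B + 0.01 × Σ (exponent driver ratings). The values of A, B, the driver ratings and the effort multipliers below are assumed placeholders, not the published COCOMO II calibration; real estimates would use the published driver and multiplier tables.

# Hypothetical illustration of pm = A * size^sf * em1 * ... * emn.
A = 2.94                      # assumed constant
B = 0.91                      # assumed base for the scale-factor exponent

# Assumed ratings for the five exponent drivers (PREC, FLEX, RESL, TEAM, PMAT).
exponent_driver_ratings = [3.72, 3.04, 4.24, 3.29, 4.68]
sf = B + 0.01 * sum(exponent_driver_ratings)

# Assumed effort multipliers for this project; a value of 1.0 would be 'nominal'.
effort_multipliers = [1.17, 0.85, 1.10]
em_product = 1.0
for em in effort_multipliers:
    em_product *= em

size_kdsi = 10
pm = A * size_kdsi ** sf * em_product
print(f"Scale factor exponent sf = {sf:.3f}")
print(f"Estimated effort: {pm:.0f} person-months")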
