Title: On the process of software cost estimation
1. On the process of software cost estimation
- By Dr. Lawrence Y.L. Wong
- Dept. of Information Communications Technology
- Institute of Vocational Education (Tsing Yi)
2. Contents
- Overview of problems associated with vendor selection and tendering
- Review of cost estimation methodology
- Outline of proposed method
- Implementation and case studies of proposed method
- Q&A
3. Problems associated with vendor selection
- Scenario: a small embedded-system software project is to be tendered to improve the performance of the ECS in order to satisfy the requirements of ISO 14000
- Following the organization's normal procedure, a total of 9 contractors responded and returned proposals, with the bids summarized in the following table
- Assume we concentrate on the bids only...
4. Which contractor will you select?
5. Which contractor will you select?
- There is a big difference among the bids
  - from 10k to 500k (a 50-fold difference!)
- Number of vendors by bidding range
  - 10k to 49k: 4
  - 50k to 99k: 3
  - 100k and above: 2
6. Which contractor will you select?
- The more information, the better
  - qualitative information: company background, track record
  - quantitative information: a set of data that provides a better foundation for the decision
- Effort (in man-months) is a commonly used quantitative figure
7. Do you find it easier to select?
- Now, with the following information (with effort and bidding)
8. Problems and opportunities
- Vendors do not normally supply such a parameter
- Even if they do, it is only an estimate
  - how good is their estimate?
  - there are many metrics and models, such as GPP, FP and COCOMO
- There is no unified model for cost estimation
- Alternative
  - you can help the vendors to compute such a figure
  - advantage: a common model
  - the more raw data you have, the more accurate the estimate and the better the control
9. What's next: Part 2
- Overview of problems associated with vendor selection and tendering
- Review of cost estimation methodology
- Outline of proposed method
- Implementation and case studies of proposed method
- Q&A
10. Introduction to costing
- Budgeting is an integral part of software project management
- Two types of cost elements
- Internal cost
  - salaries of the development team, managers and support staff involved in the project
  - hardware and software costs for developing the product
  - overhead (OH) costs such as rent, utilities and salaries of senior management
- External cost
  - profit margin
  - prevention cost
  - premium for failure cost
11. Simplified model in cost estimation
[Diagram: a software cost estimation model. Labels: parameters; nominal effort (man-months); vendor's capability model; vendor's maturity model; maturity level; profit margin; number of programmers and duration; salary level; overhead percentage; re-usable code; other parameters not stated in the cost drivers]
12. Metrics for cost estimation
- Modeling needs some form of metric
  - number of lines of code (LOC), or thousands of lines (KLOC)
  - thousand delivered source instructions (KDSI)
  - the number of operands and operators in the software product (token count)
13. Problems associated with the metrics
- Source code is only a small part of the total software development effort
- Size differs greatly between development languages
- Counting lines is subjective (e.g. whether comments are counted)
- Some code in the project is reusable
- Not all code is delivered to the client; much of it consists of tools
- The size is only an estimate; the actual figure is known only when the product is completely finished
14. Alternative metrics for cost estimation
- FFP (acronym for files, flows, and processes)
  - proposed by van der Poel and Schach, 1983
- Function point
  - proposed by Albrecht, 1979
15. FFP
- Acronym for files, flows, and processes
- Suitable for medium-size data-processing projects
  - with a scale from 2 man-years to 10 man-years
- $S = Fi + Fl + Pr$
- $C = b \times S$
  - $S$: size, $C$: cost
  - $Fi$: number of files, $Fl$: number of flows
  - $Pr$: number of processes
  - $b$: efficiency constant
- Advantage: the cost can be estimated during the planning phase
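The FFP relations above can be sketched in a few lines of Python. The function names and the sample counts are illustrative assumptions; in practice the value of b has to come from an organization's own historical data.

```python
def ffp_size(files: int, flows: int, processes: int) -> int:
    """Size S = Fi + Fl + Pr."""
    return files + flows + processes


def ffp_cost(size: int, b: float) -> float:
    """Cost C = b * S, where b is the organization's efficiency constant."""
    return b * size


# Hypothetical counts and constant, for illustration only.
size = ffp_size(files=8, flows=12, processes=5)
cost = ffp_cost(size, b=1000.0)
print(size, cost)
```

Because only files, flows and processes are counted, the inputs are available as soon as the architecture is sketched, which is what makes the metric usable in the planning phase.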
16. Function point
- Based on the number of input and output items
- Each item has a level of complexity
- $\mathit{UFP} = 4 \times \mathit{Inp} + 5 \times \mathit{Out} + 4 \times \mathit{Inq} + 10 \times \mathit{Maf} + 7 \times \mathit{Inf}$ (using the average weights)
- where
  - UFP: unadjusted function points
  - Inp: input items (simple 3, average 4, complex 6)
  - Out: output items (simple 4, average 5, complex 7)
  - Inq: inquiries (simple 3, average 4, complex 6)
  - Maf: master files (simple 7, average 10, complex 15)
  - Inf: interfaces (simple 5, average 7, complex 10)
17. Function point
- Step 1: Classify each component of the product and assign it function points
- Step 2: Compute the technical complexity factor (TCF)
  - a measure of the effect of 14 technical factors (e.g. reusability, ease of use, maintainability)
  - each of these 14 factors is assigned a value from 0 (no influence) to 5 (strong influence throughout)
  - $\mathit{TCF} = 0.65 + 0.01 \times \mathit{DI}$
  - DI: total degree of influence (14 factors, each assigned a value from 0 to 5; min 0, max 70)
  - TCF: technical complexity factor (0.65 to 1.35)
- Step 3: $\mathit{FP} = \mathit{UFP} \times \mathit{TCF}$
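The three steps can be illustrated with a small Python sketch. The weights are the ones listed on the slides; the function names, the data layout and the sample counts and ratings are assumptions made up for illustration.

```python
# Weights per item type: (simple, average, complex), as listed on the slides.
WEIGHTS = {
    "Inp": (3, 4, 6),
    "Out": (4, 5, 7),
    "Inq": (3, 4, 6),
    "Maf": (7, 10, 15),
    "Inf": (5, 7, 10),
}
LEVELS = {"simple": 0, "average": 1, "complex": 2}


def unadjusted_fp(counts):
    """Step 1: counts maps (item type, level) -> number of items of that kind."""
    return sum(n * WEIGHTS[kind][LEVELS[level]] for (kind, level), n in counts.items())


def technical_complexity_factor(influences):
    """Step 2: TCF = 0.65 + 0.01 * DI, with DI the sum of 14 ratings in 0..5."""
    if len(influences) != 14 or any(not 0 <= v <= 5 for v in influences):
        raise ValueError("expected 14 ratings, each between 0 and 5")
    return 0.65 + 0.01 * sum(influences)


def function_points(counts, influences):
    """Step 3: FP = UFP * TCF."""
    return unadjusted_fp(counts) * technical_complexity_factor(influences)


# Hypothetical product: the counts and ratings are made up for illustration.
counts = {("Inp", "average"): 10, ("Out", "complex"): 4, ("Maf", "simple"): 2}
ratings = [3] * 14
print(unadjusted_fp(counts), function_points(counts, ratings))
```

Keeping the weights in a table per complexity level, rather than hard-coding the average weights, lets the same function handle a mix of simple, average and complex items.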
18. Empirical estimation model
- Three classes of models
- Static single-variable models
  - resource $= c_1 \times (\text{estimated characteristic})^{c_2}$
  - typical example: COCOMO
- Static multivariable models
  - resource $= c_{11} e_1 + c_{21} e_2 + \dots$
  - $e_i$ is the $i$th software characteristic (variable)
  - $c_{i1}$, $c_{i2}$ are empirical constants
- Dynamic multivariable models
  - project the resource requirements as a function of time
  - based on a resource expenditure curve (the Rayleigh-Norden curve); typical example: the Putnam estimation model
19. COCOMO - Constructive Cost Model
- Three different levels, as defined by Boehm's hierarchy
- Basic: a static single-valued model that computes software development effort (and cost) as a function of program size expressed in estimated LOC
- Intermediate: computes software development effort as a function of program size and a set of cost drivers that include subjective assessments of product, hardware, personnel, and project attributes
- Advanced: incorporates all characteristics of the intermediate version, with an assessment of the cost drivers' impact on each step (analysis, design) of the software engineering process
20. Features of COCOMO
- Three different levels of complexity (modes of development)
  - organic (simple and straightforward)
  - semidetached (medium size)
  - embedded (complex)
21. Basic COCOMO
- Basic COCOMO uses the following formulas
- Organic
  - Nominal effort $= 2.4 \times (\mathit{KDSI})^{1.05}$ person-months
- Semidetached
  - Nominal effort $= 3.0 \times (\mathit{KDSI})^{1.12}$ person-months
- Embedded
  - Nominal effort $= 3.6 \times (\mathit{KDSI})^{1.20}$ person-months
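As a minimal sketch, the three Basic COCOMO formulas can be kept in one lookup table. The coefficients are those given above; the function name and the 10-KDSI sample size are illustrative assumptions.

```python
# (coefficient, exponent) per development mode, from the Basic COCOMO formulas.
BASIC_COCOMO = {
    "organic": (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_nominal_effort(kdsi: float, mode: str) -> float:
    """Nominal effort in person-months for a size given in KDSI."""
    a, b = BASIC_COCOMO[mode]
    return a * kdsi ** b


# Hypothetical 10-KDSI product, evaluated in each mode.
for mode in BASIC_COCOMO:
    print(f"{mode}: {basic_nominal_effort(10, mode):.1f} person-months")
```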
22. Intermediate COCOMO
- Intermediate COCOMO uses the following formulas
- Organic
  - Nominal effort $= 3.2 \times (\mathit{KDSI})^{1.05}$ person-months
- Semidetached
  - Nominal effort $= 3.0 \times (\mathit{KDSI})^{1.12}$ person-months
- Embedded
  - Nominal effort $= 2.8 \times (\mathit{KDSI})^{1.20}$ person-months
23. Intermediate COCOMO
- Done in 2 stages
- Stage 1: estimate the nominal development effort (from the KDSI and the mode of development)
- Stage 2: multiply the nominal effort by the 15 software development effort multipliers
- Effort $= \mathit{EAF} \times$ Nominal effort
  - where EAF (effort adjustment factor) $= \prod_{i=1}^{15} \text{cost\_driver}_i$
24. Intermediate COCOMO
- A set of 15 cost drivers is derived from subjective attributes according to the following table
25. Case studies (as stated by Boehm, 1984)
- Simple project, with an input parameter of 12 KDSI
- Complex project, with an input parameter of 10 KDSI
- Which one requires the higher effort?
- Answer
  - simple project: organic, EAF = 1.0
  - effort $= 1.0 \times 3.2 \times (12)^{1.05} \approx 43$ man-months
  - complex project: embedded, EAF = 1.347 (see the following table)
  - effort $= 1.347 \times 2.8 \times (10)^{1.20} \approx 59$ man-months
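The two cases can be reproduced with a short Python sketch of Intermediate COCOMO. The function name is an assumption, and because the transcript does not list the individual cost-driver values, each EAF is passed here as a single aggregated multiplier rather than as 15 separate ones.

```python
import math

# (coefficient, exponent) per development mode, from the Intermediate COCOMO formulas.
INTERMEDIATE_COCOMO = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def intermediate_effort(kdsi, mode, multipliers):
    """Effort = EAF * nominal effort, with EAF the product of the multipliers."""
    a, b = INTERMEDIATE_COCOMO[mode]
    nominal = a * kdsi ** b           # stage 1: nominal effort from size and mode
    eaf = math.prod(multipliers)      # stage 2: effort adjustment factor
    return eaf * nominal


# EAF values as quoted in the case study, each given as one aggregated multiplier.
simple = intermediate_effort(12, "organic", [1.0])
complex_ = intermediate_effort(10, "embedded", [1.347])
print(f"simple: {simple:.1f} man-months, complex: {complex_:.1f} man-months")
```

The smaller product needs more effort here because both the embedded-mode exponent and the EAF are higher, which is the point of the comparison.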
26. Case studies (as stated by Boehm, 1984)
27. Pros and cons of COCOMO
- The accuracy of the estimated software characteristic is very important
- Commonly used metrics
  - number of lines of code (KLOC)
  - number of delivered source instructions (KDSI)
- It is possible to estimate KLOC or KDSI from function points
- If this estimate is incorrect, the prediction from the model may be incorrect
- A number of tools (e.g. in Excel) automate COCOMO and speed up recomputation when the value of a parameter is modified
28. How good is COCOMO?
- Boehm evaluated a set of 63 TRW projects to validate the model
  - with KLOC as the input and the effort (in man-months) as the final result
- Three figures of merit are evaluated
  - mean regression error, MRE (0.1868)
  - predictive error (0.7778)
  - correlation (0.9946)
29. A schematic to evaluate the error of the estimation model
[Diagram: for each of Project 1 to Project N, the software metric (e.g. KDSI) and the 15 cost drivers are fed into COCOMO; the effort it produces (man-months) is compared with the actual result after completion of the project, and the resulting error is squared.]
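A minimal Python sketch of this evaluation follows. The slide only shows a squared error per project; averaging the squared errors into one figure is an assumption made here, and the effort values are made up for illustration.

```python
def squared_errors(estimated, actual):
    """Per-project squared error between estimated and actual effort."""
    if len(estimated) != len(actual):
        raise ValueError("need one actual value per estimate")
    return [(e - a) ** 2 for e, a in zip(estimated, actual)]


def mean_squared_error(estimated, actual):
    """Average of the per-project squared errors (an assumed way to combine them)."""
    errors = squared_errors(estimated, actual)
    return sum(errors) / len(errors)


# Made-up efforts in man-months for three projects.
estimated = [43.0, 60.0, 120.0]
actual = [40.0, 66.0, 110.0]
print(squared_errors(estimated, actual), mean_squared_error(estimated, actual))
```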
30. Some comments on using COCOMO
- FP is most suitable for data-processing applications, while COCOMO can cover projects of various scope and complexity
- A study with a sample of 63 projects (from different fields) indicates that COCOMO has high predictability (accuracy within 20% of the predicted values, 70% of the time)
- CASE tools are available (written in Excel, COCOMO II) that help speed up the calculation
- Attempting to improve the accuracy further is not useful, because the raw data, such as the KDSI and the other multipliers, are themselves inaccurate (i.e. the KDSI still has to be predicted)
31. Some comments on using COCOMO
- During the mid-1980s
  - COCOMO was the cutting edge of cost estimation research, accommodating various applications
  - no other technique was consistently as accurate as COCOMO
- Project uncertainties have a big effect on the accuracy of software size and cost estimates
  - the earlier the stage in the software life cycle (SLC), the worse the accuracy
  - Feasibility stage: a factor of 4 (i.e. 25% to 400%)
  - Planning stage: a factor of 2 (i.e. 50% to 200%)
  - Product design stage: a factor of 1.5 (i.e. 67% to 150%)
  - Detailed design specification stage: a factor of 1.25 (i.e. 80% to 125%)
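The stage factors translate into a range around any single estimate: the estimate divided by the factor at the low end and multiplied by it at the high end. A small Python sketch, in which the dictionary keys, the function name and the 100 man-month figure are illustrative assumptions:

```python
# Uncertainty factor at each life-cycle stage, as listed above.
STAGE_FACTOR = {
    "feasibility": 4.0,
    "planning": 2.0,
    "product design": 1.5,
    "detailed design": 1.25,
}


def estimate_range(estimate, stage):
    """Return (low, high): the estimate divided and multiplied by the stage factor."""
    factor = STAGE_FACTOR[stage]
    return estimate / factor, estimate * factor


# Hypothetical 100 man-month estimate.
for stage in STAGE_FACTOR:
    low, high = estimate_range(100.0, stage)
    print(f"{stage}: {low:.0f} to {high:.0f} man-months")
```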
32. What's next: Part 3
- Overview of problems associated with vendor selection and tendering
- Review of cost estimation methodology
- Outline of a proposed method
- Implementation and case studies of proposed method
- Q&A
33. Outline of the proposed method
- Request vendors to supply a set of upstream data
  - for instance, the 15 cost drivers and a software metric such as LOC or KDSI
- The organization can use a pre-determined model (e.g. Intermediate COCOMO) to help estimate the effort of the project
  - this important parameter can be used to benchmark the vendor's maturity and capacity
- A database is established, which can be used to benchmark the project itself
34. Outline of the proposed method
- To help the vendors return a set of consistent upstream data
  - a questionnaire can be used to prompt the vendor to specify the information at an early stage, such as the RFP
  - see the following slide...
35. (No transcript)
36. Outline of the proposed method
- The most difficult part of all
  - the vendor needs to estimate a software metric (KDSI or KLOC)
  - this can be done by estimating the UFP first, then converting it to KLOC or KDSI
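The conversion step can be sketched as a single multiplication. The slides do not give a conversion ratio, so `lines_per_fp` below is a hypothetical parameter that the vendor or the organization would have to supply for the language in use; the sample numbers are placeholders, not recommended values.

```python
def kdsi_from_ufp(ufp: float, lines_per_fp: float) -> float:
    """Convert a function-point count to thousands of source instructions.

    lines_per_fp is an assumed, language-dependent ratio supplied by the caller.
    """
    if lines_per_fp <= 0:
        raise ValueError("lines_per_fp must be positive")
    return ufp * lines_per_fp / 1000.0


# Both numbers are placeholders chosen only to show the arithmetic.
print(kdsi_from_ufp(ufp=200, lines_per_fp=60))
```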
37. What's next: Part 4
- Overview of problems associated with vendor selection and tendering
- Review of cost estimation methodology
- Outline of a proposed method
- Implementation and case studies of proposed method
- Q&A
38. Scenario
- Company A
  - A multinational company (Fortune 500) specialized in the field. It is prepared to set up a joint venture company locally regardless of the result of the bidding.
- Company B
  - A multinational company with a good engineering background, preparing to open up the market in the region. The company relies on traditional technology and practice. The bid is set low in the hope of gaining a good reference. A local office will be set up if the bid is successful.
39. Scenario
- Company C
  - A local company with little knowledge of the field. However, it is headed by a young entrepreneur who wants to engage in hi-tech business, and the company plans to recruit many engineering staff if the bid is successful.
- Company D
  - A multinational company well known in the trade for over-pricing. However, it has been well established locally for more than 20 years, with many job references. The bid is set at the very high end to reflect its position as the leader in the field.
40. Shift of focus
- In estimation theory: the more data, the better!
- Raw data (i.e. upstream data) is better than synthesized data
  - estimated effort, bid, profit margin, salary level, no. of programmers
- There are still many problems in collecting this information
- The cost model is known to have many uncertainties
  - inaccuracy due to subjective preference in the cost drivers
  - inaccuracy due to the software metric estimation
  - inaccuracy due to the model itself
41. What to compare?
- The estimation model is not advocated for useful work at this stage
- Shift the focus from estimation to comparison
  - collect more information for future modeling exercises
- Some sort of rule is required to facilitate the selection
- A pragmatic approach is proposed
  - allow vendors to be compared with their peers
  - in terms of bidding, cost drivers and software metrics (i.e. KDSI)
42. A simple rule
- The cost driver is derived from a subjective score returned by the vendors
- Based on Boehm's cost-driver model mapping, the subjective score is converted into a figure of merit
- Basically, it is a measure of the vendors' understanding of their own strengths and of the project
- The higher the cost driver, the more the project will cost
- Who will return a low figure?
  - a company with good experience and methodology
  - a company that underestimates, deliberately or not, or misunderstands the problem/project
43. Next steps...
- Company A: a top multinational company wanting to set up a joint venture locally
  - EAF = 1.1, KDSI = 33, Bidding = 1 M
- Company B: a multinational company relying on traditional technology
  - EAF = 1.2, KDSI = 100, Bidding = 1 M
- Company C: a very aggressive local company
  - EAF = 0.7, KDSI = 12, Bidding = 0.5 M
- Company D: a big brother in the trade
  - EAF = 1.5, KDSI = 30, Bidding = 5 M
44. A tentative rule
- A point-score system is set up
- One point is awarded when the EAF is comparable with the average value (i.e. the average value collected from the vendors' proposals)
- One point is awarded when the software metric is comparable with the industrial norm (i.e. the average value collected from the vendors' proposals)
- Two points are awarded for the lowest bidding range, one point for the middle range and zero points for the highest range
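The rule can be sketched in Python using the EAF, KDSI and bidding figures of the four companies. Two details are not defined on the slides and are assumptions here: "comparable" is taken to mean within 25% of the average, and the bidding ranges are taken as three equal bands between the lowest and highest bid. All function names are illustrative.

```python
def comparable(value, average, tolerance=0.25):
    """True when value lies within +/- tolerance (as a fraction) of average.

    The 25% tolerance is an assumption; the slides do not define 'comparable'.
    """
    return abs(value - average) <= tolerance * average


def bid_points(bid, lowest, highest):
    """2, 1 or 0 points for the lowest, middle or highest third of the bid span."""
    if highest == lowest:
        return 2  # all bids equal: everyone is in the lowest range
    position = (bid - lowest) / (highest - lowest)
    if position < 1 / 3:
        return 2
    if position < 2 / 3:
        return 1
    return 0


def score_vendors(vendors):
    """vendors maps name -> (EAF, KDSI, bid); returns name -> total points."""
    n = len(vendors)
    avg_eaf = sum(v[0] for v in vendors.values()) / n
    avg_kdsi = sum(v[1] for v in vendors.values()) / n
    bids = [v[2] for v in vendors.values()]
    lowest, highest = min(bids), max(bids)
    return {
        name: int(comparable(eaf, avg_eaf))
        + int(comparable(kdsi, avg_kdsi))
        + bid_points(bid, lowest, highest)
        for name, (eaf, kdsi, bid) in vendors.items()
    }


# EAF, KDSI and bid (in millions) for the four companies in the case study.
vendors = {
    "A": (1.1, 33, 1.0),
    "B": (1.2, 100, 1.0),
    "C": (0.7, 12, 0.5),
    "D": (1.5, 30, 5.0),
}
print(score_vendors(vendors))
```

Under these assumed thresholds Company A collects the most points; a different tolerance or banding could change the ranking, so the thresholds should be agreed before the proposals are opened.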
45. A snapshot of all possible cases
46. Back to our case
- Company A seems to be favoured
47. Back to our case
- Now using COCOMO to work out the estimated effort
- Company A seems to be favoured
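A rough Python sketch of that calculation, applying the Intermediate COCOMO formula to each company's EAF and KDSI. The transcript does not say which development mode was used; embedded mode is assumed here because the opening scenario describes an embedded-system project, and the function name is illustrative.

```python
# (coefficient, exponent) per development mode, from the Intermediate COCOMO formulas.
INTERMEDIATE_COCOMO = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def estimated_effort(eaf, kdsi, mode="embedded"):
    """Effort in man-months; the default mode is an assumption, not from the slides."""
    a, b = INTERMEDIATE_COCOMO[mode]
    return eaf * a * kdsi ** b


# EAF and KDSI as returned by the four companies.
vendors = {"A": (1.1, 33), "B": (1.2, 100), "C": (0.7, 12), "D": (1.5, 30)}
efforts = {name: estimated_effort(eaf, kdsi) for name, (eaf, kdsi) in vendors.items()}
for name, effort in efforts.items():
    print(f"Company {name}: {effort:.0f} man-months")
```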
48. Conclusions
- Cost estimation is still a research topic
  - there are still many uncertainties
- Shift the focus from cost estimation to comparison
  - based on the input from the various vendors, a cost-estimation model is used to generate a figure that is stored in a database for future use
49. Summary
- Overview of problems associated with vendor selection and tendering
- Review of cost estimation methodology
  - metrics: FFP, FP
  - modeling: COCOMO (3 hierarchy levels, 3 levels of complexity)
- Outline of proposed method
  - shift from estimation to comparison
  - a point scale is used: when an input is comparable to the industry norm, a point is awarded
- Implementation and case studies of proposed method
50. Thank you!
51. Q&A