Natural Evolution of Measures - PowerPoint PPT Presentation

Title: Natural Evolution of Measures


1
"When you can measure what you are speaking
about, and express it in numbers, you know
something about it; but when you cannot measure
it, when you cannot express it in numbers, your
knowledge is of a meagre kind." (Lord Kelvin)
2
Measuring lateness
  • To measure time on an absolute scale we can
    measure the elapsed minutes since the start of
    the lecture.
  • If a student arrives after more than 5 minutes
    have elapsed then their picture is taken and
    added to the slides
  • The next few slides are late comers from this
    lecture.

3
Measurement
  • Thanks to Professor Norman Fenton of Queen Mary
    and Westfield College for permission to use some
    of his material in this week's part of the
    course.
  • Professor Fenton is an internationally leading
    expert in Software Measurement.

4
Exam hint
  • To get the exam hint you have to be on time for
    the start of the lecture.

5
Results from Survey
  • Which computer scientist
  • Alan Turing
  • Tim Berners-Lee
  • Bill Gates
  • Albert Einstein ?

6
Results from Survey
  • Largest program
  • 75K LOC (Financial system)
  • 20K LOC VB
  • 30K LOC (POS system in C)
  • 15K LOC (peer-peer file sharing)
  • 6K LOC (scheduling)
  • 6K LOC in 30 classes (Distributed network)
  • 4K LOC
  • 3K LOC Java database
  • 72 pages of Java
  • 1-2K LOC in C (systems simulator)
  • 1K LOC (clinical database)
  • 400-500
  • 50 classes in Java
  • 40-50 LOC

7
Results from Survey
  • After you finish your Degree what will you do
  • Get a Job
  • Do an MSc
  • Start a PhD
  • Get very drunk

8
Results from Survey
  • Your other interests
  • Football, Cricket, Tennis, Swimming, Golf,
    Cycling, mountaineering, running
  • Karate/Kickboxing
  • Dance
  • Politics
  • Scuba diving
  • Cooking
  • Travel
  • Movies
  • Music, playing guitar
  • Heavy metal music
  • Drinking
  • Off road driving
  • Chess, board games
  • Reading

9
OK time for measurement
  • Some caution about measurement
  • Examples of software measurement in practice
  • A little Measurement theory
  • Control flow graphs
  • Calculating software metrics

10
Definition of Measurement (Fenton)
Measurement is the process of empirical, objective
assignment of numbers to entities, in order to
characterise a specific attribute.
  • Entity: an object or event
  • Attribute: a feature or property of an entity
  • Objective: the measurement process must be
    based on a well-defined rule whose results
    are repeatable

11
Example Measures
12
Avoiding Mistakes in Measurement
  • Common mistakes in software measurement can be
    avoided simply by adhering to the definition of
    measurement. In particular
  • You must specify both entity and attribute
  • The entity must be defined precisely
  • You must have a reasonable intuitive
    understanding of the attribute before you propose
    a measure
  • The theory of measurement formalises these ideas

13
Don't be Wilde's Cynic
  • Estimation based on measurement is often used to
    work out project effort
  • Do not confuse this with the price you should
    charge to the customer
  • There is a difference between cost and value
  • "A cynic is one who knows the cost of everything
    and the value of nothing" (Oscar Wilde)

14
Cost and Value
  • Build cost: the cost to you
  • This can be assessed by estimation
  • ... based on measurements
  • Price: the value to the customer is different
  • Price (p) has to reflect value, not build cost (b)
  • This can be a (huge) advantage to you: p >> b
  • or a warning: p < b

15
Example Use of Measurement
  • Suppose you set up a maintenance project
  • Your goal is to ensure nothing goes wrong
  • How do you know that you are saving money?
  • How do you fight for more resources for your
    project?
  • How do you avoid being the first to be chopped or
    outsourced?

16
Be Clear of Your Attribute
  • It is a mistake to propose a measure if there
    is no consensus on what attribute it
    characterises.
  • Results of an IQ test
  • intelligence?
  • or verbal ability?
  • or problem solving skills?
  • defects found / KLOC
  • quality of code?
  • quality of testing?

17
A Cautionary Note
  • We must not re-define an attribute to fit in with
    an existing measure.

18
Types and uses of measurement
  • Two distinct types of measurement
  • direct measurement
  • indirect measurement
  • Two distinct uses of measurement
  • for assessment
  • for prediction

19
Some Direct Software Measures
  • Length of source code (measured by LOC)
  • Duration of testing process (measured by elapsed
    time in hours)
  • Number of defects discovered during the testing
    process (measured by counting defects)
  • Effort of a programmer on a project (measured by
    person months worked)

20
Some Indirect Software Measures
  • Programmer productivity =
    LOC produced / person months of effort
  • Module defect density =
    number of defects / module size
  • Defect detection efficiency =
    number of defects detected / total number of defects
  • Requirements stability =
    number of initial requirements / total number of
    requirements
  • Test effectiveness ratio =
    number of items covered / total number of items
  • System spoilage =
    effort spent fixing faults / total project effort
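
Each indirect measure on this slide is a ratio of two direct measures. A minimal sketch in Python of three of them, assuming made-up project figures: every number below and the `ratio` helper are illustrative and do not come from the lecture.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Divide one direct measure by another, refusing an undefined ratio."""
    if denominator == 0:
        raise ValueError("denominator of an indirect measure must be non-zero")
    return numerator / denominator


# Hypothetical direct measures for one project (illustrative values only).
loc_produced = 12_000
person_months = 8
defects_in_module = 6
module_size_kloc = 1.5
effort_fixing_faults = 2   # person months
total_project_effort = 8   # person months

# Indirect measures are computed from the direct ones.
programmer_productivity = ratio(loc_produced, person_months)
module_defect_density = ratio(defects_in_module, module_size_kloc)
system_spoilage = ratio(effort_fixing_faults, total_project_effort)

print(programmer_productivity, module_defect_density, system_spoilage)
```

The units follow from the inputs: here productivity is in LOC per person month and defect density in defects per KLOC.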
21
Measurement Theory Objectives
  • Measurement theory is the scientific basis for
    all types of measurement. It is used to determine
    formally
  • When we have really defined a measure
  • Which statements involving measurement are
    meaningful
  • What the appropriate scale type is
  • What types of statistical operations can be
    applied to measurement data

22
Measurement Theory Key Components
  • Empirical relation system
  • the relations which are observed on entities in
    the real world which characterise our
    understanding of the attribute in question,
    e.g. Fred taller than Joe (for height of
    people)
  • Representation condition
  • real world entities are mapped to numbers (the
    measurement mapping) in such a way that all
    empirical relations are preserved in numerical
    relations and no new relations are created, e.g.
    M(Fred) > M(Joe) precisely when Fred is taller
    than Joe

23
Representation Condition
[Figure: the measurement mapping M from the real world
to the number system, with M(Joe) = 72 and M(Fred) = 63.
The empirical relation "Joe taller than Fred" is
preserved under M as the numerical relation
M(Joe) > M(Fred).]
24
Meaningfulness in Measurement
  • Some statements involving measurement appear more
    meaningful than others
  • Fred is twice as tall as Jane
  • The temperature in Tokyo today is twice that in
    London
  • The difference in temperature between Tokyo and
    London today is twice what it was yesterday

Formally, a statement involving measurement
is meaningful if its truth value is invariant
under all allowable transformations of scale
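
The temperature statements above can be checked against this definition. A minimal sketch in Python, assuming invented temperatures for Tokyo and London; converting Celsius to Fahrenheit is an allowable transformation for temperature, so a meaningful statement must keep its truth value after the conversion.

```python
import math


def c_to_f(celsius: float) -> float:
    """Allowable change of scale for temperature: Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


def is_twice(a: float, b: float) -> bool:
    """True when a is twice b, up to floating-point error."""
    return math.isclose(a, 2 * b)


# Hypothetical readings in Celsius (illustrative values only).
tokyo_today, london_today = 20.0, 10.0
tokyo_yesterday, london_yesterday = 15.0, 10.0

# "The temperature in Tokyo today is twice that in London"
ratio_in_c = is_twice(tokyo_today, london_today)
ratio_in_f = is_twice(c_to_f(tokyo_today), c_to_f(london_today))

# "The difference in temperature between Tokyo and London today
#  is twice what it was yesterday"
diff_in_c = is_twice(tokyo_today - london_today,
                     tokyo_yesterday - london_yesterday)
diff_in_f = is_twice(c_to_f(tokyo_today) - c_to_f(london_today),
                     c_to_f(tokyo_yesterday) - c_to_f(london_yesterday))

print("ratio statement:", ratio_in_c, ratio_in_f)
print("difference statement:", diff_in_c, diff_in_f)
```

With these readings the ratio statement changes truth value between the two scales, so it is not meaningful, while the statement about differences holds in both.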
25
Measurement Scale Types
  • Some measures seem to be of a different type to
    others, depending on what kind of statements are
    meaningful. The 5 most important scale types of
    measurement are
  • Nominal
  • Ordinal
  • Interval
  • Ratio
  • Absolute

Increasing order of sophistication
26
Scales
  • Nominal scale (classification)
  • e.g. blood groups, programming language, colour
  • Ordinal scale (ordering)
  • e.g. Excellent, Very Good, Good, Satisfactory
  • Interval Scale (quantifying differences)
  • e.g. Date
  • Ratio scale (ratios and zero are meaningful)
  • e.g. Length
  • Absolute scale (counting)

27
Nominal Scales Egs.
  • A classification
  • No Order, No Size
  • Blood Groups O, A , AB , AB-
  • Multiple choice answers
  • Which of the following makes Cornflakes the
    best for you each morning?
  • A. High in Vitamins
  • B. Delicious Hot or Cold
  • C. Full of Natural Sunshine
  • Note ordering of letters is not important.
  • What programming language is this program written
    in?
  • What is the name of the project to which this
    piece of code belongs?

28
Ordinal Scales
  • as above plus order meaningful
  • No size, no comparison of differences
  • This country would be better off without the
    monarchy.
  • Do you
  • A. Strongly Agree
  • B. Agree
  • C. Neither Agree nor disagree
  • D. Disagree
  • E. Strongly disagree

This program is more cohesive than that one
29
Ordinal Scales (II)
  • A-level results
  • A, B, C, D, E
  • Order important, but can you compare difference
    A-B with B-C?
  • Assigning points to grades and totalling is
    trying to do this: A = 12, B = 9, C = 5, etc.
  • This is a common mistake in use of measurement in
    computing (as elsewhere)

30
Interval Scale Measurement
  • Powerful, but rare in practice
  • Distances between entities matter, but not ratios
  • Mapping must preserve order and intervals
  • Examples
  • Timing of events' occurrence, e.g. could measure
    these in units of years, days, hours etc., all
    relative to different fixed events. Thus it is
    meaningless to say "Project X started twice as
    early as project Y", but meaningful to say "the
    time between project X starting and now is twice
    the time between project Y starting and now"
  • Heat measured on Fahrenheit or Centigrade scale

31
Ratio Scales
  • As above, plus zero is meaningful and ratios are
    meaningful
  • length, mass,
  • temperature (but in Kelvin, not centigrade)
  • price: "Two for the price of one."

32
Absolute Scales
  • Used for counting
  • Number of students in class
  • Number of lines of code
  • Number of faults
  • Number of programmers
  • Number of

33
Scale Types Summary
Scale type: Characteristics
  • Nominal: Entities are classified. No arithmetic
    meaningful.
  • Ordinal: Entities are classified and ordered.
    Cannot use + or -.
  • Interval: Entities classified, ordered, and
    differences between them understood (units). No
    zero, but can use ordinary arithmetic on
    intervals.
  • Ratio: Zeros, units, ratios between entities. All
    arithmetic.
  • Absolute: Counting; only one possible measure.
    All arithmetic.
34
Natural Evolution of Measures
  • As our understanding of an attribute grows, it is
    possible to define more sophisticated measures
    e.g. temperature
  • 200BC - rankings, hotter than
  • 1600 - first thermometer preserving hotter
    than
  • 1720 - Fahrenheit scale
  • 1742 - Centigrade scale
  • 1854 - Absolute zero, Kelvin scale

35
Example The Mean
  • Suppose we have a set of values a1, a2, ..., an
    and wish to compute the average
  • The mean is (a1 + a2 + ... + an) / n
  • The mean is not a meaningful average for a set of
    ordinal scale data

36
Alternative Measures of Average
  • Median: The midpoint of the data when it
    is arranged in increasing order. It divides the
    data into two equal parts
  • Suitable for ordinal data. Not suitable for
    nominal data since it relies on order having
    meaning.
  • Mode: The commonest value
  • Suitable for nominal data
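
A small sketch in Python of why the mean is unsafe on ordinal data while the median is not. The two point codings and the two groups of grades are invented for illustration. Both codings keep the order E < D < C < B < A, so either is an allowable coding of an ordinal scale.

```python
import statistics

# Two order-preserving codings of the same ordinal grades (hypothetical).
coding_1 = {"E": 1, "D": 2, "C": 3, "B": 4, "A": 5}
coding_2 = {"E": 1, "D": 2, "C": 3, "B": 4, "A": 10}

# Two hypothetical groups of students.
group_x = ["A", "E", "E"]
group_y = ["C", "C", "D"]


def summary(grades, coding):
    """Return (mean, median) of the grades under the given coding."""
    codes = [coding[g] for g in grades]
    return statistics.mean(codes), statistics.median(codes)


mean_x1, median_x1 = summary(group_x, coding_1)
mean_y1, median_y1 = summary(group_y, coding_1)
mean_x2, median_x2 = summary(group_x, coding_2)
mean_y2, median_y2 = summary(group_y, coding_2)

# Does "X has a lower average than Y" survive the change of coding?
mean_flips = (mean_x1 < mean_y1) != (mean_x2 < mean_y2)
median_stable = (median_x1 < median_y1) == (median_x2 < median_y2)

# The mode needs no order at all, so it works on nominal data.
commonest_language = statistics.mode(["Ada", "C", "C", "Java"])

print(mean_flips, median_stable, commonest_language)
```

Under the first coding group X has the lower mean and under the second the higher one, so a comparison of means depends on an arbitrary choice of numbers. The comparison of medians gives the same answer under both codings.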
37
Summary of Meaningful Statistics
Scale type: Average; Spread
  • Nominal: Mode; Frequency
  • Ordinal: Median; Percentile
  • Interval: Arithmetic mean; Standard deviation
  • Ratio: Geometric mean; Coefficient of variation
  • Absolute: Any; Any
38
Arithmetic mean?
Geometric mean?
Standard deviation?
Definition: The coefficient of variation is an
attribute of a distribution: its standard
deviation divided by its mean.
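
The four statistics named on this slide can be computed with Python's standard `statistics` module. This is a minimal sketch on invented data. The slide does not say whether the population or the sample standard deviation is meant, so using `pstdev` (population) here is an assumption.

```python
import statistics

# Hypothetical ratio-scale data, e.g. module sizes in KLOC.
sizes = [2.0, 4.0, 8.0]

arithmetic_mean = statistics.mean(sizes)
geometric_mean = statistics.geometric_mean(sizes)
std_dev = statistics.pstdev(sizes)  # population standard deviation (assumed)


def coefficient_of_variation(values):
    """Standard deviation divided by mean, as defined on the slide."""
    return statistics.pstdev(values) / statistics.mean(values)


cv = coefficient_of_variation(sizes)

# A ratio-scale change of units (KLOC -> LOC) multiplies every value by a
# positive constant; the coefficient of variation should not change.
cv_in_loc = coefficient_of_variation([1000 * v for v in sizes])

print(arithmetic_mean, geometric_mean, std_dev, cv, cv_in_loc)
```

The coefficient of variation comes out the same in both units because the constant cancels between the standard deviation and the mean. An added offset would not cancel, which fits its placement against the ratio scale rather than the interval scale.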
39
Permissible changes of Units
  • Nominal scale
  • Any 1-1 mapping from M to M' (a renaming)
  • M(a) = M(b) iff M'(a) = M'(b)
  • Ordinal scale
  • Any monotonic mapping from M to M'
  • M(a) > M(b) if M'(a) > M'(b)

40
Permissible changes of Units (II)
  • Interval Scale
  • Any affine transformation
  • M'(v) = a.M(v) + b (a > 0)
  • Ratio scale
  • Any linear transformation
  • M'(v) = a.M(v) (a > 0)
  • Absolute scale
  • No transformations

41
Measuring Software
  • 3 Dimensions
  • Length
  • Functionality
  • Complexity
  • Industrial Practice
  • lines for code
  • pages for specifications

42
Horses for Courses
  • usefulness of a measure depends on the purpose
    for which it is used!
  • Roughly speaking
  • Length and complexity are most useful as a
    measure of cost of code
  • Functionality and complexity are most useful as a
    measure of cost of (implementing) a spec.

43
The LOC Measure is Widely used
  • LOC: Number of Lines Of Code
  • The simplest and most widely used measure of
    program size. Easy to compute and automate
  • Used (as normalising measure) for
  • effort/cost estimation (Effort = f(LOC))
  • quality assessment/estimation (defects/LOC)
  • productivity assessment (LOC/effort)
  • Alternative (similar) measures
  • KLOC: Thousands of Lines Of Code
  • KDSI: Thousands of Delivered Source Instructions
  • NCLOC: Non-Comment Lines of Code
  • Number of Characters or Number of Bytes

44
how many LOC here?
with TEXT_IO; use TEXT_IO;
procedure Main is
   --This program copies characters from an input
   --file to an output file. Termination occurs
   --either when all characters are copied or
   --when a NULL character is input
   Nullchar, Eof : exception;
   Char : CHARACTER;
   Input_file, Output_file, Console : FILE_TYPE;
begin
   loop
      Open (FILE => Input_file, MODE => IN_FILE,
            NAME => "CharsIn");
      Open (FILE => Output_file, MODE => OUT_FILE,
            NAME => "CharOut");
      Get (Input_file, Char);
      if END_OF_FILE (Input_file) then
         raise Eof;
      elsif Char = ASCII.NUL then
         raise Nullchar;
      else
         Put (Output_file, Char);
      end if;
   end loop;
exception
   when Eof => Put (Console, "no null characters");
   when Nullchar => Put (Console, "null terminator");
end Main;
45
Problems with LOC
  • No standard definition
  • Measures length of programs rather than size
  • Wrongly used as a surrogate for
  • effort
  • complexity
  • functionality
  • Fails to take account of redundancy and reuse
  • Cannot be used comparatively for different types
    of programming languages
  • Only available at the end of the development
    life-cycle

46
Problems with What counts
  • Do declarations count?
  • Do comments count?
  • Do import lists count?
  • Do compiler directives count?
  • What about side effects?
  • if (x, y--, zk0, j-- 0)
  • Blank lines? Procedure headings, debugging code,
    keywords on their own?
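
The answer to "how many LOC?" depends on which of these questions are answered yes. A minimal sketch in Python of one possible counting rule: the `count_lines` function, its category names and the sample text are all illustrative, and the rule (a line is a comment only if it starts with `--`) is just one of many defensible definitions.

```python
def count_lines(source: str, comment_prefix: str = "--") -> dict:
    """Count physical, blank, comment-only and non-comment lines."""
    counts = {"physical": 0, "blank": 0, "comment_only": 0}
    for line in source.splitlines():
        counts["physical"] += 1
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith(comment_prefix):
            counts["comment_only"] += 1
    # NCLOC under this rule: everything that is neither blank nor pure comment.
    counts["ncloc"] = (counts["physical"] - counts["blank"]
                       - counts["comment_only"])
    return counts


# A short Ada-style fragment, invented for the example.
SAMPLE = """\
with TEXT_IO; use TEXT_IO;
procedure Main is
   -- copy characters

begin
   null;  -- nothing yet
end Main;
"""

result = count_lines(SAMPLE)
print(result)
```

Even on seven lines the "size" ranges from 5 to 7 depending on the definition, and a line that mixes code with a trailing comment is counted as code here only by convention.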

47
Structural measures
  • Can define a measure on code by its structure
  • control flow structure
  • data flow structure
  • data structure
  • These can then be the basis of measures of
  • length
  • complexity
  • test coverage
  • ...

48
Control Flow Structure
  • We will consider a (standard) simple language of
  • Atomic statements (assignment) (A)
  • conditional (C)
  • loops (L)
  • sequencing (S)

49
Control Flowgraphs
[Figure: flowgraphs for the Atomic, Sequence,
Conditional and Loop constructs, with a legend for
the Start or Stop node.]
50
Four Types of Node
  • Start Node
  • In degree 0
  • Stop Node
  • Out degree 0
  • Procedure nodes
  • Out degree 1
  • Predicate Nodes
  • Out Degree 2

51
Nesting
If P then A else (while P do A) A
  • Replace a procedure node and its edge by a
    flowgraph

52
Structural Measures
  • Define a measure in terms of four functions
  • A nullary function for atomic statements
  • M(A) = FA
  • Two binary functions for conditionals and
    sequences
  • M(C(P1,P2)) = FC(M(P1),M(P2))
  • M(S(P1,P2)) = Fs(M(P1),M(P2))
  • A unary function for loops
  • M(L(P1)) = FL(M(P1))

53
Example Hierarchical Measures
  • Number of Nodes
  • FA() = 2
  • Fs(m1,m2) = m1 + m2 - 1
  • FC(m1,m2) = m1 + m2
  • FL(m1) = m1 + 1

54
Example Hierarchical Measures
  • Number of Edges
  • FA() = 1
  • Fs(m1,m2) = m1 + m2
  • FC(m1,m2) = m1 + m2 + 2
  • FL(m1) = m1 + 2

55
Example Hierarchical Measures
  • Number of Assignments
  • FA() = 1
  • Fs(m1,m2) = m1 + m2
  • FC(m1,m2) = m1 + m2
  • FL(m1) = m1
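
The three hierarchical measures can be sketched in Python as one recursive evaluator that takes the four functions FA, Fs, FC and FL as parameters. The class and field names are invented for the example. The sketch assumes the operators lost from the slides are "=" and "+", for instance Fs(m1,m2) = m1 + m2 - 1 for nodes and FC(m1,m2) = m1 + m2 + 2 for edges.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Atomic:
    """A: an atomic statement such as an assignment."""


@dataclass(frozen=True)
class Seq:
    first: "Prog"
    second: "Prog"


@dataclass(frozen=True)
class Cond:
    then_branch: "Prog"
    else_branch: "Prog"


@dataclass(frozen=True)
class Loop:
    body: "Prog"


Prog = Union[Atomic, Seq, Cond, Loop]


@dataclass(frozen=True)
class Measure:
    """A hierarchical measure M defined by the functions FA, Fs, FC, FL."""
    f_a: Callable[[], int]
    f_s: Callable[[int, int], int]
    f_c: Callable[[int, int], int]
    f_l: Callable[[int], int]

    def __call__(self, p: Prog) -> int:
        if isinstance(p, Atomic):
            return self.f_a()
        if isinstance(p, Seq):
            return self.f_s(self(p.first), self(p.second))
        if isinstance(p, Cond):
            return self.f_c(self(p.then_branch), self(p.else_branch))
        if isinstance(p, Loop):
            return self.f_l(self(p.body))
        raise TypeError(f"not a program construct: {p!r}")


nodes = Measure(lambda: 2, lambda a, b: a + b - 1,
                lambda a, b: a + b, lambda a: a + 1)
edges = Measure(lambda: 1, lambda a, b: a + b,
                lambda a, b: a + b + 2, lambda a: a + 2)
assignments = Measure(lambda: 1, lambda a, b: a + b,
                      lambda a, b: a + b, lambda a: a)

# Hypothetical program: (if P then A else A); (while P do A)
example = Seq(Cond(Atomic(), Atomic()), Loop(Atomic()))
print(nodes(example), edges(example), assignments(example))
```

For this example the conditional contributes 4 nodes and 4 edges and the loop 3 nodes and 3 edges. Sequencing merges one node, giving 6 nodes, 7 edges and 3 assignments.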

56
Tutorial 1, Q1
  • State the five scales used in measurement and
    describe the most important aspects which
    distinguish them. For each of the following
    statements, explain whether or not it is
    meaningful and, if the statement is meaningless,
    say whether there is a "quick fix" which could
    make the statement more meaningful.
  • (a) 100°C is the boiling point of water.
  • (b) Today is twice as hot as yesterday.
  • (c) The FT index fell 325 points today.
  • (d) The program is 50 lines of code long.
  • (e) The program took 3 months to write.
  • (f) The testing on this project took twice as
    long as the programming.
  • (g) The cost of maintaining program B is twice
    that of maintaining program A.
  • (h) Program A is more complex than program B.

57
Tutorial 1 Q2
  • Give five criticisms of lines of code as a
    software measurement of size. That is, give five
    examples of parts of code which may (or may not)
    be counted as a line of code.

58
Tutorial 1, Q3
  • Give the CFG for the following program schematic.
  • Calculate the number of nodes and number of edges
    for it by working through the hierarchical
    definitions given:
      IF (...) THEN x1 ELSE x2 FI
      y50
      IF (...) THEN z1 ELSE DO WHILE (...) zz1 OD FI
      y0

59
References
  • Norman Fenton: Software Metrics: A Rigorous and
    Practical Approach (2nd Ed.), N.E. Fenton and
    S.L. Pfleeger, Thomson Computer Press, 1997.
  • Martin Shepperd: Foundations of Software
    Measurement, Prentice-Hall, 1995.
  • These are both excellent textbooks and each
    covers all the material on measurement needed
    for this course (and much more).