Evaluation of Information Systems Complexity Metrics and Models

1
Evaluation of Information Systems Complexity
Metrics and Models
  • INFO 630
  • Glenn Booker

2
Origin
  • Complexity metrics were developed by computer
    scientists and software engineers
  • Strongly based on empirical (real world)
    measurement, with little theory
  • Primarily broken into internal and external
    measures

3
Internal versus External
  • Internal measures describe the complexity within
    a module (number of decisions, loops,
    calculations, etc.)
  • External measures describe relationships among
    modules (program or function calls, external
    file activities, input/output, etc.)

4
Internal Measures
5
Internal Product Attributes
  • Size measures
  • Input to prediction models
  • Normalizing factor for cost, productivity, etc.
  • Progress during development
  • Typically use lines of code (LOC) or function
    point counts
  • LOC is a better measure for predicting cost and
    schedule

6
Lines of Code
  • Simple complexity metric, often based on number
    of executable statements or instruction
    statements
  • Highest defect rates often occur in small
    modules
  • Larger modules have a smaller defect rate (if
    they exist at all) - until too cumbersome
  • Optimum module size is about 250 lines

7
Function Points
  • Function points help avoid biases due to the
    programming language(s) used
  • Provide a more fair basis for comparing
    different environments
  • Focuses on how much work the program
    accomplishes, not how concisely it is expressed

8
Halstead Metrics
  • Also known as Software Science, 1977
  • Examine program as compilable tokens
  • Tokens are either operators (+, -) or operands
    (variables)
  • Derive metrics such as Vocabulary, Length,
    Volume, Difficulty, etc.
  • Not widely used
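
As a rough illustration of the derived metrics named above, the sketch below uses the usual Software Science formulas for Vocabulary, Length, Volume and Difficulty. The function name, the argument names and the sample counts are assumptions made for this example and do not come from the slides.

```python
import math

def halstead(n1, n2, N1, N2):
    """Derive basic Halstead metrics from token counts.

    n1, n2: number of distinct operators / distinct operands
    N1, N2: total occurrences of operators / operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
    }

# Hypothetical counts for the single statement  z = x + y + x
# operators: "=" once, "+" twice  -> n1 = 2, N1 = 3
# operands:  z, x, y, x           -> n2 = 3, N2 = 4
metrics = halstead(n1=2, n2=3, N1=3, N2=4)
print(metrics)
```

How tokens are split into operators and operands is language-specific, so two tools can report different counts for the same program.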

9
Data Structure (Halstead)
  • Halstead's η2 - number of distinct operands in a
    module
  • Operands include number of variables, number of
    unique constants, and number of labels
  • Operand usage (OU)
  • OU = η2/N2 where N2 is the total number of
    operand references

10
Software Complexity
  • Is a characteristic that influences the
    resources needed to build and maintain it
  • Many different characteristics of software relate
    to complexity
  • These complexity characteristics revolve around
    the structure of the software

11
Types of Structural Measures
  • Control flow
  • Addresses sequence in which instructions are
    executed
  • Iteration and looping
  • Data flow
  • Follows trail of data as it is created and
    handled
  • Depicts behavior of data as it interacts with the
    program

12
Types of Structural Measures
  • Data structure
  • Concerned with organization of data itself
  • Provides information about difficulties in
    handling data and in defining test cases

13
Control Flow
  • Modeled by directed graphs (control flow graphs)
  • Each node corresponds to a single program
    statement
  • Arcs (directed edges) indicate flow of control
    from one statement to another

14
Control Flow
  • Control flow graphs are useful for
  • Analysis (estimating number of defects)
  • Expressing complexity by a single value
  • Assessing testability and test coverage

15
Basic Control Constructs
16
Cyclomatic Complexity
  • McCabe, 1976
  • Based on a program's control flow chart
  • Related to number of separate graphable areas, or
    number of linearly independent paths in the
    program
  • Complexity M = edges - nodes + 2 × (# of
    unconnected parts of the graph)
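
A minimal sketch of the graph-based calculation, using McCabe's formula M = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components of the control flow graph. The function name and the small IF/ELSE graph are hypothetical, chosen only to illustrate the arithmetic.

```python
def cyclomatic_complexity(edges, nodes=None):
    """Return M = E - N + 2P for a control flow graph.

    edges: list of (source, destination) node pairs
    nodes: optional iterable of extra nodes that have no edges
    """
    node_set = set(nodes or [])
    for src, dst in edges:
        node_set.update((src, dst))

    # Count connected components (P) with union-find, ignoring direction.
    parent = {n: n for n in node_set}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for src, dst in edges:
        parent[find(src)] = find(dst)

    components = len({find(n) for n in node_set})
    return len(edges) - len(node_set) + 2 * components

# Hypothetical graph: one IF/ELSE whose branches rejoin at "exit"
if_else = [("entry", "then"), ("entry", "else"),
           ("then", "exit"), ("else", "exit")]
print(cyclomatic_complexity(if_else))  # 4 edges - 4 nodes + 2*1 = 2
```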

17
Cyclomatic Complexity
  • Complexity under 10 generally desired
  • Can also find M as number of binary decisions
    (yes/no) plus one
  • Multiple choice decisions with n choices count
    as (n-1) binary decisions
  • Ignores differences between specific types of
    control structures

18
Cyclomatic Complexity
  • Uses of complexity metric
  • Identify complex modules needing detailed
    inspection or redesign
  • Identify simple modules needing minimal
    inspection and/or testing
  • Estimate programming, testing and maintenance
    effort
  • Identify potentially troublesome code

19
Control Flow Representation of Programs
  • Software programs can be represented by linear
    directed segments combined with the basic
    control flow constructs
  • Control flow constructs may be nested, e.g. an IF
    statement can be inside of a WHILE loop

20
Control Flow Representation of Programs
  • Example

21
Control Flow--Linearly Independent Paths
  • Set of linearly independent paths:
    b1 = abcg, b2 = abcbcg, b3 = abefg, b4 = adefg,
    b5 = adfg
  • Any arbitrary path is equal to a linear
    combination of the linearly independent paths
    listed above
  • For example, path abcbefg is equal to
    b2 + b3 - b1

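
The path arithmetic on this slide can be checked mechanically. The sketch below assumes each letter names a node of the flow graph (the slide's figure is not reproduced in this transcript), turns every path into a count of the edges it traverses, and then forms the linear combination. The helper names are made up for the example.

```python
from collections import Counter

def edge_vector(path):
    """Count how often each edge (pair of consecutive nodes) occurs in a path."""
    return Counter(zip(path, path[1:]))

def combine(terms):
    """Linear combination of edge vectors; terms is a list of (coefficient, path)."""
    total = Counter()
    for coeff, path in terms:
        for edge, count in edge_vector(path).items():
            total[edge] += coeff * count
    return {edge: c for edge, c in total.items() if c != 0}

basis = {"b1": "abcg", "b2": "abcbcg", "b3": "abefg",
         "b4": "adefg", "b5": "adfg"}

# b2 + b3 - b1 should reproduce the edge counts of path abcbefg
combo = combine([(1, basis["b2"]), (1, basis["b3"]), (-1, basis["b1"])])
target = dict(edge_vector("abcbefg"))
print(combo == target)

# Cross-check: edges - nodes + 2 for the graph implied by the five paths
all_edges = set().union(*(edge_vector(p) for p in basis.values()))
all_nodes = set("".join(basis.values()))
print(len(all_edges) - len(all_nodes) + 2, len(basis))
```

Under the same assumption, the graph implied by the five paths has 10 edges and 7 nodes, so its cyclomatic complexity equals the number of basis paths.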
22
Knots - Control Flow Crossovers
  • Knot measure -- total number of points at which
    control flow lines cross

23
Syntactic Constructs
  • Examine effect of using specific control
    structures on defect rate
  • Is, by definition, language-specific
  • Can result in statistically significant
    relationships
  • e.g. Lo used this analysis to show that DO WHILE
    should be avoided in COBOL

24
External Measures
25
Computational Complexity
  • Examines algorithmic efficiency and use of
    machine resources (memory, I/O, storage)
  • Studies quantitative aspects of solutions to
    computational problems
  • Examples may include sorting efficiency for a
    database, managing I/O constraints across a large
    scale network, etc.

26
Psychological Complexity
  • Concerned with characteristics of software that
    affect human performance
  • Injection of defects (when and why does a
    programmer make errors?)
  • Ease of building the software (effort required)
  • Ease of maintenance (effort required)

27
Data Structure (Database)
  • Database size per program size (DBSPPS)
  • DBSPPS = DBS/PS
  • Where DBS is database size in bytes or
    characters
  • PS is program size in source instructions
  • Used in COCOMO model as a cost driver
  • Ordinal scale measure derived from DBSPPS
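
A small sketch of the ratio and of an ordinal rating derived from it. The rating thresholds are the ones commonly quoted for the DATA cost driver in COCOMO 81 and are stated here as an assumption to verify against the model documentation. The function names and sample sizes are hypothetical.

```python
def dbspps(db_size_bytes, program_size_instructions):
    """Database size per program size: DBS / PS."""
    return db_size_bytes / program_size_instructions

def data_rating(ratio):
    """Map the ratio to an ordinal rating.

    Thresholds are an assumption (as commonly quoted for COCOMO 81's DATA driver).
    """
    if ratio < 10:
        return "low"
    if ratio < 100:
        return "nominal"
    if ratio < 1000:
        return "high"
    return "very high"

# Hypothetical system: 2,000,000 bytes of data, 50,000 source instructions
ratio = dbspps(2_000_000, 50_000)
print(ratio, data_rating(ratio))
```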

28
Fan-in and Fan-out
  • Focus is the interaction among code modules
  • Fan-in = # of modules which call a given module
  • Fan-out = # of modules which are called by a
    given module
  • Or, more formally...

29
Fan-in and Fan-out
  • Fan-in of a module is the number of local flows
    terminating at the module, plus the number of
    data structures from which info is retrieved by
    the module
  • Fan-out of a module is the number of local flows
    that emanate from the module, plus the number of
    data structures (tables, arrays) that are updated
    by the module

30
Fan-in and Fan-out
  • Do fan-in and fan-out affect software quality?
  • Large fan-in modules may be interpolation or
    look-up routines - no defect correlation
  • Large fan-out often relates to high defect rate -
    has a high defect correlation
  • Large fan-in and fan-out is clearly bad

31
Fan-in and Fan-out
  • Information flow complexity
  • Henry and Kafura: Size × (fan-in × fan-out)^2
  • Shepperd: (fan-in × fan-out)^2
  • Henry and Kafura measure helps predict the number
    of software maintenance problems
  • Shepperd measure correlates with software
    development time
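
The two information flow measures can be sketched as follows. For simplicity the example counts fan-in and fan-out from a call graph only (callers and callees), leaving out the data-structure terms of the more formal definition. The call graph, module names and module size are hypothetical.

```python
def fan_in_out(calls):
    """Return (fan_in, fan_out) dicts for a call graph {caller: [callees]}."""
    fan_out = {module: len(set(callees)) for module, callees in calls.items()}
    fan_in = {module: 0 for module in calls}
    for callees in calls.values():
        for callee in set(callees):
            fan_in[callee] = fan_in.get(callee, 0) + 1
    return fan_in, fan_out

def shepperd(fan_in, fan_out):
    """Shepperd: (fan-in * fan-out) squared."""
    return (fan_in * fan_out) ** 2

def henry_kafura(size, fan_in, fan_out):
    """Henry and Kafura: size * (fan-in * fan-out) squared."""
    return size * (fan_in * fan_out) ** 2

# Hypothetical call graph
calls = {
    "main": ["parse", "report"],
    "report": ["parse", "lookup"],
    "parse": ["lookup"],
    "lookup": [],
}
fan_in, fan_out = fan_in_out(calls)

# "parse" is called by two modules and calls one; assume it is 40 lines long
print(shepperd(fan_in["parse"], fan_out["parse"]))
print(henry_kafura(40, fan_in["parse"], fan_out["parse"]))
```

Because the product is squared, a module with a zero fan-in or fan-out scores zero regardless of its size, which is one reason these measures are usually read alongside others.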

Henry, S. and D. Kafura. 1981. IEEE Transactions on
Software Engineering SE-7(5), pp. 510-518.
Shepperd, M. 1990. Software Engineering Journal
5(1), January, pp. 3-10.

32
Structure Metrics
  • Information flow metric (Henry & Selig)
  • HC = C × (fan-in × fan-out)^2
  • where C is the cyclomatic complexity

33
Structure Metrics
  • System complexity (Card & Glass)
  • Based on structural complexity (average fan-out
    squared) and data complexity (based on number of
    I/O variables and fan-out)
  • Quantified effect of complexity on error rate

34
Module Call Graph
  • Module - a contiguous sequence of program
    statements, bounded by boundary elements, having
    an aggregate identifier
  • Or, a distinct, named group of LOC
  • The module call graph shows which modules call
    each other, and what key information is passed
    among them

35
Module Call Graph
  • Example

36
Module Coupling Measures
  • Average number of calls per module (ANCPM)
  • Fraction of modules that make calls (FMC)
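
The slide names the two coupling measures without defining them. The sketch below takes the names at face value: ANCPM as total calls divided by number of modules, and FMC as the share of modules that call at least one other module. Both definitions, and the call graph, are assumptions made for illustration.

```python
def ancpm(calls):
    """Average number of calls per module (assumed: total calls / modules)."""
    total_calls = sum(len(callees) for callees in calls.values())
    return total_calls / len(calls)

def fmc(calls):
    """Fraction of modules that make calls (assumed: callers / modules)."""
    callers = sum(1 for callees in calls.values() if callees)
    return callers / len(calls)

# Hypothetical call graph {caller: [callees]}
call_graph = {
    "main": ["parse", "report"],
    "report": ["parse", "lookup"],
    "parse": ["lookup"],
    "lookup": [],
}
print(ancpm(call_graph), fmc(call_graph))
```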

37
Information Flow Measures
  • Types of information flows
  • Local direct flow
  • Module invokes a second module and passes info to it
  • Invoked module returns result to the caller
  • Local indirect flow
  • Invoked module returns info that is subsequently
    passed to a second invoked module
  • Global flow
  • Info flows from one module to another via a
    global data structure

38
IEEE-STD-982
  • Number of Entries and Exits per Module, m
  • Like fan-in and fan-out
  • m = entries + exits
  • Software Science measures

39
IEEE-STD-982
  • Graph-Theoretic Complexity
  • Static Complexity: C = Edges - Nodes + 1
  • Generalized Static Complexity: Based on summing
    resources needed for each module (e.g. storage,
    access time, etc.)
  • Dynamic Complexity: Complexity as it changes over
    time across a network

40
IEEE-STD-982
  • Cyclomatic complexity
  • Minimal Unit Test Case Determination
  • Determine number of independent paths through a
    module, to get minimum number of test cases for
    unit testing
  • Data or information flow complexity
  • Fan-in and fan-out of variables

41
IEEE-STD-982
  • Design Structure
  • Combines six parameters into a weighted average
  • Whether designed top down (Y/N)
  • Module inter-dependence
  • Module dependence on prior processing
  • Database size (# of elements)
  • Database compartmentalization
  • Module single entrance and exit (Y/N)
  • Weighting chosen to meet project needs
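
A sketch of how such a weighted average might be computed. It assumes each of the six parameters has already been normalized to a value between 0 and 1 (with the Y/N items coded as 1 or 0). The parameter names, values and weights below are invented for illustration and are not taken from the standard.

```python
def design_structure(values, weights):
    """Weighted average of the parameter values; weights need not sum to 1."""
    if values.keys() != weights.keys():
        raise ValueError("values and weights must cover the same parameters")
    total_weight = sum(weights.values())
    return sum(weights[k] * values[k] for k in values) / total_weight

# Hypothetical, already-normalized parameter values (0..1)
values = {
    "top_down": 1.0,
    "inter_dependence": 0.5,
    "prior_processing": 0.5,
    "database_size": 0.25,
    "compartmentalization": 0.75,
    "single_entry_exit": 1.0,
}
# Equal weighting; a project would choose weights to match its priorities
weights = {name: 1.0 for name in values}

score = design_structure(values, weights)
print(round(score, 4))
```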

42
Other Measures
  • Compiler measures
  • Size (bytes of compiled code)
  • Number of symbols and variables
  • Cross-reference of all labels
  • Statement count

43
Other Measures
  • Configuration Management Library Measures
  • Number of code modules
  • Number of versions of each module
  • History of change dates of each module
  • Module size
  • Number of related documents for each module

44
Availability Metrics
  • Most information systems are critical to
    day-to-day operations
  • Witness the recent crash of Google making news
    for only 15 minutes of non-availability
  • Availability depends on 1) how often the system
    goes down, and 2) how long it takes to restore it
    after a crash
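
These two factors are often combined into a single figure. The sketch below uses the common steady-state definition, availability = MTTF / (MTTF + MTTR), where MTTF is the mean time to failure and MTTR the mean time to restore. The formula and the sample numbers are an assumption added here, since the slide itself gives no equation.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability from mean time to failure and mean time to restore."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical system: fails on average every 999 hours, takes 1 hour to restore
a = availability(mttf_hours=999, mttr_hours=1)
print(f"{a:.3%}")
```

Halving the restore time improves availability about as much as doubling the time between failures, which is why both factors matter.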

45
Availability Metrics
  • Perfect availability (100%) is nice to dream of,
    but realistically, higher reliability is more
    expensive
  • Often measure availability by the number of 9s
    in the desired level of availability
  • Two nines is 99%, three nines is 99.9%, four
    nines is 99.99%, etc.
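
Each level of nines corresponds to an allowed amount of downtime per year. The sketch below works this out for a 365-day year. The function name is made up for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def downtime_minutes_per_year(nines):
    """Allowed downtime per year for an availability with the given number of nines."""
    unavailability = 10 ** (-nines)  # e.g. 3 nines -> 0.001
    return MINUTES_PER_YEAR * unavailability

for n in (2, 3, 4, 5):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nines: about {minutes:.1f} minutes per year")
```

Each additional nine cuts the allowed downtime by a factor of ten, from roughly 3.65 days per year at two nines to under an hour at four.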

46
Availability Metrics
47
Achieving High Availability
  • Many techniques are used to help ensure that high
    levels of availability are possible
  • Duplicate systems (clustering)
  • RAID data duplication
  • Duplicate power supplies
  • Independent power supplies
  • Uninterruptible power supplies (UPS)

48
Availability and Code Quality
  • Capers Jones demonstrated a clear connection
    between code quality (defect rate) and the
    corresponding mean time to failure (MTTF), which
    is a key aspect of availability
  • Consistent methods for measurement and
    definitions of terms are needed for further
    refinement

49
Customer Outage Data
  • In order to determine availability, the actual
    customer-visible system outage time needs to be
    collected
  • In order to get this data, the customer must
    place a very high priority on availability
  • This data could be used to identify software
    components which most reduce availability

50
Availability
  • We also expect that availability for a new system
    should increase over the first couple years of
    its use
  • Defect causal analysis can help address the root
    causes of defects, thereby improving availability