Title: Software Testing
1. Software Testing

"There are only two hard problems in Computer Science: naming things, cache invalidation, and off-by-one errors." (Phil Haack)

"Program testing can be used to show the presence of bugs, but never to show their absence!" (Edsger Dijkstra)
2. Outline
- Foundations: motivations, terminology
- Principles and Concepts
- Levels of Testing
- Test Process
- Techniques
- Measures
- Deciding when to stop
3. Defects are Bad
- At a minimum, defects in software annoy users.
- Glitchy software reflects poorly on the company issuing the software.
- If defects aren't controlled during a software project, they increase the cost and duration of the project.
- For safety-critical systems the consequences can be even more severe.
4. Spectacular Failures
- Ariane 5, 1996: rocket and cargo worth $500M destroyed.
- Therac-25: software defects between 1985 and 1987 led to six accidents. Three patients died as a direct consequence.
- Patriot Missile, 1991: failed to destroy an Iraqi Scud missile, which hit a barracks.
5. Controlling Defects in Software
- There are two ways of dealing with the potential for defects in software.
- The most obvious is to work to identify and remove defects that make it into the software.
- Another approach, one that often goes unnoticed, is to stop making errors in the first place. In other words, take action to prevent defects from ever being injected. This second approach is called defect prevention.
- Testing is one method of uncovering defects in software. (Inspection is another.)
- Testing might not be the most efficient method of uncovering defects, but for many companies it is their primary means of ensuring quality.
6. What is Testing?
- Testing is the dynamic execution of the software for the purpose of uncovering defects.
- Testing is one technique for improving product quality. Don't confuse testing with other, distinct techniques for improving product quality:
  - Inspections and reviews (sometimes called static testing)
  - Debugging
  - Defect prevention
  - Quality assurance
  - Quality control
7. Testing and its Relationship to Other Activities
8. Benefits of Testing
- Testing improves product quality (at least when the defects that are revealed are fixed).
- The rate and number of defects found during testing give an indication of overall product quality. A high rate of defect detection suggests that product quality is low. Finding few errors after rigorous testing increases confidence in overall product quality. Such information can be used to decide the release date.
- Defect data from testing may suggest opportunities for process improvement, preventing certain types of defects from being introduced into future systems.
9. Errors, Faults, and Failures! Oh My!
- Error (or mistake): a human action or inaction that produces an incorrect result.
- Fault (or defect): the manifestation of an error in code or documentation.
- Failure: an incorrect result.
10. Software Bugs
- 1947 log book entry for the Harvard Mark II.
11. Verification and Validation
- Verification and validation are two complementary testing objectives.
- Verification: comparing program outcomes against a specification. Are we building the product right?
- Validation: comparing program outcomes against user expectations. Are we building the right product?
- Verification and validation are accomplished using both dynamic testing and static evaluation (peer review) techniques.
12. Review Questions
- What are the three benefits of testing? In other words, why test?
- Can you have a fault without a failure? Can you have a fault without an error?
13. Principles of Testing
- "Program testing can be used to show the presence of bugs, but never to show their absence!" (Edsger Dijkstra) He is speaking, of course, about non-trivial programs.
- Mindset is important. The goal of testing is to demonstrate that the system doesn't work correctly, not that the software meets its specification. You are trying to break it. If you approach testing with the attitude of trying to show that the software works correctly, you might unconsciously avoid difficult tests that threaten your assumption.
- Should programmers test their own code?
14. Organization: Who Should Do the Testing?
- Developers shouldn't system test their own code.
- There is no problem with developers unit testing their own code (they are probably the most qualified to do so), but experience shows programmers are too close to their code to do a good job of system testing it.
- Independent testers are more effective.
- Levels of independence: independent testers on a team; testers independent of the team; testers independent of the company.
15. The cost of finding and fixing a defect increases with the length of time the defect remains in the product.
16. Cost to Correct Late-Stage Defects
- For large projects, a requirements or design error is often 100 times more expensive to find and fix after the software is released than during the phase in which the error was injected.
17. Correspondence between Development and Different Opportunities for Verification and Validation
18. Two Dimensions to Testing
19. Levels of Testing
- Unit: testing individual cohesive units (modules). Usually white-box testing done by the programmer.
- Integration: verifying the interaction between software components. Integration testing is done on a regular basis during development (possibly once a day/week/month depending on the circumstances of the project). Architecture and design defects typically show up during integration.
- System: testing the behavior of the system as a whole. Testing against the requirements (system objectives and expected behavior). Also a good environment for testing non-functional software requirements such as usability, security, performance, etc.
- Acceptance: used to determine whether the system meets its acceptance criteria and is ready for release.
20. Other Types of Testing
- Regression testing
- Alpha and beta testing: limited release of a product to a few select customers for evaluation before the general release. The primary purpose of a beta test isn't to find defects but rather to assess how well the software works in the real world under a variety of conditions that are hard to simulate in the lab. Customers' impressions start to form during beta testing, so the product should have release-like quality.
- Stress testing, load testing, etc.
- Smoke test: a very brief test to determine whether or not there are obvious problems that would make more extensive testing futile.
21. Regression Testing
- Imagine adding a 24-inch lift kit and monster truck tires to your sensible sedan.
- After making the changes you would of course test the new and modified components, but is that all that should be tested? Not by a mile!
22. Regression Testing (cont.)
- When making changes to a complex system there is no reliable way of predicting which components might be affected. Therefore, it is imperative that at least a subset of tests be run on all components.
- In this analogy, that means testing the heater, air conditioner, radio, cup holders, speedometer... hmm, that's interesting: there seems to be a problem with the speedometer. It significantly understates the speed of the car.
- On closer inspection you discover the speedometer has a dependency on wheel size. The implementation of the speedometer makes an assumption about the wheel size and how far the car will move with each rotation of the tires. Larger wheels mean the car travels a greater distance with each revolution.
- Who would have predicted that? Good thing we performed regression testing.
23. Regression Testing (cont.)
- Making sure new code doesn't break old code.
- Regression testing is selective retesting. You want to ensure that changes or enhancements don't impair existing functionality.
- During regression testing you rerun a subset of all test cases on old code to make sure new code hasn't caused old code to regress or stop working properly.
- It's not uncommon for a change in one area of code to cause a problem in another area. Designs based on loose coupling can mitigate this tendency, but regression testing is still needed to increase assurance that there were no unintended consequences of a program change.
24. Testing Objectives
- Conformance testing (aka correctness or functional testing): does the observed behavior of the software conform to its specification (SRS)?
- Non-functional requirements testing: have non-functional requirements such as usability, performance, and reliability been met?
- Regression testing: does an addition or change break existing functionality?
- Stress testing: how well does the software hold up under heavy load and extreme circumstances?
- Installation testing: can the system be installed and configured with reasonable effort?
- Alpha/beta testing: how well does the software work under the myriad of real-world conditions?
- Acceptance testing: how well does the software work in the user's environment?
25. Integration Strategies
- What doesn't work?
  - All-at-once or "big bang": waiting until all of the components are ready before attempting to build the system for the first time. Not recommended.
- What does work?
  - Top-down: high-level components are integrated and tested before low-level components are complete. Example high-level components: the life-cycle methods of a component framework, the screen flow of a web application.
  - Bottom-up: low-level components are integrated and tested before top-level components. Example low-level components: an abstract interface onto a database, a component to display an animated image.
26. Advantages of Incremental/Continuous Integration
- Easier to find problems. If there is a problem during integration testing it is most likely related to the last component integrated; knowing this usually reduces the amount of code that has to be examined to find the source of the problem.
- Testing can begin sooner. Big bang testing postpones testing until the whole system is ready.
27. Top-Down Integration
- Stubs and mock objects are substituted for as-yet-unavailable lower-level components.
- Stubs: a stub is a unit of code that simulates the activity of a missing component. A stub has the same interface as the low-level component it emulates but is missing some or all of its full implementation. Stubs return minimal values that allow the top-level components to function.
- Mock objects: mock objects are stubs that simulate the behavior of real objects. The term "mock object" typically implies a bit more functionality than a stub. A stub may return pre-arranged responses; a mock object has more intelligence. It might simulate the behavior of the real object or make assertions of its own.
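The stub/mock distinction can be sketched in a few lines of Java. Everything here is invented for illustration: a high-level ReportService is integrated top-down against a MailSender component that does not exist yet, first with a do-nothing stub and then with a hand-rolled mock that records calls for later assertions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical low-level component that is not yet implemented.
interface MailSender {
    void send(String to, String body);
}

// Hypothetical high-level component under test.
class ReportService {
    private final MailSender sender;
    ReportService(MailSender sender) { this.sender = sender; }
    void emailReport(String to) { sender.send(to, "weekly report"); }
}

// Stub: same interface, minimal do-nothing body, just enough to let
// the top-level component run.
class MailSenderStub implements MailSender {
    public void send(String to, String body) { /* do nothing */ }
}

// Hand-rolled mock: records the calls it receives so a test can
// make assertions about how it was used.
class MailSenderMock implements MailSender {
    final List<String> recipients = new ArrayList<>();
    public void send(String to, String body) { recipients.add(to); }
}

public class TopDownIntegrationExample {
    public static void main(String[] args) {
        // The stub lets the high-level code execute at all.
        new ReportService(new MailSenderStub()).emailReport("a@example.com");

        // The mock additionally lets us verify the interaction.
        MailSenderMock mock = new MailSenderMock();
        new ReportService(mock).emailReport("b@example.com");
        System.out.println(mock.recipients);  // [b@example.com]
    }
}
```

In practice a mocking library would generate the mock class, but the idea is the same: the test asserts on the recorded interaction, not just the return value.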
28. Bottom-Up Integration
- Scaffolding code, or drivers, is used in place of high-level code.
- One advantage of bottom-up integration is that it can begin before the system architecture is in place.
- One disadvantage of bottom-up integration is that it postpones testing of the system architecture. This is risky because architecture is a critical aspect of a software system that needs to be verified early.
29. Continuous Integration
- Top-down and bottom-up describe how you are going to integrate.
- Continuous integration is about when, or how often, you are going to integrate.
- Continuous integration means frequent integration, where frequent means daily, maybe hourly, but not longer than weekly.
- You can't find integration problems early unless you integrate frequently.
30. Test Process
- Test planning
- Test case generation
- Test environment preparation
- Execution
- Test results evaluation
- Problem reporting
- Defect tracking
31. Testing Artifacts/Products
- Test plan: who is doing what, when.
- Test case specification: specification of actual test cases, including preconditions, inputs, and expected results.
- Test procedure specification: how to run test cases.
- Test log: the results of testing.
- Test incident report: records and tracks errors.
32. Test Plan
- "A document describing the scope, approach, resources, and schedule of intended test activities. It identifies test items, the features to be tested, the testing tasks, who will do each task, and any risks requiring contingency planning." (IEEE Std)
33. Test Case
- A test case consists of a set of input values, execution preconditions, expected results, and execution post-conditions, developed to cover a certain test condition.
34. Oracle
- When you run a test there has to be some way of determining whether the test failed.
- For every test there needs to be an oracle that compares expected output to actual output in order to determine whether the test failed.
- For tests that are executed manually, the tester is the oracle. For automated unit tests, actual and expected results are compared with code.
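A minimal sketch of a code-based oracle. The unit under test (add) and its expected value are invented for illustration; the point is only that in an automated test the comparison itself is code.

```java
public class OracleExample {
    // Hypothetical unit under test.
    static int add(int a, int b) { return a + b; }

    public static void main(String[] args) {
        // The oracle: the expected result is known in advance and
        // compared to the actual result by code, not by a person.
        int expected = 7;
        int actual = add(3, 4);
        System.out.println(actual == expected ? "PASS" : "FAIL");  // PASS
    }
}
```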
35. Test Procedure
- Detailed instructions for the setup, execution, and evaluation of results for a given test case.
36. Incident Reporting
- What you track depends on what you need to understand, control, and estimate.
- Example incident report
37. Testing Strategies
- Two very broad testing strategies are:
  - White-box (transparent): test cases are derived from knowledge of the design and/or implementation.
  - Black-box (opaque): test cases are derived from external software specifications.
38. Testing Strategies
39. Black-Box Techniques
- Equivalence partitioning: tests are divided into groups according to the criterion that two test cases are in the same group if both are likely to find the same error. Classes can be formed based on inputs or outputs.
- Boundary value analysis: create test cases with values that are on the edges of equivalence partitions.
40. Equivalence Partitioning
- What test cases would you use to test the following routine?

      // This routine returns true if score is >= 50%
      // of possiblePoints; else it returns false.
      // This routine throws an exception if either
      // input is negative or score > possiblePoints.
      boolean isPassing(int score, int possiblePoints)

      ID   Input    Expected Result
      --   ------   ---------------
      1    -1,-2    Exception
      2    50,100   true
      ...
41. Equivalence Classes
1. Score/Possible Pts >= 50% (valid)
2. Score/Possible Pts < 50% (valid)
3. Score > Possible Pts (invalid)
4. Score < 0 (invalid)
5. Possible Pts < 0 (invalid)
42. Test Cases

      Test Case   Test Case Data   Expected Outcome   Classes Covered
      1           5,10             true               1
      2           30,30            true               1
      3           19,40            false              2
      4           -1,10            Exception          4

- Write test cases covering all valid equivalence classes. Cover as many valid equivalence classes as you can with each test case. (Note: there are no overlapping equivalence classes in this example.)
- Write one and only one test case for each invalid equivalence class. When testing a value from an equivalence class that is expected to produce an invalid result, all other values should be valid. You want to isolate tests of invalid equivalence classes.
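These test cases can be made executable. The slide gives only the signature and comments of isPassing, so the body below is an assumption consistent with them (>= 50% passes; negative inputs or score > possiblePoints throw).

```java
public class IsPassingExample {
    // Sketch of the routine under test; only the signature and the
    // commented behavior come from the slide, the body is assumed.
    static boolean isPassing(int score, int possiblePoints) {
        if (score < 0 || possiblePoints < 0 || score > possiblePoints) {
            throw new IllegalArgumentException("invalid inputs");
        }
        // Multiply rather than divide to avoid integer-division trouble.
        return 2 * score >= possiblePoints;
    }

    public static void main(String[] args) {
        System.out.println(isPassing(5, 10));    // class 1: ratio >= 50% -> true
        System.out.println(isPassing(30, 30));   // class 1 again -> true
        System.out.println(isPassing(19, 40));   // class 2: ratio < 50% -> false
        try {
            isPassing(-1, 10);                   // class 4: score < 0
        } catch (IllegalArgumentException e) {
            System.out.println("Exception");
        }
    }
}
```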
43. Boundary Value Analysis
- Rather than selecting an arbitrary element within an equivalence class, select values at the edges of the equivalence class.
- For example, given the class 1 <= input <= 12, you would select the values 0, 1, 12, and 13.
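A minimal sketch of that example in Java. The isValidMonth routine is a hypothetical unit under test whose valid class is 1 <= input <= 12; the loop exercises the values just outside and on each edge.

```java
public class BoundaryValueExample {
    // Hypothetical unit under test for the class 1 <= input <= 12.
    static boolean isValidMonth(int input) {
        return input >= 1 && input <= 12;
    }

    public static void main(String[] args) {
        // Boundary values: just below, on the lower edge,
        // on the upper edge, just above.
        int[] boundaryValues = {0, 1, 12, 13};
        for (int v : boundaryValues) {
            System.out.println(v + " -> " + isValidMonth(v));
        }
    }
}
```

Off-by-one errors (using > instead of >=, for example) live exactly at these edges, which is why boundary values find them when a mid-class value like 6 would not.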
44. Experience-Based Techniques
- Error guessing: testers anticipate defects based on experience.
45. Testing Effectiveness Metrics
- Defect density
- Defect removal effectiveness (efficiency)
- Code coverage
46. Defect Density
- Software engineers often need to quantify how buggy a piece of software is. Defect counts alone are not very meaningful, though.
- Are 12 defects a lot to have in a program? It depends on the size of the product (as measured by features or LOC).
- 12 defects in a 200-line program is 60 defects/KLOC → low quality.
- 12 defects in a 20,000-line program is 0.6 defects/KLOC → high quality.
- Defect counts are more interesting (meaningful) when tracked relative to the size of the software.
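The normalization is a one-line calculation; a minimal sketch reproducing the two cases above:

```java
public class DefectDensityExample {
    // Defect density = total known defects / size in KLOC.
    static double defectDensity(int defects, int linesOfCode) {
        return defects * 1000.0 / linesOfCode;
    }

    public static void main(String[] args) {
        // Same defect count, very different quality signal:
        System.out.println(defectDensity(12, 200));    // 60.0 defects/KLOC
        System.out.println(defectDensity(12, 20000));  // 0.6 defects/KLOC
    }
}
```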
47. Defect Density (cont.)
- Defect density is an important measure of software quality.
- Defect density = total known defects / size.
- Defect density is often measured in defects/KLOC. (KLOC = thousand lines of code.)
- Dividing by size normalizes the measure, which allows comparison between modules of different sizes.
- Size is typically measured in LOC or FPs.
- Measurement is over a particular time period (e.g., from system test through one year after release).
- You might calculate defect density after inspections to decide which modules should be rewritten or given more focused testing.
- Be sure to define LOC. Also, consider weighting defects: a severe defect is worse than a trivial one.
- Used as a performance target, the measure gives the wrong incentive.
48. Defect Density (cont.)
- Defect density measures can be used to track product quality across multiple releases.
49. Defect Removal Effectiveness
- DRE tells you what percentage of the defects present are being found (at a certain point in time).
- Example: when you started system test there were 40 errors to be found. You found 30 of them. The defect removal effectiveness of system test is 30/40, or 75%.
- The trick, of course, is calculating the latent number of errors at any one point in the development process.
- Solution: to calculate the latent number of errors at time x, wait a certain period after time x to learn just how many errors were present at time x.
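The 30-out-of-40 example works out as follows; a minimal sketch:

```java
public class DefectRemovalExample {
    // DRE = (defects found in a phase) / (defects present entering it),
    // expressed as a percentage.
    static double removalEffectiveness(int found, int presentAtStart) {
        return 100.0 * found / presentAtStart;
    }

    public static void main(String[] args) {
        // The slide's example: 40 latent errors at the start of
        // system test, 30 of them found.
        System.out.println(removalEffectiveness(30, 40) + "%");  // 75.0%
    }
}
```

The denominator (defects present) is only knowable in hindsight, which is why DRE is computed after a waiting period, as the slide notes.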
50. Example Calculation of Defect Removal Effectiveness
51. Levels of White-Box Code Coverage
- Another important testing metric is code coverage: how thoroughly have paths through the code been tested?
- The options are:
  - Statement coverage
  - Decision coverage (aka branch coverage)
  - Condition coverage
  - Path coverage
52. Statement Coverage
- Each line of code is executed.

      if (a)
          stmt1
      if (b)
          stmt2

- a=T, b=T gives statement coverage.
- a=T, b=F doesn't give statement coverage.
53. Decision Coverage
- Decision coverage is also known as branch coverage.
- The boolean condition at every branch point (if, while, etc.) has been evaluated to both T and F.

      if (a and b)
          stmt1
      if (c)
          stmt2

- a=T, b=T, c=T and a=F, b=?, c=F gives decision coverage (b is a don't-care in the second case).
54. Does Statement Coverage Guarantee Decision Coverage?

      if (a)
          stmt1

- If no, give an example of input that gives statement coverage but not decision coverage.
55. Condition Coverage
- Each boolean sub-expression at a branch point has been evaluated to both true and false.

      if (a and b)
          stmt1

- a=T, b=T and a=F, b=F gives condition coverage.
56. Condition Coverage
- Does condition coverage guarantee decision coverage?

      if (a and b)
          stmt1

- If no, give example input that gives condition coverage but not decision coverage.
57. Path Coverage
- To achieve path coverage you need a set of test cases that executes every possible route through a unit of code.
- Path coverage is impractical for all but the most trivial units of code.
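One way to see why: n sequential, independent if-statements yield 2^n paths, since each condition is independently taken or skipped. A small sketch tabulating the growth:

```java
public class PathCountExample {
    public static void main(String[] args) {
        // Each of n sequential, independent ifs doubles the number of
        // routes through the unit, so the path count is 2^n.
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + " ifs -> " + (1L << n) + " paths");
        }
    }
}
```

Loops make it worse still: a loop that may run 0..k times multiplies the path count by k+1, and an unbounded loop makes it infinite.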
58. Path Coverage
- How many paths are there in the following unit of code?

      if (a)
          stmt1
      if (b)
          stmt2
      if (c)
          stmt3
59. Path Coverage
- What inputs (test cases) are needed to achieve path coverage on the following code fragment?

      procedure AddTwoNumbers()
      top: print "Enter two numbers"
           read a
           read b
           print a+b
           if (a != -1) goto top
60. Deciding When to Stop Testing
- Stop when the marginal cost of finding another defect exceeds the expected loss from that defect.
- Both factors (the cost of finding another defect and the expected loss from that defect) can only be estimated.
- Stopping criteria should be determined at the start of a project. Why?
61. Peer Reviews
- Inspection
- Walkthrough
- Pair Programming
- Code Review
- Technical review vs. management review
62. Old Example
- Use equivalence partitioning to define test cases for the following function:

      // This function takes integer values for day,
      // month and year and returns the day of the
      // week in string format. The function returns
      // an empty string when given invalid input
      // values.
      // Year must be > 1752.
      // Example: DayOfWeek(12,31,2009) → "Thursday"
      // Example: DayOfWeek(13,13,2009) → ""
      String DayOfWeek(int month, int day, int year)
63. Equivalence Classes
1. Month < 1 (invalid)
2. Month > 12 (invalid)
3. Year > 1752 (valid)
4. Year < 1753 (invalid)
5. Month = 1, 0 < Day < 32 (valid)
6. Month = 1, Day >= 32 (invalid)
7. Month = 4, 0 < Day < 31 (valid)
8. Month = 4, Day >= 31 (invalid)
- Etc.
64. Test Cases

      Test Case   Test Case Data   Expected Outcome   Classes Covered
      1           1,1,2010         "Friday"           3, 5
      2           0,1,1999         ""                 1
      3           45,1,1999        ""                 2
      4           4,1,1752         ""                 4

- Write test cases covering all valid equivalence classes. Cover as many valid equivalence classes as you can with each test case.
- Write one and only one test case for each invalid equivalence class. When testing a value from an equivalence class that is expected to produce an invalid result, all other values should be valid. You want to isolate tests of invalid equivalence classes.
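A possible implementation of DayOfWeek built on java.time, given as a sketch; the body is an assumption, since only the signature and commented behavior come from the slide. The main method runs the test cases above.

```java
import java.time.LocalDate;
import java.time.format.TextStyle;
import java.util.Locale;

public class DayOfWeekExample {
    // Sketch of the slide's routine; java.time does the calendar work.
    static String dayOfWeek(int month, int day, int year) {
        if (year <= 1752) {
            return "";                       // year must be > 1752
        }
        try {
            return LocalDate.of(year, month, day)
                            .getDayOfWeek()
                            .getDisplayName(TextStyle.FULL, Locale.ENGLISH);
        } catch (java.time.DateTimeException e) {
            return "";                       // invalid month/day combination
        }
    }

    public static void main(String[] args) {
        System.out.println(dayOfWeek(1, 1, 2010));           // classes 3, 5: Friday
        System.out.println(dayOfWeek(0, 1, 1999).isEmpty()); // class 1: true
        System.out.println(dayOfWeek(4, 1, 1752).isEmpty()); // class 4: true
    }
}
```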