Title: Testing Distributed Software
1. Testing Distributed Software
- ECEN5053 Software Engineering of Distributed Systems
- University of Colorado, Boulder
2. Outline
- Testing for Software Security (if time)
- Testing Patterns for Distributed Services (Binder's book)
- Distributed Software Issues and approaches
- Threads and synchronization
- Path testing
- Life-Cycle
- Models and testing implications
- Test environment
3. Bibliography
- Robert V. Binder, Testing Object-Oriented Systems: Models, Patterns, and Tools, Addison-Wesley Object Technology Series, c. 2000, ISBN 0-201-80938-9 (1,190 pp.)
- John McGregor and David Sykes, A Practical Guide to Testing Object-Oriented Software, Addison-Wesley Object Technology Series, c. 2001, ISBN 0-201-32564-0 (393 pp.)
- Testing Applications On The Web
- Client-Server Testing
- Herbert H. Thompson and James Whittaker, Testing for Software Security, Dr. Dobb's Journal, November 2002, pp. 24-32.
4. Defects specific to concurrent, distributed software
- Failure to synchronize accesses to shared data values can lead to incorrect data values even though each thread is correctly computing its result.
- A specific node in a distributed system can fail to perform correctly even though every other node is performing properly.
- A network link between nodes can fail while the remainder of the system continues to function.
5. Basic concepts review -- thread (in Binder)
- thread -- independent context of execution within an operating system process.
- Has its own program counter and local data
- Smallest unit of execution that can be scheduled
- Most operating systems allow a single process to group multiple threads into a related set that shares some properties and keeps others private.
- Sequence of computations
- Simplest testing situation -- address various entry points and multiple paths through it
- Multiple threads complication -- threads share information
- Dependencies imply that thread order matters
6. Basic concepts review -- thread
- Developer must provide a synchronization mechanism to ensure correct order
- OO languages provide some natural means of synchronization by
- hiding attributes behind interfaces
- sometimes making threads correspond to objects
- synchronization is visible in the object interface
- messaging is a key element in synchronization
- Class testing would not likely detect synchronization defects
- Objects must interact to reveal a synchronization defect
7. Computational Models -- concurrent
- Design assumes multiple things are happening at the same time
- Testing for concurrency defects
- Focus on how two threads interact
- Methods should receive typical testing before being exercised in an interaction setting
- State-driven testing, for example
- Parallel processors in the same box -- we're ignoring this
8. Computational Models -- networked
- Physical concurrency achieved by linking together separate boxes with communication devices
- Communication devices operate at slower speed than the internal data bus
- Internet
- Difficulty in synchronizing the many independent machines
- Times of events are measured by local clocks
- Hard to determine thoroughness of testing
9. Computational Models -- (truly) distributed
- Multiple processes support a flexible architecture
- Number of participating objects can change
- Objects can be distributed
- across multiple processes on the same machine
- across multiple physical computers
- Components must be able to locate others
- naming service known to all components
- configuration file may list the machines authorized to participate
- part of the infrastructure may be standardized and reusable on many systems without modification
10. How are distributed models different?
- Nondeterminism
- Additional infrastructure
- Partial failures
- Time-outs
- Dynamic nature of the structure
- These basic differences directly impact testing.
11. Nondeterminism
- Difficult to exactly replicate a test run
- Execution order is determined by the scheduler of the operating system(s)
- Changes in programs not associated with the SUT can affect the order in which threads of the SUT are executed.
- Just because a defect does not recur does not mean it has been repaired correctly.
- Leads to certain techniques -- next slide
12. Nondeterminism -- continued
- More thorough testing at the class level
- Reviews should check whether adequate provision has been made for synchronization
- Dynamic class testing should determine if synchronization is working correctly in a controlled test environment
- Execute a large number of cases, trying to record the order in which events occur
- Specify a standard test environment (details follow)
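Recording the order of events across many runs can be sketched as follows. This is a minimal illustration, not a prescribed tool: the class `EventOrderRecorder`, its method names, and the event labels are all hypothetical, and the two worker threads stand in for whatever threads the SUT really has.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper: records the order in which events from several threads occur.
class EventOrderRecorder {
    private final List<String> log = new CopyOnWriteArrayList<>();

    void record(String event) {
        log.add(event);
    }

    List<String> snapshot() {
        return new ArrayList<>(log);
    }

    // Runs two worker threads that each record `steps` events and returns the observed order.
    static List<String> runOnce(int steps) {
        EventOrderRecorder recorder = new EventOrderRecorder();
        Thread a = new Thread(() -> { for (int i = 0; i < steps; i++) recorder.record("A" + i); });
        Thread b = new Thread(() -> { for (int i = 0; i < steps; i++) recorder.record("B" + i); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return recorder.snapshot();
    }

    public static void main(String[] args) {
        // Repeat the run many times; the interleaving may differ from run to run.
        Set<List<String>> distinct = new HashSet<>();
        for (int run = 0; run < 100; run++) {
            distinct.add(runOnce(3));
        }
        System.out.println("distinct interleavings observed: " + distinct.size());
    }
}
```

The number of distinct interleavings printed depends on the scheduler, so it is deliberately not treated as an expected result; only the per-thread order of events is guaranteed.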
13. Nondeterminism -- continued
- Specify a standard test environment
- Clean machine with minimal shared devices
- Identify apps that must run for a viable platform
- Add basic apps that would be on a typical machine
- Each test case describes what modifications are made to this standard environment
- include order in which processes are started
- Include a debugger in the standard environment
- Isolate these machines from the rest of the network, at least initially
14. Additional Infrastructure
- Many distributed objects rely on infrastructure provided by a third-party vendor
- Create regression suite to test compatibility between the app and the infrastructure
- Verify against new releases of the infrastructure
- Reconfiguration of the system
- Some infrastructures are self-modifying and reconfigure themselves when the system reconfigures itself (two moving targets!)
- A specific input data set can cause a different path to be executed because the previous path no longer exists
15. Partial failures
- May find a portion of the system cannot (does not) execute because of hardware or software failures on a machine hosting part of the system.
- How do you test this? Failures are simulated by
- removing or disabling network connections
- shutting down a node in the network
16. Time-Outs
- Networked systems avoid deadlocks by setting timers when a request is sent to another system
- If no response is received within the allotted time, the request is abandoned
- Software must perform correctly when the request is answered and when it isn't, albeit perhaps differently
- Testing should include a variety of loading configurations on machines in the network
- Testing should cause partial failures to ensure software works correctly when time-outs are reached
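A test for both outcomes of a timed request might look like the sketch below. The class `TimeoutRequester`, the simulated provider delay, and the status strings are assumptions made for illustration; a real requester would call through its infrastructure rather than sleep in a local task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutRequester {
    // Sends a request to a simulated provider that answers after providerDelayMs,
    // and abandons the request if no answer arrives within timeoutMs.
    static String request(long providerDelayMs, long timeoutMs) {
        CompletableFuture<String> reply = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(providerDelayMs);   // stands in for network and provider latency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "answer";
        });
        try {
            return reply.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            reply.cancel(true);                  // the request is abandoned
            return "timed-out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "failed";
        } catch (ExecutionException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(request(10, 2000));   // provider responds well inside the window
        System.out.println(request(2000, 50));   // provider is far slower than the window
    }
}
```

Varying `providerDelayMs` around the time-out value is one way to imitate different load conditions without a second machine.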
17. Dynamic Nature of the Structure
- Often built with the capability of changing its configuration
- Specific requests may be directed dynamically depending on the load on various machines
- Allow a variable number of machines to participate
- Tests need to be replicated with a variety of configurations
18. Threads
- Design trade-off
- Increasing the number of threads can simplify algorithms but increase sequencing problems
- Reducing the number of threads can make the software more rigid and less efficient
- Synchronization -- when 2 threads access the same memory location, must ensure non-interference
- Two threads may try to execute a method that modifies a data value at the same time
- Java -- language keyword
- C++ -- individual developer constructs
- OO language localizes access to the modifier method for the common data attribute
19. Specifying synchronization
- Design documents specify synchronization in the guard clauses of the UML state diagram
- In Java, the synchronized keyword appears in the declaration of a method
- C++ -- specify synchronization mechanisms as classes; creation of an instance of a monitor object indicates the location where synchronization is needed.
- During reviews, verify that synchronization is specified to occur in the right place, on the right method, etc.
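As a small illustration of the Java case, the sketch below marks the modifier method with `synchronized` and then exercises it from two threads. The class `SharedCounter` and its methods are hypothetical names chosen for the example.

```java
class SharedCounter {
    private int value = 0;

    // `synchronized` in the declaration: only one thread at a time may run this modifier.
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Two threads each call increment() `perThread` times on the same object.
    static int runTwoThreads(int perThread) {
        SharedCounter counter = new SharedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("final count: " + runTwoThreads(10000));
    }
}
```

Removing `synchronized` from `increment()` makes lost updates possible, but they will not necessarily show up on any given run, which is why a review of where the keyword appears complements dynamic testing.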
20. Defining useful paths
- A path is a set of logically contiguous statements that is executed when a specific input set is used.
- Multiple new control structures result in new paths, possibly an indeterminate number or, worse, infinite.
- A path is a set of logically contiguous statements between the definition (assignment) of a value for a variable and the places where the variable is used.
- Other definitions of path are more useful in distributed systems
21. Significant paths in distributed software
- Richard Carver and K. C. Tai, 1998
- SYN-event is any action that involves synchronization of two threads. The spawning of one thread by another is one example.
- SYN-sequence is a sequence of SYN-events that will occur in a specified order. This is one type of path through the program code.
- Design test cases that correspond to SYN-sequences.
- The main thread is not counted in the number of paths because it executes regardless of the data set
22. What type of events are SYN-events?
- Creation and destruction of a thread object
- Creation and destruction of an object that encapsulates a thread
- Sending of a message from an object on one thread to an object on another thread
- One object controls another by putting it to sleep or waking it up
- Tester traces paths from one of these events to another
- Minimal SYN-path coverage covers one path through control statements between one SYN-event and another
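One way to design a test case around a particular SYN-sequence is to force that sequence with an explicit synchronization aid. The sketch below uses a `CountDownLatch` for this; it is an illustration under assumptions, not the Carver and Tai tooling, and the class `SynSequenceDriver` and the message strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class SynSequenceDriver {
    // Runs two threads that each send one message to a shared mailbox, forcing
    // the SYN-sequence in which `firstSender` ("A" or "B") sends first.
    static List<String> run(String firstSender) {
        List<String> mailbox = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch firstDone = new CountDownLatch(1);
        String secondSender = firstSender.equals("A") ? "B" : "A";

        Thread first = new Thread(() -> {
            mailbox.add("msg from " + firstSender);
            firstDone.countDown();                 // SYN-event: wake the second sender
        });
        Thread second = new Thread(() -> {
            try {
                firstDone.await();                 // SYN-event: sleep until the first sender is done
                mailbox.add("msg from " + secondSender);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        second.start();                            // start order does not matter; the latch fixes the sequence
        first.start();
        join(first);
        join(second);
        return new ArrayList<>(mailbox);
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("A"));
        System.out.println(run("B"));
    }
}
```

Each call to `run` corresponds to one test case for one SYN-sequence, so the two sequences are covered deterministically rather than left to the scheduler.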
23. SYN-paths are necessary, not sufficient
- An object that has its own thread should receive thorough testing as a class with its aggregated attributes before being tested in interaction with other objects.
- The SYN-path technique identifies defects related to synchronization.
- Use of the SYN-path technique does not replace the need to use conventional testing to find defects unrelated to synchronization errors.
24. Thread models
- An object has its own personal thread
- An object is visited by an active thread as necessary
- In either case, there must be a mechanism to prevent multiple threads from operating in the same modifier method at the same time.
- Beware the differences in
- Java thread behavior among operating systems
- Windows, Sun, Unix, MacOS -- a cross section
- Variants of Unix
- Different models of workstations with different options installed
25. Design for Testability
- In languages with single inheritance, e.g. Java, the root of every inheritance hierarchy should be an interface.
- The test harness must inherit from a Tester parent class and also implement the interface.
- public class TimerTester extends Tester implements TimeObservableIf
- Whenever possible, define a default constructor that can be used during unit testing without requiring dependencies on many other classes. The default constructor need not be public.
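The TimerTester declaration can be fleshed out into a small sketch. Only the three names `TimerTester`, `Tester`, and `TimeObservableIf` come from the slide; the methods on the interface, the `SimpleTimer` class, and the body of `Tester` are assumptions invented for the example.

```java
// Hypothetical interface at the root of the hierarchy under test.
interface TimeObservableIf {
    void tick();
    int elapsedTicks();
}

// Hypothetical class under test.
class SimpleTimer implements TimeObservableIf {
    private int ticks;

    SimpleTimer() { }                 // non-public default constructor: no collaborators needed

    public void tick() { ticks++; }
    public int elapsedTicks() { return ticks; }
}

// Hypothetical tester parent class holding behavior shared by all harnesses.
abstract class Tester {
    private int failures;

    protected void check(boolean condition, String message) {
        if (!condition) {
            failures++;
            System.out.println("FAIL: " + message);
        }
    }

    int failures() { return failures; }

    abstract void runAll();
}

// The harness uses its one superclass slot for Tester and takes on the
// type of the class under test by implementing the interface.
class TimerTester extends Tester implements TimeObservableIf {
    private final TimeObservableIf target = new SimpleTimer();

    public void tick() { target.tick(); }
    public int elapsedTicks() { return target.elapsedTicks(); }

    void runAll() {
        check(elapsedTicks() == 0, "new timer starts at zero");
        tick();
        tick();
        check(elapsedTicks() == 2, "two ticks are counted");
    }

    public static void main(String[] args) {
        TimerTester tester = new TimerTester();
        tester.runAll();
        System.out.println("failures: " + tester.failures());
    }
}
```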
26. Life-cycle Testing in a distributed system
- This life cycle is measured by the lifetime of the infrastructure components instantiated to support the system.
- Test plan
- Test run starting from nothing instantiated, followed by bringing the system up, executing a series of actions, and finally bringing the system completely down.
- Did each action complete successfully?
- Were all resources allocated by the system released after the system was terminated?
- Can the system be restarted successfully?
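The three questions in the test plan map directly onto checks in a life-cycle test. The sketch below is illustrative only: `LifeCycleSystem`, its resource names, and its action names are hypothetical stand-ins for a real system and its infrastructure.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical system under test that acquires named resources while it runs.
class LifeCycleSystem {
    private final Set<String> heldResources = new HashSet<>();
    private boolean running;

    void start() {
        if (running) throw new IllegalStateException("already running");
        heldResources.add("registry-port");
        heldResources.add("worker-pool");
        running = true;
    }

    String execute(String action) {
        if (!running) throw new IllegalStateException("not running");
        return "done:" + action;
    }

    void stop() {
        heldResources.clear();
        running = false;
    }

    int heldResourceCount() { return heldResources.size(); }
    boolean isRunning() { return running; }
}

class LifeCycleTest {
    // One pass through the life cycle: nothing instantiated, up, actions, down, restart.
    static boolean run() {
        LifeCycleSystem sut = new LifeCycleSystem();
        boolean ok = !sut.isRunning() && sut.heldResourceCount() == 0;
        sut.start();
        for (String action : new String[] {"register", "lookup", "invoke"}) {
            ok &= sut.execute(action).equals("done:" + action);   // did each action complete?
        }
        sut.stop();
        ok &= sut.heldResourceCount() == 0;                        // were all resources released?
        sut.start();                                               // can the system be restarted?
        ok &= sut.isRunning();
        sut.stop();
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("life-cycle test passed: " + run());
    }
}
```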
27. More on life-cycle test plan
- Many paths through a complete life-cycle
- Select representative paths to provide maximum coverage
- Effective life-cycle tests validate correct answers and that
- Item being tested interacts correctly with its environment
- All acquired resources have been released
- Other elements with which the tested piece interacts have been left in correct states.
28. Basic Client/Server Model -- simplest distributed system
- Multiple clients all have access to the server
- Single point of failure -- the server
- What should be tested?
- Can the server deliver correct results to the correct client under a steady load of a moderate number of simultaneous requests over an extended period of time?
- Can the server correctly handle a rapidly increasing load? Test set should present a large number of test cases at increasing arrival rates.
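A rough sketch of the "correct result to the correct client" check is given below. Everything here is an assumption for illustration: the `serve` method stands in for a real server call, a thread pool stands in for separate client machines, and doubling the client count is only one way to raise the load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class EchoLoadTest {
    // Hypothetical server operation: the reply must carry the caller's id and request number.
    static String serve(int clientId, int requestNo) {
        return clientId + ":" + requestNo;
    }

    // Simulates `clients` concurrent clients, each sending `requestsPerClient` requests,
    // and returns the number of replies that did not match the request that was sent.
    static int misdelivered(int clients, int requestsPerClient) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        List<Future<Integer>> results = new ArrayList<>();
        for (int c = 0; c < clients; c++) {
            final int clientId = c;
            Callable<Integer> client = () -> {
                int wrong = 0;
                for (int r = 0; r < requestsPerClient; r++) {
                    if (!serve(clientId, r).equals(clientId + ":" + r)) {
                        wrong++;
                    }
                }
                return wrong;
            };
            results.add(pool.submit(client));
        }
        int total = 0;
        try {
            for (Future<Integer> f : results) {
                total += f.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return total;
    }

    public static void main(String[] args) {
        // Present the same checks at an increasing number of simultaneous clients.
        for (int clients = 1; clients <= 16; clients *= 2) {
            System.out.println(clients + " clients, misdelivered replies: " + misdelivered(clients, 1000));
        }
    }
}
```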
29. Commercial infrastructures
- Simple client/server model generalized to eliminate single point of failure
- Client can select among multiple servers
- Models abstract away the networking details, reducing the error rate of the former primitive pipe-and-socket structures
- Commercial infrastructures hide the communication details and support testing systems using the models
- CORBA
- RMI
- Web Services
30. CORBA -- as a style
- Central element is an object request broker (ORB)
- One object uses this to communicate with other objects
- A CORBA-compliant system provides services
- that allow one object to find others based on objects being requested, location, or load
- needed to connect two objects written in different languages or executing on different types of machines
- Multiple vendors provide CORBA implementations with competitive differences
31. CORBA -- continued
- CORBA standard assumes
- machines may have different operating systems and different memory layouts
- components that comprise the distributed system may be written in different languages
- infrastructure may change its configuration based on the distribution of objects and the types of machines in the network
- Flexible
32. CORBA style testing implications
- Does the system work correctly regardless of its infrastructure configuration?
- Can the test cases be made more reusable by building them based on the services of the standard infrastructure?
- Does a specific new release of the infrastructure integrate efficiently with existing applications?
- Regression test suite and test harness allow new releases of the infrastructure to be tested prior to being integrated into products
33. RMI -- remote method invocation
- Java provides a simplified distributed environment that assumes every machine is running a JVM
- Structure is similar to CORBA
- simpler because of less flexible assumptions
- A registry (broker) object is provided
- All participating objects must know what port the registry listens on for messages
- Testing implications
- Which CORBA test patterns can be used in RMI-based systems? Test cases may be structured the same as many CORBA test cases.
34. A Generic Distributed Component Model
35. Review of process communication
[Figure: generic architecture. Labels: Process A, Process B, service requester, service provider, stub (surrogate for provider), skeleton (surrogate for requester), communication and location services.]
36. Test the Requester
- Stimulus
- Most of its behaviors have been previously tested using class testing techniques
- Still need to test timing requirements
- When the requester sends an asynchronous (one-way) message, test cases must investigate the effect of the length of time it takes to receive a reply.
- Sender immediately proceeds
- May be written to expect an answer within a window; may not be written to wait if the answer has not arrived by then.
- Test this interaction under different amounts of latency (different load conditions)
37. Provider -- central role
- Performs behaviors and sometimes returns information to the requester.
- Complete interface of the provider can be tested using the basic class testing techniques.
- Behaviors expected to be invoked by other distributed objects require specialized testing (stay tuned!)
- Provider is registered with the infrastructure with information about its services.
38. The many faces of the Provider
- May be an object waiting actively for a request to be received.
- May not be, too.
- First, instantiated
- Then the request is forwarded to it.
- Source of timing differences.
- Any provider that can be dynamically instantiated upon request should be exercised using test cases starting from
- instantiated scenarios
- non-instantiated scenarios.
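The two starting scenarios can be exercised against a stand-in for the activating part of the infrastructure. In this sketch the `ActivatingRegistry` class, the provider name "upper", and the use of `UnaryOperator<String>` as the provider type are all assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Hypothetical infrastructure piece that instantiates a provider on its first request.
class ActivatingRegistry {
    private final Map<String, Supplier<UnaryOperator<String>>> factories = new HashMap<>();
    private final Map<String, UnaryOperator<String>> live = new HashMap<>();
    private int activations;

    void register(String name, Supplier<UnaryOperator<String>> factory) {
        factories.put(name, factory);
    }

    boolean isInstantiated(String name) { return live.containsKey(name); }
    int activations() { return activations; }

    // Forwards the request, instantiating the provider first when necessary.
    String request(String name, String payload) {
        UnaryOperator<String> provider = live.get(name);
        if (provider == null) {
            provider = factories.get(name).get();
            live.put(name, provider);
            activations++;
        }
        return provider.apply(payload);
    }

    // Exercises the same request from both starting points and compares the answers.
    static boolean bothScenariosAgree() {
        ActivatingRegistry registry = new ActivatingRegistry();
        registry.register("upper", () -> s -> s.toUpperCase());
        boolean startedCold = !registry.isInstantiated("upper");
        String cold = registry.request("upper", "ping");   // non-instantiated scenario
        String warm = registry.request("upper", "ping");   // instantiated scenario
        return startedCold && cold.equals("PING") && warm.equals(cold)
                && registry.activations() == 1;
    }

    public static void main(String[] args) {
        System.out.println("scenarios agree: " + bothScenariosAgree());
    }
}
```

A fuller test would also time the two requests, since the extra instantiation step is the source of the timing differences.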
39. Stubs and skeletons
- Stub -- surrogate for the provider in the requester process.
- Keeps requester from knowing semantics of the infrastructure
- Some infrastructures are smart enough to reconfigure themselves depending on whether the two objects are actually
- in the same process
- in different processes on the same machine
- on different machines with different architectures
- Reconfiguring involves adding or removing stubs, skeletons, or other method calls
40. Stubs and Skeletons
- Skeleton -- surrogate for the requester in the provider process.
- Keeps provider from knowing semantics of the infrastructure
- If infrastructure reconfigures itself to add or remove stubs, skeletons, or other method calls, this changes the path through which the request travels.
- Interaction test suites should be designed to execute a set of tests over the path corresponding to each possible configuration.
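Running one set of tests over the path for each configuration can be sketched as below. The `Provider` interface, the string-based wire format, and the two configurations are assumptions for the example; a real stub and skeleton would be generated by the infrastructure and would cross a process boundary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

interface Provider {
    int add(int a, int b);
}

class RealProvider implements Provider {
    public int add(int a, int b) { return a + b; }
}

// Skeleton: surrogate for the requester inside the provider process.
class Skeleton {
    private final Provider target;

    Skeleton(Provider target) { this.target = target; }

    String dispatch(String wireRequest) {
        String[] parts = wireRequest.split(",");
        int result = target.add(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
        return Integer.toString(result);
    }
}

// Stub: surrogate for the provider inside the requester process.
class Stub implements Provider {
    private final Skeleton remote;

    Stub(Skeleton remote) { this.remote = remote; }

    public int add(int a, int b) {
        return Integer.parseInt(remote.dispatch(a + "," + b));   // stands in for a network hop
    }
}

class ConfigurationSuite {
    // Runs the same checks over the request path of every configuration.
    static Map<String, Boolean> runAll() {
        Map<String, Provider> configurations = new LinkedHashMap<>();
        configurations.put("same process (direct call)", new RealProvider());
        configurations.put("separate processes (stub + skeleton)",
                new Stub(new Skeleton(new RealProvider())));

        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, Provider> entry : configurations.entrySet()) {
            Provider p = entry.getValue();
            results.put(entry.getKey(), p.add(2, 3) == 5 && p.add(-4, 4) == 0);
        }
        return results;
    }

    public static void main(String[] args) {
        runAll().forEach((name, passed) -> System.out.println(name + ": " + passed));
    }
}
```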
41. Specifying Distributed Objects -- IDL
- The spec for service providers is usually written in an interface definition language (IDL).
- simpler than a programming language
- provides information useful for testing purposes
- Signature
- Main portion of the IDL spec is the usual signature for a method.
- Method name, types of its params, return type
- Standard techniques for sampling these values
- (continued ...)
42. Specifying Distributed Objects -- IDL
- One-way -- asynchronous message
- Must be tested over a complete life cycle
- Requester may need the requested information before the provider sends it.
- Message may result in an exception being thrown by the provider.
- Tests should specifically investigate whether exceptions are caught in the correct object.
- In, out -- defines whether the requester is to provide this information or expect the parameter to be modified by the provider.
- (continued)
43. Specifying Distributed Objects -- IDL
- in, out (continued)
- Tests of a method specifying an out parameter must
- locate the returned object (because most OO languages do not handle this case gracefully)
- verify that it has the correct state
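Java has no out parameters, so one common workaround is a small holder object that the provider fills in. The sketch below assumes that approach; `OutHolder`, `Account`, and `AccountProvider` are hypothetical names, not part of any particular infrastructure.

```java
// Hypothetical holder that carries an out parameter back to the requester.
class OutHolder<T> {
    T value;
}

class Account {
    final String owner;
    final int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }
}

class AccountProvider {
    // `result` is the out parameter: the provider fills it in; the return value reports success.
    boolean lookup(String owner, OutHolder<Account> result) {
        if (owner == null || owner.isEmpty()) {
            return false;                       // out parameter is left untouched
        }
        result.value = new Account(owner, 100);
        return true;
    }
}

class OutParameterTest {
    static boolean run() {
        OutHolder<Account> out = new OutHolder<>();
        boolean found = new AccountProvider().lookup("kim", out);
        // First locate the returned object, then verify its state.
        return found
                && out.value != null
                && out.value.owner.equals("kim")
                && out.value.balance == 100;
    }

    public static void main(String[] args) {
        System.out.println("out-parameter test passed: " + run());
    }
}
```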
44. Pre- and Postconditions and Invariants
- Remember them??
- Last semester, we looked at building test cases based on pre- and postconditions
- Distributed components should be designed not to know their location relative to other components
- Components DO have to know about the expanded set of possible errors.
- Postconditions now include exceptions for scenarios such as a service that is not available from the specified provider
- What else?
45. Test cases for implicit, ever-present errors
- Ever-present errors are those that the developer doesn't bother to state in the postcondition as a possible outcome because, "Everybody knows that!"
- Test the requester for proper handling of
- Provider not found
- Provider busy -- time-out
- Null provider reference
- Any null pointer is a problem, but some infrastructures invalidate a pointer after some amount of inactivity. Oops.
- Null out-parameter
- Include in the tests a checklist for the type of class being constructed
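The checklist of ever-present errors can be turned into one test case per item. This sketch assumes a requester that converts each failure into a status string; the exception classes, the `CautiousRequester` class, and the use of a one-element array as the out-parameter are all hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical exceptions an infrastructure might raise.
class ProviderNotFoundException extends RuntimeException { }
class ProviderBusyException extends RuntimeException { }

class CautiousRequester {
    // Calls the provider and reports a status instead of letting infrastructure failures escape.
    // Element 0 of the returned array plays the role of an out-parameter.
    static String fetch(Supplier<String[]> providerCall) {
        if (providerCall == null) {
            return "null provider reference";
        }
        try {
            String[] out = providerCall.get();
            if (out == null || out.length == 0 || out[0] == null) {
                return "null out-parameter";
            }
            return "ok:" + out[0];
        } catch (ProviderNotFoundException e) {
            return "provider not found";
        } catch (ProviderBusyException e) {
            return "provider busy (time-out)";
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch(() -> new String[] {"42"}));
        System.out.println(fetch(() -> { throw new ProviderNotFoundException(); }));
        System.out.println(fetch(() -> { throw new ProviderBusyException(); }));
        System.out.println(fetch(null));
        System.out.println(fetch(() -> new String[] {null}));
    }
}
```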
46. Temporal Testing
47. Test Environment
- Using a wrapper to test the OUT interface of a class
48. Test Environment
- Interaction testing
- 2 objects distributed in separate processes
- Testing a complete protocol is important
- In distributed systems, test whether messages are really received in order, even if they are sent in the sequence described by the protocol.
- more to this ...
49. Model-Specific Test Cases
- Basic Client/Server Model
50. Model-Specific Test Cases -- cont.
- Generic Distribution Model
- Different models make different assumptions about the type of application and the deployment environment
- Tests should focus on these assumptions
- Some can be tested during guided inspections
51. More re Generic Distribution Model
- Language issues
- RMI assumes Java; CORBA assumes nothing
- They will probably work correctly; the app may not
- Documentation may not explain well what is possible regarding data types used in classes written in different languages
- e.g. Pass an array between a Java requester and a C++ provider
- Variation between CORBA types and Java types is greater than between CORBA and C++
- Are return objects handled properly?
52. More re Generic Distribution Model
- Platform Independence Issues
- Distribution models are platform independent
- Implicit requirements about size of available memory or processor speed cause the software to work differently in different environments
- Infrastructure Tests -- situations can corrupt the infrastructure
- Stubs and skeletons produced automatically by an IDL compiler ... and subsequently edited by developers
53. More re Generic Distribution Model
- Compatibility Tests
- New releases of the infrastructure should be tested against the existing release of the app
- Failure Recovery -- partial failure
- Configure a system with
- a main machine where the locator portion of the infrastructure is running
- and for which a server is instantiated on a specific machine
- When the server registers, disconnect the network cable of that server
- From main, attempt to use that server's services ...
54. More re Generic Distribution Model
- Dynamic modification of infrastructure
- CORBA allows modifications during program execution
- Changes configuration of the system
- Can change timing and execution paths
- Try a variety for appropriate situations given the components that are dynamically available
55. More re Generic Distribution Model
- Logic-specific Test Cases
- With asynchronous messages, events may occur in a variety of sequences
- Order of returns to multiple requests varies
- Design must assume it makes no difference
- Test must try as many combinations as possible
- Requested Object Unavailable
- User-entered names can be misspelled or refer to a resource that no longer exists
- Handle the thrown exception in the right place in the requester; abort the operation gracefully; give the user another chance
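For a small number of outstanding requests, trying every order of returns is feasible. The sketch below enumerates all arrival orders and counts those that change the outcome; the `combine` method is a hypothetical stand-in for the requester logic, and the reply values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReplyOrderTest {
    // Hypothetical requester logic: combines replies and must not depend on arrival order.
    static int combine(List<Integer> repliesInArrivalOrder) {
        int total = 0;
        for (int reply : repliesInArrivalOrder) {
            total += reply;
        }
        return total;
    }

    // Collects every ordering of `items` (from position k onward) into `out`.
    static void permute(List<Integer> items, int k, List<List<Integer>> out) {
        if (k == items.size()) {
            out.add(new ArrayList<>(items));
            return;
        }
        for (int i = k; i < items.size(); i++) {
            Collections.swap(items, k, i);
            permute(items, k + 1, out);
            Collections.swap(items, k, i);   // restore before trying the next choice
        }
    }

    // Returns the number of arrival orders whose result differs from the expected one.
    static int orderSensitiveCases(List<Integer> replies, int expected) {
        List<List<Integer>> orders = new ArrayList<>();
        permute(new ArrayList<>(replies), 0, orders);
        int bad = 0;
        for (List<Integer> order : orders) {
            if (combine(order) != expected) {
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        int bad = orderSensitiveCases(Arrays.asList(1, 2, 3), 6);
        System.out.println("arrival orders that changed the result: " + bad);
    }
}
```

The number of orderings grows factorially, so beyond a handful of replies a sampled subset of orders is the practical reading of "as many combinations as possible".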
56. Testing the Internet
- Large, dynamic, distributed system
- Servers added and removed continuously
- Applications must operate in the presence of partial failures
- missing systems
- nonexistent addresses
- Some applications form the basic business environment for the company running the Web site
- specific, stringent requirements for reliability and security
57. McGregor and Sykes Don't Cover
- Performance
- See Event Sequence Analysis to measure actual performance and to know where narrow margins are for meeting performance requirements.
- Security
- Can test features intended to increase security
- Can test common sources of technical security compromises
- Very hard to test the human weak links