Title: Parallel Execution of Test Runs for Database Application Systems
1Parallel Execution of Test Runs for Database
Application Systems
- Donald Kossmann ETH Zurich, i-TV-T AG
- Florian Haftmann i-TV-T AG
- Eric Lo ETH Zurich
2Some Facts
- Microsoft spends 50% of their development cost on testing
- SAP product cycle: 18 months
- 6 months to execute tests
- Testing is the most expensive phase of the software development cycle
3Observation
- The more test runs, the better
- However, it takes more time!
- Goal: Optimize testing time
4Definition: Test Run Ti
- A sequence of requests
- Test Run "Login" (2 requests)
Req Action Value Expected Result
1 Fill-in Login Fill-in Password Eric
2 Click Sign-in
5Expected Result
Req Action Value Expected Result
1 Fill-in ID Fill-in Password Eric
2 Click Sign-in
6More Definitions
- Failed Test Run: At least one request does not return the expected result
- Test Database D: The state of an application database at the beginning of each test
- Database Reset R: Bring the database back to D
7A Test Run Fails When
- The application has a real bug
- Or the test database is in the wrong state due to the execution of earlier test runs
- => Carry out resets to find real bugs
8Resetting the Test Database?
- P.O. Insertion
- Count P.O.
Database PurchaseOrder: P1
9Resetting the Test Database?
- P.O. Insertion
- Count P.O.
TA: Insert Purchase Order P2
Database PurchaseOrder: P1
10Resetting the Test Database?
- P.O. Insertion
- Count P.O.
TA: Insert Purchase Order P2 <TA Success>
Database PurchaseOrder: P1, P2
11Resetting the Test Database?
- P.O. Insertion
- Count P.O.
TB: Get Total Purchase Order, Expected Result: 1, Actual Result: 1 <TB Success>
Database PurchaseOrder: P1
12Resetting the Test Database?
- P.O. Insertion
- Count P.O.
TA: Insert Purchase Order P2
TB: Get Total Purchase Order, Expected Result: 1, Actual Result: 2 <TB Fails>
Database PurchaseOrder: P1, P2
13Database Reset is needed!
- P.O. Insertion
- Count P.O.
TA: Insert Purchase Order P2
Database PurchaseOrder: P1, P2
14Database Reset
- Resetting a database for a large-scale application takes about 2 minutes!
- Back-of-the-envelope calculation
- 10,000 test runs => 10,000 resets x 2 min ≈ 2 weeks on DB resets for 1 complete test
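The estimate can be checked with a few lines of arithmetic. This is only a sketch of the slide's calculation, assuming the worst case of one reset before every test run; the variable names are illustrative.

```python
# Back-of-the-envelope estimate of time spent purely on database resets.
test_runs = 10_000      # worst case: one reset before every test run
reset_minutes = 2       # approximate cost of a single reset

total_minutes = test_runs * reset_minutes
total_days = total_minutes / (60 * 24)

print(f"{total_minutes} minutes = {total_days:.1f} days")
```

20,000 minutes is a little under 14 days of wall-clock time, which is where the "2 weeks" figure comes from.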
15Reordering Test Runs
- P.O. Insertion
- Count P.O.
Database PurchaseOrder: P1, P2
16Reordering Test Runs
- P.O. Insertion
- Count P.O.
Database PurchaseOrder: P1
17Order Matters!
- P.O. Insertion
- Count P.O.
Database PurchaseOrder: P1, P2
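The effect of ordering can be reproduced with a toy model of the two test runs from the preceding slides. This is a minimal sketch, not the authors' tool: `count_resets` and the list-based database are hypothetical, TA inserts P2, and TB expects exactly one purchase order.

```python
def count_resets(order):
    """Run the test runs in the given order; reset whenever one fails."""
    initial = ["P1"]                 # test database D: one purchase order
    db = list(initial)
    resets = 0
    for name in order:
        if name == "TA":             # TA: insert purchase order P2
            db.append("P2")
        elif name == "TB":           # TB: expects a total of 1 purchase order
            if len(db) != 1:
                db = list(initial)   # reset to D; the rerun of TB then passes
                resets += 1
    return resets


print(count_resets(["TA", "TB"]))    # TB runs on a modified database
print(count_resets(["TB", "TA"]))    # TB runs first, on a fresh database
```

Running TA before TB costs one extra reset, while running TB before TA costs none.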
18Our Previous Work (CIDR 2005)
- A test run depends on a correct state of a database
- Control the database state
- Reduce the number of database resets
- Algorithms to optimize order of test runs
- No parallelism in testing
19Can we do better if we have > 1 machine?
20Parallel Testing is a Two-dimensional Problem!
- Fully utilize the available resources
- Load Balancing!
- Same as on a single machine, we still have to control the database state
- Reduce the database resets!
21More about the Problem
- Regression test
- Later stage of the development cycle
- Minor changes between versions
- Execute the same set of test runs
- Version 1.1
- Execute test T1 T2 T3 T4
- Version 1.2 (Bug fixed and/or minor changes)
- Execute test T1 T2 T3 T4
22Parallel Testing: Shared-Nothing vs. Shared-Database
23Shared-Nothing (SN)
- If I work for IBM, I can install
- N applications
- N databases
- N machines
- One more machine
- More admin. work!
- More license fees!
- Applications do not SHARE the database
[Figure: test runs (T4, T31, T12, T5, ...) are distributed over Machine 1 ... Machine N, each with its own Application and Database]
24Shared-Database (SDB)
- If I work for PoorEric.com, I install
- N threads (e.g., open N browsers)
- 1 database
- 1 machine
- The threads SHARE the database
- Test runs interfere with each other
- Cannot scale as well as Shared-Nothing
[Figure: test runs (T4, T31, T12, T5, ...) are distributed over Thread 1 ... Thread N, which share one Application and one Database]
25Parallel Testing Framework
[Figure: a Scheduler assigns test runs (T1, T5, T2, T6, ...) to Machine/Thread 1 ... Machine/Thread N, each running the Application against a Database with an optional Reset; the Scheduler keeps a History per machine (M1 ... MN) and a Conflicts DB]
26Parallel Testing is a Two-dimensional Problem!
- Fully utilize the available resources
- Load Balancing!
- Same as on a single machine, we still have to control the database state
- Reduce the database resets!
27Execution Strategies
- Optimistic Execution
- Reset the database only when it is a must
- Example: R T1 T2 T3 T4
- Optimistic Execution with recorded conflicts
- Avoid executing a test run twice
- Example (Wk 1): R T1 T2 T3 T4 R T4 T5
- Example (Wk 2): R T1 T2 T3 (Next is T4?)
- Slice Reordering Heuristics
- Slice: A sequence of test runs without conflicts
- Example: R T1 T2 T3 T4 R T4 T5
- Collect <slice>s during each test
- Graph Reordering Heuristics
- R T4 T5
- Conflict: <T1 T2 T3> -> T4
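Collecting slices from an execution trace such as `R T1 T2 T3 T4 R T4 T5` can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: `collect_slices` is a hypothetical helper, and it assumes that every `R` after the first one was triggered by the failure of the test run immediately before it, which is then retried right after the reset.

```python
def collect_slices(trace):
    """Split an execution trace into slices and the conflicts it reveals."""
    # Cut the trace into segments at every reset marker "R".
    segments, current = [], []
    for token in trace.split():
        if token == "R":
            if current:
                segments.append(current)
            current = []
        else:
            current.append(token)
    if current:
        segments.append(current)

    slices, conflicts = [], []
    for i, segment in enumerate(segments):
        if i < len(segments) - 1:
            # The last run of this segment failed and caused the next reset.
            *passed, failed = segment
            slices.append(passed)
            conflicts.append((tuple(passed), failed))
        else:
            slices.append(segment)
    return slices, conflicts


slices, conflicts = collect_slices("R T1 T2 T3 T4 R T4 T5")
print(slices)
print(conflicts)
```

For the example trace this yields the slices `<T1 T2 T3>` and `<T4 T5>` and the single conflict `<T1 T2 T3> -> T4`.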
28Parallel Testing: Shared-Nothing (SN)
29Shared-Nothing
[Figure: a Scheduler with a Conflicts DB takes test runs from the Test Run Input Queue (T1, T5, T2, T6, ...) and dispatches them to Machine 1 and Machine 2, each with its own Application, Database, and Reset]
30Test 1
M1: R
M2: R
Test Run Input Queue: T1 T5 T2 T6 T3 T7 T8
Conflicts DB: (empty)
31Test 1
M1: R T1
M2: R
Test Run Input Queue: T5 T2 T6 T3 T7 T8
Conflicts DB: (empty)
32Test 1
M1: R T1
M2: R T5
Test Run Input Queue: T2 T6 T3 T7 T8
Conflicts DB: (empty)
33Test 1
M1: R T1 T2
M2: R T5
Test Run Input Queue: T6 T3 T7 T8
Conflicts DB: (empty)
34Test 1
M1: R T1 T2
M2: R T5 T6
Test Run Input Queue: T3 T7 T8
Conflicts DB: (empty)
35Test 1
M1: R T1 T2 T3
M2: R T5 T6
Test Run Input Queue: T7 T8
Conflicts DB: (empty)
36Test 1
M1: R T1 T2 T3
M2: R T5 T6 R
Test Run Input Queue: T7 T8
Conflicts DB: (empty)
37Test 1
M1: R T1 T2 T3
M2: R T5 T6 R T6
Test Run Input Queue: T7 T8
Conflicts DB: T5 -> T6
38Test 1
M1: R T1 T2 T3 R
M2: R T5 T6 R T6
Test Run Input Queue: T7 T8
Conflicts DB: T5 -> T6, <T1 T2> -> T3
39Test 1
M1: R T1 T2 T3 R
M2: R T5 T6 R T6 T7
Test Run Input Queue: T8
Conflicts DB: T5 -> T6, <T1 T2> -> T3
40Test 1
M1: R T1 T2 T3 R
M2: R T5 T6 R T6 T7 T8
Test Run Input Queue: (empty)
Conflicts DB: T5 -> T6, <T1 T2> -> T3
41Test 1
M1: R T1 T2 T3 R T3
M2: R T5 T6 R T6 T7 T8
Test Run Input Queue: (empty)
Conflicts DB: T5 -> T6, <T1 T2> -> T3
42Shared-Nothing - Slice
- 3 major principles
- 1. The slices in the input queue are ordered by reordering the slices on each machine locally and then merging the partial orders
- 2. All test runs of the same slice are executed on the same machine
- 3. The scheduler makes sure conflicting slices are executed on different machines as much as possible
43Collect Slices
M1: R T1 T2 T3 R T3
M2: R T5 T6 R T6 T7 T8
Test Run Input Queue: (empty)
Conflicts DB: T5 -> T6, <T1 T2> -> T3
44Reordering Slices
Local Order M1: R T3 R
Local Order M2: R T6 R
45Merge Partial Order
Local Order M1: R T3 R
Local Order M2: R T6 R
Merged into: Test Run Input Queue
46Shared-Nothing - Slice
- 3 major principles
- 1. The slices in the input queue are ordered by reordering the slices on each machine locally and then merging the partial orders
- 2. All test runs of the same slice are executed on the same machine
- 3. The scheduler makes sure conflicting slices are executed on different machines as much as possible
47Test 10
M1: R
M2: R
Test Run Input Queue: <T6 T7 T8> <T3> <T1 T2> <T5>
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
48Test 10
M1: R T3
M2: R
Test Run Input Queue: <T6 T7 T8> <T1 T2> <T5>
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
49Test 10
M1: R T3
M2: R T6 T7 T8
Test Run Input Queue: <T1 T2> <T5>
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
50Test 10
M1: R T3 (Conflict?)
M2: R T6 T7 T8
Test Run Input Queue: <T1 T2> <T5>
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
52Test 10
M1: R T3 (Conflict?)
M2: R T6 T7 T8
Test Run Input Queue: <T5> <T1 T2>
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
53Test 10
M1: R T3 T5
M2: R T6 T7 T8
Test Run Input Queue: <T1 T2>
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
54Test 10
M1: R T3 T5
M2: R T6 T7 T8 T1 T2
Test Run Input Queue: (empty)
Conflicts DB: T5 -> T6, <T1 T2> -> T3, T3 -> T1
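The Test 10 walk-through can be replayed with a small scheduler sketch. Several things here are assumptions made for illustration, not details from the talk: the conflicts are simplified to pairs, the queue order is taken to be `<T3> <T6 T7 T8> <T1 T2> <T5>`, machines are served round-robin rather than whenever they become idle, and `schedule` and `conflicts_with` are hypothetical names.

```python
from collections import deque

# Conflicts DB from the slides, simplified to pairs: CONFLICTS[t] holds the
# test runs that make t fail if they ran earlier on the same database state.
CONFLICTS = {"T6": {"T5"}, "T3": {"T1", "T2"}, "T1": {"T3"}}


def conflicts_with(slice_, state):
    """True if any run in slice_ is known to fail after a run in state."""
    return any(CONFLICTS.get(t, set()) & set(state) for t in slice_)


def schedule(slices, n_machines):
    queue = deque(slices)
    logs = [["R"] for _ in range(n_machines)]   # full schedule per machine
    state = [[] for _ in range(n_machines)]     # runs since the last reset
    machine = 0
    while queue:
        # Prefer the first queued slice that does not conflict with what this
        # machine has executed since its last reset.
        choice = next(
            (s for s in queue if not conflicts_with(s, state[machine])), None
        )
        if choice is None:
            # Every queued slice conflicts: reset, then run the head slice.
            choice = queue[0]
            logs[machine].append("R")
            state[machine] = []
        queue.remove(choice)
        logs[machine].extend(choice)
        state[machine].extend(choice)
        machine = (machine + 1) % n_machines
    return logs


logs = schedule([["T3"], ["T6", "T7", "T8"], ["T1", "T2"], ["T5"]], 2)
print(logs)
```

Under these assumptions M1 ends up with `R T3 T5` and M2 with `R T6 T7 T8 T1 T2`: the slice `<T1 T2>` is kept away from M1 because of `T3 -> T1`, and no additional reset is needed.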
55Parallel Testing: Shared-Database (SDB)
56Shared-Database
[Figure: a Scheduler takes test runs from the Test Run Input Queue (T6, T2, T5, T1, ...) and dispatches them to Thread 1 and Thread 2, each with its own Application and Reset, all sharing one Database]
Conflicts DB: <T1 T5 T6> -> T2
57Shared-Database, Slice
- Similar to Shared-Nothing
- Different definition of a slice
- Different scheduling decisions
58Performance Experiments
- Simulation
- 10,000 test runs (0 min to 3 min)
- 10,000 (low) / 5M (high) conflicts
- Uniform / Zipf distribution
- SN: 1 to 50 machines
- SDB: 1 to 10 threads
- Real data: 61 test runs
- Reporting average running time / resets of the last 10 tests (total 30 tests)
59Shared-DB (Real Data)
Approach     1 thread        5 threads       10 threads
             Time   Resets   Time   Resets   Time   Resets
Optimistic   41     7        22     6.6      16     5.8
Graph (MWD)  37     3.5      19     4.2      13     4.2
Slice        31     3        18     3.8      12     4.2
Time unit: minutes
61Experiment Summary
- Shared-Nothing (SN)
- Linear scale-up, sometimes super-linear
- Shared-Database (SDB)
- Scales up to 10 threads
- Heuristics
- Slice is the winner
- How about other distributions (e.g., Zipf)?
- Similar results
62Conclusions and Future Work
- Parallel execution of test runs?
- It SCALES!
- Studied a dynamic scheduling approach for the SN and SDB architectures
- Control the database state => Minimize DB resets
- and load balancing
- How to generate test runs and test data for database application programs?
- More in the paper
63Thank You
Main contact: eric.lo_at_inf.ethz.ch
64Parallel Testing Framework
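[Figure: a Scheduler with a Conflicts DB and a History per machine (M1 ... MN) takes test runs from the Test Run Input Queue (T4, T31, T5, T12, ...) and dispatches them to Machine/Thread 1 ... Machine/Thread N, each running the Application against a Database with an optional Reset; test runs shown in progress or in the history: T7, T9, T25, T13, T17, T8]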
65Example: Shared-Nothing, Slice
Test 1: M1: R T1 T4 ... R ...
        M2: R T2 T3 T5 R ...
Test 1: M1: R T1 T2 T3 R T3
        M2: R T5 T6 R T6 T7 T8
Conflicts DB: <T1 T2> -> T3
[Figure: a Scheduler takes test runs from the Test Run Input Queue (T6, T2, T5, T1, ...) and dispatches them to Machine 1 and Machine 2, each with its own Application, Database, and Reset]
66Shared-Database, Slice
Test 1: M1: R T1 T4 ... R ...
        M2: R T2 T3 T5 R ...
Th1: T1 T2 T2 T3 R R
Th2: R T5 T6 T7 T8 T8
Conflicts DB: <T1 T5 T6> -> T2
[Figure: a Scheduler takes test runs from the Test Run Input Queue (T6, T2, T5, T1, ...) and dispatches them to Thread 1 and Thread 2, each with its own Application and Reset, all sharing one Database]
67Shared-Database, Slice - Test 1
Test 1: M1: R T1 T4 ... R ...
        M2: R T2 T3 T5 R ...
Th1: T1 T2 T2 T3 R R
Th2: R T5 T6 T7 T8 T8
Conflicts DB: <T1 T5 T6> -> T2
[Figure: a Scheduler takes test runs from the Test Run Input Queue (T6, T2, T5, T1, ...) and dispatches them to Thread 1 and Thread 2, each with its own Application and Reset, all sharing one Database]
68SDB Subsequent Tests
Test N, after reordering
Test Run Input Queue: <T2 T7 T3> <T8> <T1 T5 T6>
69Additional Issues - SDB
- How to do a database reset when a test run fails?
- Deferred: The database reset is deferred and the failed test run is re-scheduled at the end
- Eager: Abort all concurrent test runs and reset immediately
- Lazy: Do not accept new test runs, let the active test runs finish, and then reset
70Shared-Nothing Performance
- Achieve linear scale-up?
- Yes
- The best among the three
- Slice
- How about low conflict?
- Similar results
- How about other distributions (e.g., Zipf)?
- Similar results
71Shared-Database Performance
- Scale-up if increasing the number of threads?
- Yes, up to 10 threads
- If the number of conflicts is high, > 10 test threads might hurt performance
- The best among the three
- Slice
72SN Simulation (High Conflict)
Approach     1 machine       5 machines      10 machines     50 machines
             Time   Resets   Time   Resets   Time   Resets   Time   Resets
Optimistic   358    1788     72     1787     36     1775     6.8    1753
Slice        306    867      64     1098     32     1038     6.4    1048
Graph (MWD)  359    1792     71     1784     36     1780     7.6    1767
Time unit: hours
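The scale-up can be read directly off the Slice row of this table. The snippet below is a small worked calculation using only the running times reported above; `speedup` and `efficiency` are the usual definitions (baseline time divided by parallel time, and speed-up divided by the number of machines).

```python
# Slice running times (hours) from the SN high-conflict simulation table.
slice_hours = {1: 306, 5: 64, 10: 32, 50: 6.4}

baseline = slice_hours[1]
speedup = {m: baseline / t for m, t in slice_hours.items()}
efficiency = {m: speedup[m] / m for m in slice_hours}

for m in slice_hours:
    print(f"{m:>2} machines: speed-up {speedup[m]:.1f}, efficiency {efficiency[m]:.2f}")
```

The speed-up stays close to the number of machines (about 4.8 on 5 machines and about 47.8 on 50), which is consistent with the near-linear scale-up reported for Shared-Nothing.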
74SDB Simulation
Approach     1 thread        5 threads       10 threads      50 threads
             Time   Resets   Time   Resets   Time   Resets   Time   Resets
Optimistic   358    1788     160    1385     157    1231     258    1425
Slice        306    867      120    793      112    796      259    1422
MWD          359    1792     164    1396     156    1251     204    1067
Time unit: hours
75Optimistic
- Let the test runs execute until a DB reset is really needed!
- Optimistic: R T1 T2 T3 T4
- If a test run T reports a failure
- Reset the database and then rerun T
- Then, if T still reports a failure => A real bug!
- Example
- Optimistic: R T1 T2 T3 T4 <T4 failure> R T4
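The optimistic strategy on this slide can be sketched as a short loop. This is a minimal illustration, not the authors' code: `run_optimistic`, `execute`, and `reset` are hypothetical names, and the toy database at the bottom simply makes T4 fail whenever T3 has run since the last reset.

```python
def run_optimistic(test_runs, execute, reset):
    """Run tests in order; reset and retry a test run only when it fails.

    execute(t) -> bool and reset() are callbacks supplied by the caller.
    Returns the executed schedule and the test runs that indicate real bugs.
    """
    schedule, real_bugs = ["R"], []
    reset()
    for t in test_runs:
        schedule.append(t)
        if execute(t):
            continue
        # Failure: the database state may be wrong. Reset and rerun once.
        reset()
        schedule += ["R", t]
        if not execute(t):
            real_bugs.append(t)      # still fails on a fresh database
    return schedule, real_bugs


# Toy application: T4 fails if T3 ran since the last reset.
state = []

def reset():
    state.clear()

def execute(t):
    ok = not (t == "T4" and "T3" in state)
    state.append(t)
    return ok

schedule, bugs = run_optimistic(["T1", "T2", "T3", "T4"], execute, reset)
print(" ".join(schedule), bugs)
```

With this toy setup the schedule is `R T1 T2 T3 T4 R T4` and no real bug is reported, matching the example on the slide.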
76Optimistic
- Optimistic: Record all failures (conflicts) to avoid executing a test run twice
- Test on Monday: R T1 T2 T3 R T3 ... Tn
- <T1 T2> -> T3
- Test on Tuesday: R T1 T2 (Next? T3?)
- Test on Tuesday: R T1 T2 R T3 ... Tn
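Using the recorded conflicts on the next test can be sketched like this. The representation is an assumption made for illustration: `conflicts` maps a test run to the set of runs that preceded it when it failed, and a reset is planned as soon as any of them has run since the last reset (the slide does not say whether one or all predecessors are required). `plan_with_conflicts` is a hypothetical name.

```python
def plan_with_conflicts(test_runs, conflicts):
    """Insert resets ahead of test runs known to fail in the current state."""
    plan, since_reset = ["R"], set()
    for t in test_runs:
        if conflicts.get(t, set()) & since_reset:
            # A recorded conflict applies: reset first instead of running t,
            # watching it fail, and running it a second time.
            plan.append("R")
            since_reset = set()
        plan.append(t)
        since_reset.add(t)
    return plan


# Conflict recorded on Monday: <T1 T2> -> T3
tuesday = plan_with_conflicts(["T1", "T2", "T3"], {"T3": {"T1", "T2"}})
print(" ".join(tuesday))
```

For the Monday conflict this produces `R T1 T2 R T3`, so T3 is executed only once on Tuesday.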
77Reordering Heuristics - Slice
- Slice: A sequence of test runs without conflicts
- Collect <slice>s during each test
- Test Monday: R T1 T2 T3 R T3 T4 T5 R T5
- Slices: <T1 T2> <T3 T4> <T5>
- Run test again?
- Reorder slices according to the conflicts collected: <T5> <T3 T4> <T1 T2>
78Test Yesterday and Test Today
Yesterday: M1: R T1 T2 T3 R T3
           M2: R T5 T6 R T6 T7 T8
79False Positive
- Case 1
- Buggy Application
- => Tx Fails
- Consistent DB State
- Case 2
- Buggy Application
- => Tx Succeeds
- Inconsistent DB State
- The inconsistent DB helps the test run by coincidence!
- This is a tradeoff between speed and nitpicking accuracy