Software Analysis via Data Analysis

About This Presentation

Description:

2006/04/04. DIMACS EAA:ICS. Franc Brglez, Raleigh, NC, USA. Software Analysis via Data Analysis ... Problems are NP-hard; require heuristics or (worst case) ...

Slides: 18

Provided by: matthi162

Learn more at: http://archive.dimacs.rutgers.edu

Transcript and Presenter's Notes

Title: Software Analysis via Data Analysis

1
Software Analysis via Data Analysis
Matthias F. Stallmann, 2006/04/04, DIMACS EAAISC planning meeting

Based on joint work with
Xiao Yu Li, Seattle, WA, USA
Franc Brglez, Raleigh, NC, USA
2
Software versus Algorithms

Problems are NP-hard; they require heuristics or (worst-case) exponential algorithms.
Simple algorithms must be compared with
  CPLEX and other ILP solvers
  metaheuristics (SA, GA, particle swarm, ants, etc.) with lots of adjustable parameters
We want a black-box comparison with (in many cases) no prior understanding of (some of) the algorithms.

3
Data Presentations: based on CPLEX runs for
(permutations of) a single instance

Descriptive Statistics
mean / median / stdev
600.4 / 25.3 / 1767.3
Histogram
Percent solved
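
The gap between mean and median above is what a few very slow runs produce. A minimal sketch of the same three statistics, using made-up runtimes (the values and the `describe` helper are illustrative assumptions, not the CPLEX data from the talk):

```python
import statistics

def describe(runtimes):
    """Return the three descriptive statistics named on the slide."""
    return {
        "mean": statistics.mean(runtimes),
        "median": statistics.median(runtimes),
        "stdev": statistics.stdev(runtimes),  # sample standard deviation
    }

# Hypothetical runtimes in seconds: most runs are fast, two are very slow.
runtimes = [12.0, 18.5, 21.0, 25.3, 25.3, 30.1, 44.0, 2500.0, 5200.0]
summary = describe(runtimes)

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With data shaped like this, the mean and standard deviation are dominated by the two slow runs while the median stays near the typical run, which is why a histogram or a percent-solved plot tends to say more than the summary line alone.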

4
Stretching the Truth, or making it clearer?

More information here, but we need to look carefully

5
A more normal distribution: CPLEX with different
settings under the same conditions
6
uf250..87 QT2/QT1 vs UW2/UW1 (1)
What about random 3-SAT instances?
exp. d. 16.7/17.2
[Plot: solvability vs. runtime (seconds)]
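
A solvability curve of this kind can be read as the fraction of runs solved within a given runtime. The sketch below is one plausible way to compute it; the `solvability` function, the timeout convention, and the sample runtimes are assumptions for illustration, not taken from the talk.

```python
def solvability(runtimes, timeout):
    """Return (t, fraction of all runs solved within t) for each solved run.

    Runs whose recorded time reaches the timeout count as unsolved, so the
    curve may level off below 1.0.
    """
    solved = sorted(t for t in runtimes if t < timeout)
    total = len(runtimes)
    return [(t, (i + 1) / total) for i, t in enumerate(solved)]

# Hypothetical runtimes in seconds; the last run hit a 60-second timeout.
runtimes = [0.5, 1.2, 3.4, 8.0, 60.0]
curve = solvability(runtimes, timeout=60.0)

for t, fraction in curve:
    print(f"{t:6.1f} s  {fraction:.0%} solved")
```

Plotting two such curves on the same axes gives a black-box comparison of two solvers without assuming anything about how either one works.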
7
uf250..87 QT2/QT1 vs UW2/UW1 (2)
UW1 performs the same as QT1 (t-test: t = 1.88 < 1.97)
exp. d. 16.7/17.2
exp. d. 12.3/12.5
[Plot: solvability vs. runtime (seconds)]
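
The t-test comparison above checks a t statistic against a critical value of 1.97. The slides do not say which variant of the test was used; the sketch below assumes Welch's unequal-variance form and uses synthetic exponential samples, so both the `welch_t` helper and the data are illustrative only.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples (an assumed variant)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err

CRITICAL = 1.97  # threshold quoted on the slides

rng = random.Random(0)
# Hypothetical runtimes for two solvers, both drawn with mean 17 seconds.
solver_a = [rng.expovariate(1 / 17.0) for _ in range(128)]
solver_b = [rng.expovariate(1 / 17.0) for _ in range(128)]

t = welch_t(solver_a, solver_b)
verdict = "no significant difference" if abs(t) < CRITICAL else "significant difference"
print(f"t = {t:.2f}: {verdict}")
```

A value of |t| below the threshold supports "performs the same"; a value above it supports "outperforms".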
8
uf250..87 QT2/QT1 vs UW2/UW1 (3)
UW1 performs the same as QT1 (t-test: t = 1.88 < 1.97)
UW2 outperforms UW1 by a factor of 31 ...
exp. d. 0.39/0.29
9
uf250..87 QT2/QT1 vs UW2/UW1 (4)
UW1 performs the same as QT1 (t-test: t = 1.88 < 1.97)
UW2 outperforms UW1 by a factor of 31 ...
QT2 outperforms UW2 slightly (t-test: t = 2.24 > 1.97)
exp. d.
exp. d. 0.31/0.28
exp. d. 0.39/0.29
exp. d. 0.21/0.19
10
Sources of Data Distributions: first, 100 random
instances (SAT)
median 4.8
mean 21.2
stdev 42.9
Heavy tail
11
Not all instances are equal: here's an easy one
(128 permutations + original)
median 4.4
mean 6.7
stdev 6.5
Exponential
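
The slides do not spell out how the permuted instances are produced. One common way to build such a family for a CNF formula is to rename variables and reorder clauses and literals, which leaves satisfiability unchanged. The sketch below assumes that scheme; `permute_cnf` and the tiny formula are hypothetical.

```python
import random

def permute_cnf(clauses, num_vars, seed):
    """Return an equivalent CNF: variables renamed, clauses and literals reordered.

    Literals are DIMACS-style signed integers; polarities are preserved.
    """
    rng = random.Random(seed)
    new_name = list(range(1, num_vars + 1))
    rng.shuffle(new_name)  # new_name[v - 1] is the new index of variable v
    permuted = []
    for clause in clauses:
        lits = [(1 if lit > 0 else -1) * new_name[abs(lit) - 1] for lit in clause]
        rng.shuffle(lits)
        permuted.append(lits)
    rng.shuffle(permuted)
    return permuted

# Hypothetical 4-variable formula standing in for a real benchmark instance.
original = [[1, -2, 3], [-1, 2, 4], [2, -3, -4]]
family = [original] + [permute_cnf(original, 4, seed) for seed in range(1, 129)]
print(len(family), "instances: the original plus 128 permutations")
```

Running one solver over the whole family gives a runtime distribution for what is logically a single instance.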
12
... and here's a harder one
median 84.8
mean 126.7
stdev 117.6
Exponential
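
For an exponential distribution the standard deviation equals the mean and the median is ln 2 times the mean, so the "Exponential" and "Heavy tail" labels above can be sanity-checked from the three reported numbers. The sketch below does that for the figures on slides 10 to 12; the `exponential_check` helper is an illustrative assumption, not the authors' procedure.

```python
import math

def exponential_check(mean, median, stdev):
    """Return (stdev/mean, median/mean); an exponential gives about (1, ln 2)."""
    return stdev / mean, median / mean

# (mean, median, stdev) as reported on the slides.
slides = {
    "slide 10 (100 random instances)": (21.2, 4.8, 42.9),
    "slide 11 (easy instance)": (6.7, 4.4, 6.5),
    "slide 12 (harder instance)": (126.7, 84.8, 117.6),
}

for label, (mean, median, stdev) in slides.items():
    cv, ratio = exponential_check(mean, median, stdev)
    print(f"{label}: stdev/mean = {cv:.2f}, "
          f"median/mean = {ratio:.2f} (ln 2 = {math.log(2):.2f})")
```

The two single-instance slides give a stdev/mean ratio close to 1, while the 100-instance sample gives a ratio of about 2, consistent with the heavy-tail label.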
13
Another wrinkle: stochastic search. First, same
seed and 32 permuted instances + original
number of flips
median 42484
mean 62457
stdev 86551
slightly worse than Exponential
14
versus 33 different seeds: same distribution?
number of flips
median 36637
mean 50185
stdev 83226
slightly worse than Exponential
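
To make the seed experiment concrete, here is a minimal sketch of a WalkSAT-style random walk that reports the number of flips per run. It is a generic local search written for illustration, not the solver used in the talk; the `walksat_flips` function, its parameters, and the small formula are all assumptions.

```python
import random

def walksat_flips(clauses, num_vars, seed, max_flips=10_000, noise=0.5):
    """Return the number of flips needed to satisfy the formula, or None on cutoff."""
    rng = random.Random(seed)
    assign = [rng.choice([False, True]) for _ in range(num_vars + 1)]  # index 0 unused

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def cost(var):
        # Number of unsatisfied clauses if var were flipped.
        assign[var] = not assign[var]
        broken = sum(not satisfied(c) for c in clauses)
        assign[var] = not assign[var]
        return broken

    flips = 0
    while True:
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return flips
        if flips == max_flips:
            return None
        clause = rng.choice(unsat)
        if rng.random() < noise:
            var = abs(rng.choice(clause))                   # random-walk move
        else:
            var = min((abs(lit) for lit in clause), key=cost)  # greedy move
        assign[var] = not assign[var]
        flips += 1

# Hypothetical satisfiable formula; one run per seed, 33 seeds as on the slide.
clauses = [[1, -2, 3], [-1, 2, 4], [2, -3, -4], [-1, -2, 3]]
counts = [walksat_flips(clauses, 4, seed) for seed in range(33)]
print(counts)
```

Because each run is fully determined by its seed, repeating a seed on the same instance reproduces the flip count exactly; variation comes either from changing the seed or from permuting the instance.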
15
Things get strange when the solver is not
completely stochastic (e.g. BB, branch and bound, with stochastic
search)
Bimodal: stochastic search either finds the optimum
at the root or at the first branch. The lower bound (LB) finds
the optimum at the root. There is no randomness in the LB method.
16
The lower bound method is extremely sensitive to
input ordering
Heavy tail (and one instance times out): the lower
bound method finds the optimum at the root or in an early
branch only if the input order is friendly.
17
Another Data Analysis Application: Instance
Profiling