Title: Actively Learning Level-Sets of Composite Functions
1 Actively Learning Level-Sets of Composite Functions
- Brent Bryan, Jeff Schneider
2 Motivation: Statistical Analysis
Models may be expensive to compute!
CMB Model
WMAP Data (Astier et al. 2006)
τ, Ω_M, Ω_Λ, ω_B, ω_DM, n_s, f_ν, b
Supernova Model
Supernova Data (Davis et al. 2007)
LSS Model
LSS Data (Tegmark et al. 2006)
Goal: Minimal jointly valid 1-α confidence regions for parameters
3 Motivation: Statistical Analysis
- Many ways to combine p-values
- Bonferroni's method, inverse normal, inverse logit
- Fisher's method (Fisher 1932)
- Compare the statistic -2 Σ_{i=1..m} ln p_i with C, where C is the critical value of a χ²_{2m} distribution
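As a rough illustration of Fisher's method named on this slide, the sketch below combines m independent p-values through the statistic -2 Σ ln p_i and converts it to a combined p-value. The function names are hypothetical and not from the talk. The closed-form tail probability used here holds for chi-square distributions with an even number of degrees of freedom (2m), which is why no statistics library is needed. Inputs are assumed to be strictly positive, independent p-values.

```python
import math

def fisher_statistic(p_values):
    """Fisher's combined statistic: -2 * sum(ln p_i). Requires every p_i > 0."""
    return -2.0 * sum(math.log(p) for p in p_values)

def chi2_sf_even(x, m):
    """P(X > x) for X ~ chi-square with 2*m degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    half = x / 2.0
    term, total = 1.0, 0.0
    for k in range(m):
        total += term            # adds (x/2)^k / k!
        term *= half / (k + 1)   # advances to the next term of the series
    return math.exp(-half) * total

def fisher_combined_p(p_values):
    """Combined p-value of m independent tests under Fisher's method."""
    return chi2_sf_even(fisher_statistic(p_values), len(p_values))

# Illustrative call with made-up p-values for three hypothetical tests.
combined = fisher_combined_p([0.04, 0.20, 0.30])
```

Because the statistic is a sum of one term per test, each term -2 ln p_i can be treated as its own function of the parameters, which is the structure the rest of the talk exploits.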
4 The Level-Set Problem
Where does f(x) = 11?
5 The Level-Set Problem
Where does f(x) = 11?
Could pick points randomly or uniformly
[Figure labels: true boundary, predicted boundary, threshold t]
6 The Level-Set Problem
Where does f(x) = 11?
Straddle heuristic works best (Bryan et al. 2005)
- Could try entropy, variance, misclassification probability, etc.
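A minimal sketch of the straddle idea, assuming the form given in Bryan et al. (2005): score each candidate by 1.96 times its predicted standard deviation minus the distance of its predicted mean from the threshold t, then sample the highest-scoring candidate. The function names and the candidate values below are invented for illustration. The predictive mean and standard deviation are assumed to come from some surrogate model that is not shown.

```python
def straddle_score(mean, std, threshold):
    """High when the prediction is uncertain AND close to the threshold."""
    return 1.96 * std - abs(mean - threshold)

def pick_next(candidates, threshold):
    """candidates: list of (x, predicted_mean, predicted_std) tuples.

    Returns the x whose straddle score is largest.
    """
    best = max(candidates, key=lambda c: straddle_score(c[1], c[2], threshold))
    return best[0]

# Made-up predictions at three candidate points, threshold t = 11.
candidates = [
    (0.1, 5.0, 0.10),   # far from the threshold, well known
    (0.5, 10.8, 1.00),  # near the threshold, uncertain
    (0.9, 11.0, 0.05),  # on the threshold, already well known
]
next_x = pick_next(candidates, threshold=11.0)
```

The score is positive only where the 95% predictive interval straddles the threshold, so the rule favors points that are both uncertain and near the boundary over points that are merely one or the other.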
7 The Level-Set Problem
Mix variance and entropy.
8 The Level-Set Problem
But what if we use more information?
Mix variance and entropy.
9 The Level-Set Problem
But what if we use more information?
10 The Level-Set Problem
But what if we use more information?
[Figure labels: f, f1, f2, f3, threshold t]
But we want to minimize samples
11 The Level-Set Problem
Is f(x) > t?
How can we take advantage of this intuition?
[Figure labels: f, f1, f2, f3, threshold t]
This sample gives full information!
But we want to minimize samples
12 Level-Set Problem Summary
- Multiple Function Case
- Only sample one f_i
- Don't expect to reduce the variance by [expression missing] but [expression missing]
- A better estimate of the knowledge gained is [expression missing]
- Single Function Case
- Use straddle heuristic to balance exploration and exploitation
- Mimics information gain
13 Algorithm Outline
- Possible Heuristics
- random
- variance
- combined-straddle
- Var-MaxVarStraddle
[Flowchart labels: parameter space; generate candidates; choose x, f_i; compute f_i(x); one for each f_i]
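The loop suggested by this outline (generate candidates, choose a point x and one function f_i, compute f_i(x), repeat) might be sketched as follows. Everything specific here is an assumption made for illustration: the target is taken to be the sum of the f_i, the surrogate is a crude nearest-sample stand-in rather than a Gaussian process, the per-function uncertainties are treated as independent, and the selection rule (straddle on the sum, then the f_i with the largest predicted uncertainty) is only a guess at the flavor of the heuristics listed above, not their definition.

```python
import math
import random

def predict(samples, x, prior_std=1.0, length=0.2):
    """Stand-in surrogate (NOT a Gaussian process): returns the value of the
    nearest sample as the mean, and an uncertainty that grows with the
    distance to that sample."""
    if not samples:
        return 0.0, prior_std
    xs, ys = min(samples, key=lambda s: abs(s[0] - x))
    return ys, prior_std * (1.0 - math.exp(-abs(xs - x) / length))

def choose(candidates, data, threshold):
    """Hypothetical rule: pick x by a straddle score on the summed prediction,
    then pick the f_i whose own prediction at x is most uncertain."""
    best = None
    for x in candidates:
        preds = [predict(d, x) for d in data]
        mean = sum(m for m, _ in preds)
        std = math.sqrt(sum(s * s for _, s in preds))  # assumes independence
        score = 1.96 * std - abs(mean - threshold)
        if best is None or score > best[0]:
            i = max(range(len(preds)), key=lambda k: preds[k][1])
            best = (score, x, i)
    return best[1], best[2]

def active_learn(functions, threshold, budget, n_candidates=50, seed=0):
    """Run the loop on the unit interval; returns one sample list per f_i."""
    rng = random.Random(seed)
    data = [[] for _ in functions]
    for _ in range(budget):
        candidates = [rng.uniform(0.0, 1.0) for _ in range(n_candidates)]
        x, i = choose(candidates, data, threshold)
        data[i].append((x, functions[i](x)))  # evaluate only the chosen f_i
    return data

# Toy usage: f(x) = f1(x) + f2(x) with two cheap, made-up component functions.
data = active_learn([lambda x: x, lambda x: x * x], threshold=1.0, budget=20)
```

Only one component function is evaluated per iteration, which is the point of the multiple-function case: the sample budget counts individual f_i evaluations, not evaluations of the full composite.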
14 2D Example
Use colors to denote samples
15 Possible Sampling Heuristics
Red lines: predicted level-set
Blue lines: true level-set
16 Experimental Results
Target function is the composite of 2 observable functions
Target function is the composite of 4 observable functions
17 Application: Cosmology
Models may be expensive to compute!
CMB Model
WMAP Data (Astier et al. 2006)
τ, Ω_M, Ω_Λ, ω_B, ω_DM, n_s, f_ν, b
Supernova Model
Statistical Test p-value
hypothetical x
Supernova Data (Davis et al. 2007)
LSS Model
LSS Data (Tegmark et al. 2006)
Goal: Minimal jointly valid 1-α confidence regions for parameters
18 Application: Cosmology
- Supernova
- x = (H0, Ω_M, Ω_Λ)
- x1 = (65, 0.23, ?)
- ∃ Ω_Λ s.t. p(x) ≥ α
Create plot by gridding samples
Can't test all points!
95% χ² confidence regions from supernova based on Davis et al. (2007) data
19 Application: Cosmology
Conservative estimate
For each square in X: ∃ x s.t. x in square and p(x) ≥ α?
Square included if any cell has x such that p(x) ≥ α
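The conservative gridding rule on this slide can be sketched as below, assuming samples are stored as (parameter tuple, p-value) pairs and the plot projects onto two of the parameters. The function name, argument layout, and the sample values are all hypothetical. A square of the 2D plot is marked as included as soon as any sample that projects into it has p(x) ≥ α, whatever its remaining coordinates.

```python
def conservative_region(samples, alpha, axes, lo, hi, n_bins):
    """Mark each 2D plot square that contains at least one accepted sample.

    samples: iterable of (x, p), x a tuple of parameter values, p its p-value
    axes:    the two parameter indices shown on the plot
    lo, hi:  plot limits for those two axes
    Returns an n_bins x n_bins grid of booleans.
    """
    included = [[False] * n_bins for _ in range(n_bins)]
    for x, p in samples:
        if p < alpha:
            continue  # rejected point: it cannot add a square
        cell = []
        for axis, a, b in zip(axes, lo, hi):
            if not a <= x[axis] <= b:
                break  # outside the plotted range
            k = int((x[axis] - a) / (b - a) * n_bins)
            cell.append(min(k, n_bins - 1))  # upper edge goes in the last bin
        else:
            included[cell[0]][cell[1]] = True
    return included

# Made-up samples in (H0, Omega_M, Omega_Lambda) with made-up p-values.
samples = [
    ((65.0, 0.23, 0.7), 0.10),  # accepted
    ((65.0, 0.23, 0.2), 0.01),  # same square, rejected
    ((80.0, 0.90, 0.5), 0.01),  # rejected
]
grid = conservative_region(samples, alpha=0.05, axes=(0, 1),
                           lo=(50.0, 0.0), hi=(100.0, 1.0), n_bins=2)
```

The estimate is conservative in the sense that a single accepted point is enough to keep a square, so the plotted region can only overstate, never understate, the set of squares supported by the samples taken.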
20 Application: Cosmology
CMB
Supernova
Large Scale Structure
Combined
1.2 billion samples on uniform grid
3 million samples using Var-MaxVar Straddle
21 Conclusions
- Extended Straddle algorithm to multiple datasets
- Showed that combining p-values can be written as the sum of observable functions
- Deriving confidence regions this way
- Results in smaller regions (than intersection of marginals)
- Is much more sample efficient than uniform sampling