Title: Near-Optimal Sensor Placements in Gaussian Processes
Slide 1: Near-Optimal Sensor Placements in Gaussian Processes
Carlos Guestrin, Andreas Krause, Ajit Singh
Carnegie Mellon University
Slide 2: Sensor placement applications
- Monitoring of spatial phenomena
  - Temperature
  - Precipitation
  - Drilling oil wells
  - ...
- Active learning, experimental design, ...
- The results presented here are not limited to two dimensions
(Figure: temperature data from a sensor network)
Slide 3: Deploying sensors
A chicken-and-egg problem:
- We don't know where to place the sensors
- We have no data or assumptions about the distribution
- But what are the optimal placements? Answering this means solving a combinatorial (non-myopic) optimization
The problem has been considered in computer science (cf. Hochbaum & Maass '85) and in spatial statistics (cf. Cressie '91).
Slide 4: Strong assumption: sensing radius
- Each node predicts the values of positions within some radius
- Under this assumption the problem is NP-complete, but there are good algorithms (a PTAS) with approximation guarantees (Hochbaum & Maass '85)
- Unfortunately, this approach is usually not useful: the assumption is wrong on real data! For example...
Slide 5: Spatial correlation
Slide 6: Complex, noisy correlations
- Complex, uneven sensing region
- Actually, there are noisy correlations rather than a crisp sensing region
Slide 7: Combining multiple sources of information
(Figure annotation: Temp. here?)
- Individually, sensors are bad predictors
- Combined, the information is more reliable
- How do we combine information?
- This is the focus of spatial statistics
Slide 8: Gaussian process (GP) - Intuition
A GP is non-parametric, represents uncertainty, and allows complex correlation functions (kernels).
(Plot axes: y - temperature, x - position)
Slide 9: Gaussian processes
Prediction after observing a set of sensors A:
- Posterior mean temperature
- Posterior variance
- Kernel function
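For reference, these quantities are the standard GP posterior: after observing values $x_\mathcal{A}$ at a set of sensors $\mathcal{A}$, the prediction at a location $y$ is Gaussian with

$$\mu_{y \mid \mathcal{A}} = \mu_y + \Sigma_{y\mathcal{A}}\, \Sigma_{\mathcal{A}\mathcal{A}}^{-1}\, (x_\mathcal{A} - \mu_\mathcal{A}),
\qquad
\sigma^2_{y \mid \mathcal{A}} = \mathcal{K}(y, y) - \Sigma_{y\mathcal{A}}\, \Sigma_{\mathcal{A}\mathcal{A}}^{-1}\, \Sigma_{\mathcal{A}y},$$

where $\mathcal{K}$ is the kernel (covariance) function and $\Sigma$ denotes its evaluations between the indicated sets of locations.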
Slide 10: Gaussian processes for sensor placement
(Plots: posterior mean temperature, posterior variance)
Goal: find the sensor placement with the least uncertainty after observations.
The problem is still NP-complete, so we need an approximation.
Slide 11: Non-myopic placements
- Consider myopically selecting the most uncertain location at each step
- This can be seen as an attempt to non-myopically maximize
  $H(A_1) + H(A_2 \mid A_1) + \dots + H(A_k \mid A_1, \dots, A_{k-1}) = H(A_1, \dots, A_k)$
Slide 12: Entropy criterion (cf. Cressie '91)
- $A \leftarrow \emptyset$
- For $i = 1$ to $k$:
  - Add location $X_i$ to $A$ such that $X_i = \operatorname{argmax}_{X \in V \setminus A} H(X \mid A)$
(Plot: uncertainty (entropy); entropy places sensors along the borders)
- The entropy criterion wastes information (O'Hagan '78): it is indirect and doesn't consider the sensing region
- No formal non-myopic guarantees
Slide 13: Proposed objective function: mutual information
- Locations of interest $V$
- Find locations $A \subseteq V$ maximizing the mutual information $I(A; V \setminus A)$
- Intuitive greedy rule: pick the location $X$ maximizing $H(X \mid A) - H(X \mid V \setminus (A \cup \{X\}))$
- Intuitive criterion: locations that are both different (from what we already sense) and informative (about the rest)
- We give formal non-myopic guarantees
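A minimal sketch of this greedy rule, assuming a precomputed covariance matrix K over all N candidate locations; the dense-matrix implementation and all names are illustrative, not from the paper. The same loop with only the numerator of the score corresponds to the entropy criterion of slide 12.

```python
import numpy as np

def conditional_variance(K, y, cond):
    """Var(X_y | X_cond) under the GP with covariance matrix K."""
    cond = list(cond)
    if not cond:
        return K[y, y]
    K_cc = K[np.ix_(cond, cond)]
    K_yc = K[y, cond]
    return K[y, y] - K_yc @ np.linalg.solve(K_cc, K_yc)

def greedy_mi_placement(K, k):
    """Greedily pick k locations, approximately maximizing I(A; V \\ A)."""
    n = K.shape[0]
    A = []
    for _ in range(k):
        best, best_ratio = None, -np.inf
        for y in range(n):
            if y in A:
                continue
            rest = [v for v in range(n) if v != y and v not in A]
            # Greedy MI gain is H(y | A) - H(y | V \ (A ∪ {y})); for Gaussians this is
            # 0.5 * log of the ratio of conditional variances, so maximizing the ratio suffices.
            # (The entropy criterion would instead just maximize the numerator.)
            ratio = conditional_variance(K, y, A) / conditional_variance(K, y, rest)
            if ratio > best_ratio:
                best, best_ratio = y, ratio
        A.append(best)
    return A
```

Usage is simply `A = greedy_mi_placement(K, k=22)` for a covariance matrix K estimated from pilot data.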
Slide 14: An important observation
- Selecting T1 tells us something about T2 and T5
- Selecting T3 tells us something about T2 and T4
- Now adding T2 would not help much
- In many cases, new information is worth less if we already know more (diminishing returns)!
(Figure: sensor locations T1-T5)
Slide 15: Submodular set functions
- Submodular set functions are a natural formalism for this idea of diminishing returns:
  $f(A \cup \{X\}) - f(A) \;\geq\; f(B \cup \{X\}) - f(B)$ for $A \subseteq B$
- Maximization of submodular functions is NP-hard
- But...
(Figure: Venn diagram of sets A ⊆ B and a new element X)
Slide 16: How can we leverage submodularity?
- Theorem (Nemhauser et al. '78): The greedy algorithm guarantees a $(1 - 1/e)\,\mathrm{OPT}$ approximation for monotone submodular functions, i.e.
  $F(A_{\text{greedy}}) \;\geq\; (1 - 1/e) \max_{|A| \leq k} F(A)$
Slide 18: Mutual information and submodularity
- Mutual information is submodular: $F(A) = I(A; V \setminus A)$
- So we should be able to use Nemhauser et al.
- But mutual information is not monotone!
- Initially, adding a sensor increases MI; later, adding sensors decreases MI:
  $F(\emptyset) = I(\emptyset; V) = 0$, $\quad F(V) = I(V; \emptyset) = 0$, $\quad F(A) \geq 0$
- Even though MI is submodular, we can't apply Nemhauser et al. Or can we?
Slide 19: Approximate monotonicity of mutual information
- If $H(X \mid A) - H(X \mid V \setminus A) \geq 0$, then MI is monotonic
- Here $H(X \mid A) \ll H(X \mid V \setminus A)$, so MI is not monotonic
- Solution: add a grid $Z$ of unobservable locations
- If $H(X \mid A) - H(X \mid Z \cup V \setminus A) \geq 0$, then MI is monotonic
- For sufficiently fine $Z$: $H(X \mid A) > H(X \mid Z \cup V \setminus A) - \varepsilon$
  $\Rightarrow$ MI is approximately monotonic
Slide 20: Theorem: mutual information sensor placement
- The greedy MI algorithm provides a constant-factor approximation for placing $k$ sensors, for all $\varepsilon > 0$
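Written out, the guarantee stated here takes the following form (as in the accompanying paper, with $\varepsilon$ the monotonicity slack obtained from a sufficiently fine discretization):

$$MI(A_{\text{greedy}}) \;\geq\; (1 - 1/e)\,\bigl(\mathrm{OPT} - k\,\varepsilon\bigr) \qquad \text{for any } \varepsilon > 0.$$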
Slide 21: Different costs for different placements
- Theorem 1: constant-factor approximation of the optimal locations when selecting $k$ sensors
- Theorem 2 (cost-sensitive placements):
  - In practice, different locations may have different costs (corridor versus inside a wall)
  - We have a budget $B$ to spend on placing sensors
  - Constant-factor approximation with the same constant $(1 - 1/e)$
  - Slightly more complicated than the greedy algorithm (Sviridenko; Krause & Guestrin)
Slide 22: Deployment results
- Model learned from 54 sensors
- Used the initial deployment to select 22 new sensors
- Learned a new GP on test data using just these sensors
- Mutual information gives 3 times less variance than the entropy criterion
(Figures: true temperature prediction, true temperature variance)
Slide 23: Comparing to other heuristics
- Greedy: the algorithm we analyze
- Random placements
- Pairwise exchange (PE):
  - Start with some placement
  - Swap locations while the solution improves
- Our bound enables a posteriori analysis for any heuristic:
  - Assume an algorithm TUAFSPGP gives results that are 10% better than those obtained by the greedy algorithm
  - Then we immediately know TUAFSPGP is within 70% of the optimum!
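Spelling out the arithmetic behind the 70% figure: the greedy guarantee gives $\mathrm{OPT} \leq F(A_{\text{greedy}})/(1 - 1/e)$, so

$$\frac{F(A_{\text{TUAFSPGP}})}{\mathrm{OPT}} \;\geq\; \frac{1.1\, F(A_{\text{greedy}})}{F(A_{\text{greedy}})/(1 - 1/e)} \;=\; 1.1\,(1 - 1/e) \;\approx\; 0.70.$$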
(Plot: mutual information of the placements; higher is better)
Slide 24: Precipitation data
(Plots comparing the entropy criterion and mutual information placements; higher is better)
Slide 25: Computing the greedy rule
- At each iteration, for each candidate position $i \in \{1, \dots, N\}$, we must compute the greedy score $H(X_i \mid A) - H(X_i \mid V \setminus (A \cup \{X_i\}))$
- This requires inversion of an $N \times N$ matrix: about $O(N^3)$
- Total running time for $k$ sensors: $O(kN^4)$
- Polynomial! But very slow in practice
- Solution: exploit sparsity in the kernel matrix
Slide 26: Local kernels
- The covariance matrix may have many zeros!
  - Each sensor location is correlated with only a small number of other locations
- Exploiting locality:
  - If each location is correlated with at most $d$ others,
  - a sparse representation and a priority-queue trick (see the sketch below)
  - reduce the complexity from $O(kN^4)$ to only about $O(N \log N)$
- Usually, though, the matrix is only almost sparse
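One plausible reading of the "sparse representation plus priority queue" idea, sketched below. The key locality assumption is that adding a sensor only changes the greedy scores of the locations it is correlated with, so only those heap entries need refreshing. The names `neighbors` and `gain` are assumptions of this sketch, not identifiers from the paper.

```python
import heapq

def greedy_with_local_kernel(n, neighbors, k, gain):
    """Greedy selection over n locations using a max-heap of scores.

    neighbors[y] : locations correlated with y (nonzero kernel entries)
    gain(y, A)   : greedy score of adding y given the current placement A
    """
    A = []
    version = {y: 0 for y in range(n)}           # tracks which heap entry is current
    heap = [(-gain(y, A), y, 0) for y in range(n)]
    heapq.heapify(heap)
    while len(A) < k and heap:
        _, y, ver = heapq.heappop(heap)
        if y in A or ver != version[y]:
            continue                              # stale entry; a fresher score exists
        A.append(y)
        for z in neighbors[y]:                    # only neighbors' scores can have changed
            if z not in A:
                version[z] += 1
                heapq.heappush(heap, (-gain(z, A), z, version[z]))
    return A
```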
Slide 27: Approximately local kernels
- The covariance matrix may have many elements close to zero (e.g., with a Gaussian kernel), but the matrix is not sparse
- What if we set them to zero?
  - Sparse matrix
  - Approximate solution
- Theorem: truncating small entries has only a small effect on solution quality
  - If $\mathcal{K}(x, y) \leq \varepsilon$, set it to 0
  - Then the quality of the placements is only $O(\varepsilon)$ worse
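A minimal sketch of this truncation step, assuming a dense covariance matrix built from a Gaussian (squared-exponential) kernel; the scipy sparse format and the lengthscale value are illustrative choices, not mandated by the slides.

```python
import numpy as np
from scipy import sparse

def truncate_kernel(K, eps):
    """Zero out near-zero covariance entries and store the result sparsely.

    Per the slide's theorem, zeroing entries with |K(x, y)| <= eps degrades
    the quality of the resulting placements by only O(eps)."""
    K_trunc = K.copy()
    K_trunc[np.abs(K_trunc) <= eps] = 0.0
    return sparse.csr_matrix(K_trunc)

# Example: Gaussian kernel on a 1-D grid of candidate locations (illustrative).
locations = np.linspace(0.0, 10.0, 200)
dists = np.abs(locations[:, None] - locations[None, :])
K = np.exp(-(dists ** 2) / (2.0 * 1.0 ** 2))     # lengthscale 1.0
K_sparse = truncate_kernel(K, eps=1e-3)
print(f"nonzeros: {K_sparse.nnz} of {K.size}")
```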
Slide 28: Effect of truncated kernels on the solution (rain data)
(Plots: solution quality and improvement in running time; higher is better)
Slide 29: Summary
- Mutual information criterion for sensor placement in general GPs
- Efficient algorithms with strong approximation guarantees: $(1 - 1/e)\,\mathrm{OPT} - \varepsilon$
- Exploiting local structure improves efficiency
- Superior prediction accuracy for several real-world problems
- Related ideas in discrete settings presented at UAI and IJCAI this year
An effective algorithm for sensor placement and experimental design, and a basis for active learning.
Slide 30: A note on maximizing entropy
- Entropy is submodular (Ko et al. '95), but...
- A function $F$ is monotonic iff adding $X$ cannot hurt: $F(A \cup \{X\}) \geq F(A)$
- Remark: entropy in GPs is not monotonic (not even approximately)
  - $H(A \cup \{X\}) = H(A) + H(X \mid A)$
  - As the discretization becomes finer, $H(X \mid A) \to -\infty$
- So the Nemhauser et al. analysis for submodular functions is not directly applicable to entropy
Slide 31: How do we predict temperatures at unsensed locations?
Interpolation?
(Plot axes: temperature vs. position)
Slide 32: How do we predict temperatures at unsensed locations?
Regression:
- Few parameters, less overfitting
- But how sure are we about the prediction?
- The regression function has no notion of uncertainty!
(Plot axes: y - temperature, x - position)