Title: Applications of Active Learning in Sensor Networks
1. Applications of Active Learning in Sensor Networks
- 11-785 Active Learning Seminar
- November 13, 2008
2. Road Map
- Paper 1: Active Learning Driven Data Acquisition for Sensor Networks (Muttreja, Raghunathan, Ravi, and Jha, ISCC 2006)
  - A sampling strategy for sensor networks based on a predictive modeling technique.
- Paper 2: Cost-effective Outbreak Detection in Networks (Leskovec, Krause, Guestrin, Faloutsos, VanBriesen, Glance, KDD 2007)
  - An optimal subset selection algorithm applied to sensor placement.
3. Active Learning Driven Data Acquisition for Sensor Networks
- (Muttreja, Raghunathan, Ravi, and Jha, ISCC 2006)
4. What is this paper about?
- Task: Given a sensor network monitoring some physical phenomenon (temperature, air pressure, precipitation, etc.), decide when to query which sensor.
  - This problem is called data acquisition policy design.
- Approach: Construct a probabilistic model over the network and have it guide the sampling decisions.
5. Why is this relevant to Active Learning?
- Think of the unqueried sensor nodes as a pool of data points whose true answer (label) can be obtained at some cost; the task is then an analogue of designing a sampling policy in Active Learning.
- In fact, many model-driven approaches have been studied from an active learning perspective:
  - Model-Driven Data Acquisition in Sensor Networks (Deshpande, Guestrin, et al., 2004)
  - Using Probabilistic Models for Data Management in Acquisitional Environments (Deshpande, Guestrin, Madden, 2005)
6. Sensor Networks
- A wireless network of spatially distributed autonomous devices that use sensors to monitor physical or environmental conditions such as temperature, sound, vibration, pressure, etc.
- Severe energy constraints.
- Hence the data acquisition policy is critical.
7. Key assumptions
- The data field is spatio-temporally correlated, i.e., the measurement at one sensor node is correlated with those of its neighboring nodes and with measurements in the near past/future.
- A certain level of distortion is acceptable, i.e., users are often interested in only an approximation of the data field.
8. Model-driven data collection
- Given those assumptions, how do you design a data acquisition policy?
- The solution presented here is called predictive modeling.
- In a nutshell: maintain a model of the data field and have it decide which node to query at each cycle.
9. Model-driven data collection
- Procedure (a code sketch follows below):
  1. Build a probabilistic model of the data field based on the currently known readings.
  2. At each cycle T:
     - At each node, compute the predicted measurement and its confidence level.
     - Sample a node if the confidence dips below some threshold.
     - Recalibrate the model with the new data.
     - Repeat until all nodes meet the threshold.
  3. Repeat step 2 at every cycle.
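To make the procedure concrete, here is a minimal sketch of one cycle, assuming scikit-learn's GaussianProcessRegressor as the model and a hypothetical query_fn that returns a node's true reading at some energy cost. It illustrates the loop above, not the authors' implementation; the node representation, threshold value, and stopping rule are placeholder assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def acquisition_cycle(gp, sampled_X, sampled_y, candidate_X, query_fn,
                      std_threshold=0.1):
    """One cycle: query nodes until every predictive std dev meets the threshold."""
    # e.g. gp = GaussianProcessRegressor()  # covariance choice is discussed on later slides
    gp.fit(sampled_X, sampled_y)                      # (re)calibrate on known readings
    while True:
        _, std = gp.predict(candidate_X, return_std=True)
        worst = int(np.argmax(std))                   # least-confident node
        if std[worst] <= std_threshold:
            break                                     # all nodes predicted confidently enough
        new_y = query_fn(candidate_X[worst])          # actually sample the sensor (costs energy)
        sampled_X = np.vstack([sampled_X, candidate_X[worst:worst + 1]])
        sampled_y = np.append(sampled_y, new_y)
        gp.fit(sampled_X, sampled_y)                  # recalibrate with the new data
    return gp, sampled_X, sampled_y
```

Repeating this function at every cycle corresponds to step 3 of the procedure.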
10. Model-driven data collection
- Which model?
- Some criteria:
  - As general as possible.
  - Outputs probabilistic estimates.
  - Incrementally trainable.
  - Provides an efficient way to assess the confidence of a candidate data point without actually measuring the reading at that point.
- Gaussian Processes (GPs) are a popular choice.
  - http://www.gaussianprocess.org/
11. Gaussian Process (GP) for sensor network modeling
- A Gaussian Process is a nonparametric version of the Gaussian distribution, extending the multivariate Gaussian to infinite dimensionality.
- A sensor network maps naturally onto a Gaussian Process:
  - each node at each timeslot is a Gaussian r.v.,
  - the true measurement is its mean,
  - the distortion is its variance.
12. Gaussian Process for sensor network modeling
- If the data field is a Gaussian Process,
- then the collection of random variables $\mathbf{y} = (y_1, y_2, \ldots, y_i)$ (i.e., the readings from the sensors) follows a joint Gaussian distribution given the set of sensor locations.
- Index each node by a tuple $x_i = \langle \text{node}, \text{time} \rangle$; then for any set of such nodes $\{x_1, x_2, \ldots\}$, $\mathbf{y}$ is a multivariate Gaussian r.v.:
  $\mathbf{y} \sim \mathcal{N}(\boldsymbol{\mu}, C)$,
  where $C$ is the covariance matrix set by the covariance function $C(x, x')$ and $\boldsymbol{\mu}$ is given by the mean function.
13. Gaussian Process for sensor network modeling
- Given N training points, the model infers the random variable $y$ at a new node location (derivation in the tutorial from the website; the predictive equations are written out below).
- $C_N$ is the covariance matrix of the N training data points, computed via the covariance function $C$, and $\mathbf{k}$ is the vector of covariances between the new node and all nodes in the training set.
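The equations the slide refers to are the standard GP regression predictive equations (as derived, e.g., in MacKay's tutorial). In the notation above, with $\mathbf{y}$ collecting the N training readings and $\kappa = C(x_{N+1}, x_{N+1})$ the prior variance at the new node:

$$\hat{y}_{N+1} = \mathbf{k}^{\top} C_N^{-1} \mathbf{y}, \qquad \sigma^2_{\hat{y}_{N+1}} = \kappa - \mathbf{k}^{\top} C_N^{-1} \mathbf{k}.$$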
14. Confident measurement
- Given the GP,
- and one more leap of mind (maximizing confidence is equivalent to minimizing variance),
- the task seems really simple.
- The simplest greedy heuristic: compute the predictive variance at all nodes and choose the one with the largest variance to query next (often used in practice; written out below).
- It may not be globally optimal.
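Written out, the greedy rule above is simply

$$x_{\text{next}} = \arg\max_{x \in \mathcal{X}_{\text{pool}}} \; \sigma^2(x \mid \mathcal{D}),$$

where $\mathcal{X}_{\text{pool}}$ denotes the unsampled nodes and $\mathcal{D}$ the readings collected so far: query the node whose predictive variance under the current GP posterior is largest.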
15. Problems with the model (1)
- Greedy search may not be optimal.
- The authors propose a heuristic called the distribution of interest: basically, add weights to the sensors that have historically recorded unpredictable changes.
- Define a relative error and a distribution of interest.
- Weight the confidence score by the interest (an illustrative form is sketched below).
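The slide's exact definitions are not reproduced in this text; purely as an assumed, illustrative form consistent with the description above, the relative error at node $i$ could be $e_i = |\hat{y}_i - y_i| / |y_i|$, the distribution of interest proportional to the historical error, $p_i \propto \sum_t e_i^{(t)}$, and the selection score the predictive variance weighted by interest, $\mathrm{score}(x_i) = p_i \, \sigma^2(x_i)$. Consult the paper for the authors' actual definitions.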
16. Problems with the model (2)
- Gaussian Processes are not incrementally trainable:
  - each time the data set changes, the covariance matrix must be recomputed and inverted.
- Proposed solution: sparse approximation for GP regression.
  - Idea: find a set of basis vectors that spans the covariance matrix.
- The paper does not provide details on this technique.
17. Problems with the model (2)
- Express the matrix in terms of the weights and the basis vectors (a compact representation).
- Once the bases are found, the computation is almost the same (an illustrative sketch follows).
- $C_M$ is the covariance matrix of the basis vectors, $W$ is a vector of weights estimated during training, and $\mathbf{k}$ is the vector of covariances between the new node and the basis vectors.
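The paper leaves the sparse approximation unspecified (it points to Csató's work). Purely to illustrate the general idea of predicting through a small set of basis vectors, here is a minimal subset-of-regressors style sketch; this is one common sparse GP variant, not necessarily the authors' method, and the RBF kernel, noise level, and basis choice are placeholder assumptions.

```python
import numpy as np

def rbf(A, B, length_scale=1.0):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def sparse_gp_predict(X_train, y_train, X_basis, X_new, noise=1e-2):
    """Predict using M basis vectors instead of all N training points (M << N)."""
    K_mm = rbf(X_basis, X_basis)          # C_M: covariance among basis vectors
    K_mn = rbf(X_basis, X_train)          # basis vectors vs. training nodes
    k_new = rbf(X_basis, X_new)           # k: basis vectors vs. new node(s)
    # The weight vector W solves an M x M system instead of an N x N one.
    A = noise * K_mm + K_mn @ K_mn.T
    W = np.linalg.solve(A, K_mn @ y_train)
    mean = k_new.T @ W
    var = noise * np.einsum('mi,mi->i', k_new, np.linalg.solve(A, k_new))
    return mean, var
```

In this compact representation the prediction at a new node depends only on $\mathbf{k}$, $C_M$, and the learned weights $W$, matching the description above.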
18. Experiments
- Experiments on simulated data.
- Two sets of experiments: centralized data aggregation and clustered data aggregation (they differ in energy efficiency).
- Run over 100 time units; the model's history length is 5.
- Two levels of confidence threshold: 0.1 and 0.2.
- A fixed covariance function was used (an example kernel form is shown below).
- Note: w1 and w2 were NOT learned during the experiment but were tuned during the development phase. Others suggest learning the covariance parameters as well; see David MacKay's tutorial from http://www.gaussianprocess.org/.
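The paper's fixed covariance function is not reproduced in this text. Purely as an example of what a covariance function with two tunable parameters looks like (an assumption, not the paper's form), a squared-exponential kernel over the index $x = \langle \text{node}, \text{time} \rangle$ might be

$$C(x, x') = w_1 \exp\!\left( -\frac{\lVert x - x' \rVert^2}{2\, w_2^{2}} \right),$$

with $w_1$ acting as a signal variance and $w_2$ as a correlation length scale; these are the kind of covariance parameters that MacKay's tutorial discusses learning from data.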
19. Experiments
- Root mean square error, averaged over all cycles.
- The baseline method queries all nodes at each cycle (hence zero error).
- The proposed model demonstrates significant energy savings.
20. Summary of the paper
- The Gaussian Process is a nice framework to use for node selection in sensor networks.
- Is there a more principled way to search for the optimal subset? (The proposed solution looks a bit ad hoc.)
- How to choose the covariance function is not clear.
- How does the accuracy compare to competing frameworks?
21. Summary of the paper
- Many details of the Gaussian Process are missing from this paper; see the following for a more thorough discussion.
- GP regression:
  - Gaussian Processes tutorial (MacKay, 1998)
- Sparse approximation:
  - Gaussian Processes: Iterative Sparse Approximations (Csató, 2002)
- Sampling strategies using GPs:
  - Model-Driven Data Acquisition in Sensor Networks (Deshpande, Guestrin, et al., 2004)
  - Using Probabilistic Models for Data Management in Acquisitional Environments (Deshpande, Guestrin, Madden, 2005)