Title: Applications of Active Learning in Sensor Networks
1. Applications of Active Learning in Sensor Networks
- 11-785 Active Learning Seminar
- November 13, 2008
2. Road Map
- Paper 1: Active Learning Driven Data Acquisition for Sensor Networks (Muttreja, Raghunathan, Ravi, and Jha, ISCC 2006)
  - A sampling strategy for sensor networks based on a predictive modeling technique.
- Paper 2: Cost-effective Outbreak Detection in Networks (Leskovec, Krause, Guestrin, Faloutsos, VanBriesen, Glance, KDD 2007)
  - An optimal subset selection algorithm applied to sensor placement.
3. Active Learning Driven Data Acquisition for Sensor Networks
- (Muttreja, Raghunathan, Ravi, and Jha, ISCC 2006)
4. What is this paper about?
- Task: Given a sensor network monitoring some physical phenomenon (temperature, air pressure, precipitation, etc.), decide when to query which sensor.
  - This problem is called data acquisition policy design.
- Approach: Construct a probabilistic model over the network and have it guide the sampling decisions.
5. Why is this relevant to Active Learning?
- Think of the unqueried sensor nodes as a pool of data points whose true answer (label) can be obtained at some cost; the task is then an analogue of designing a sampling policy in Active Learning.
- In fact, many model-driven approaches have been studied from an active learning perspective:
  - Model-Driven Data Acquisition in Sensor Networks (Deshpande, Guestrin, et al., 2004)
  - Using Probabilistic Models for Data Management in Acquisitional Environments (Deshpande, Guestrin, Madden, 2005)
6. Sensor Networks
- A wireless network of spatially distributed autonomous devices that use sensors to monitor physical or environmental conditions such as temperature, sound, vibration, pressure, etc.
- Severe energy constraints.
- Hence the data acquisition policy is critical.
7. Key assumptions
- The data field is spatio-temporally correlated, i.e., the measurement at one sensor node is correlated with those of its neighboring nodes and with measurements in the near past/future.
- A certain level of distortion is acceptable, i.e., users are often interested in only an approximation of the data field.
8. Model-driven data collection
- Given those assumptions, how do you design a data acquisition policy?
- The solution presented here is called predictive modeling.
- In a nutshell: maintain a model of the data field and have it decide which node to query at each cycle.
9. Model-driven data collection
- Procedure (a code sketch follows below):
  1. Build a probabilistic model of the data field based on the currently known readings.
  2. At each cycle T:
     - At each node, compute the predicted measurement and its confidence level.
     - Sample a node if the confidence dips below some threshold.
     - Recalibrate the model with the new data.
     - Repeat until all nodes meet the threshold.
  3. Repeat step 2 at every cycle.
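To make the procedure concrete, here is a minimal sketch of one cycle, assuming scikit-learn's GaussianProcessRegressor as the model and a hypothetical query_fn that returns a node's true reading at some energy cost. It illustrates the loop above, not the authors' implementation; the node representation, threshold value, and stopping rule are placeholder assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def acquisition_cycle(gp, sampled_X, sampled_y, candidate_X, query_fn,
                      std_threshold=0.1):
    """One cycle: query nodes until every predictive std dev meets the threshold."""
    # e.g. gp = GaussianProcessRegressor()  # covariance choice is discussed on later slides
    gp.fit(sampled_X, sampled_y)                      # (re)calibrate on known readings
    while True:
        _, std = gp.predict(candidate_X, return_std=True)
        worst = int(np.argmax(std))                   # least-confident node
        if std[worst] <= std_threshold:
            break                                     # all nodes predicted confidently enough
        new_y = query_fn(candidate_X[worst])          # actually sample the sensor (costs energy)
        sampled_X = np.vstack([sampled_X, candidate_X[worst:worst + 1]])
        sampled_y = np.append(sampled_y, new_y)
        gp.fit(sampled_X, sampled_y)                  # recalibrate with the new data
    return gp, sampled_X, sampled_y
```

Repeating this function at every cycle corresponds to step 3 of the procedure.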
10. Model-driven data collection
- Which model?
- Some criteria:
  - As general as possible.
  - Outputs probabilistic estimates.
  - Incrementally trainable.
  - Provides an efficient way to assess the confidence of a candidate data point without actually measuring the reading at that point.
- Gaussian Processes (GPs) are a popular choice.
  - http://www.gaussianprocess.org/
11. Gaussian Process (GP) for sensor network modeling
- A Gaussian Process is a nonparametric version of the Gaussian distribution, extending the multivariate Gaussian to infinite dimensionality.
- A sensor network maps naturally onto a Gaussian Process:
  - each node at each timeslot is a Gaussian r.v.,
  - the true measurement is its mean,
  - the distortion is its variance.
12. Gaussian Process for sensor network modeling
- If the data field is a Gaussian Process,
- then the collection of random variables $\mathbf{y} = (y_1, y_2, \ldots, y_i)$ (i.e., the readings from the sensors) follows a joint Gaussian distribution given the set of sensor locations.
- Index each node by a tuple $x_i = \langle \text{node}, \text{time} \rangle$; then for any set of such nodes $\{x_1, x_2, \ldots\}$, $\mathbf{y}$ is a multivariate Gaussian r.v.:
  $\mathbf{y} \sim \mathcal{N}(\boldsymbol{\mu}, C)$,
  where $C$ is the covariance matrix set by the covariance function $C(x, x')$ and $\boldsymbol{\mu}$ is given by the mean function.
13. Gaussian Process for sensor network modeling
- Given N training points, the model infers the random variable $y$ at a new node location (derivation in the tutorial from the website; the predictive equations are written out below).
- $C_N$ is the covariance matrix of the N training data points, computed via the covariance function $C$, and $\mathbf{k}$ is the vector of covariances between the new node and all nodes in the training set.
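The equations the slide refers to are the standard GP regression predictive equations (as derived, e.g., in MacKay's tutorial). In the notation above, with $\mathbf{y}$ collecting the N training readings and $\kappa = C(x_{N+1}, x_{N+1})$ the prior variance at the new node:

$$\hat{y}_{N+1} = \mathbf{k}^{\top} C_N^{-1} \mathbf{y}, \qquad \sigma^2_{\hat{y}_{N+1}} = \kappa - \mathbf{k}^{\top} C_N^{-1} \mathbf{k}.$$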
14. Confident measurement
- Given the GP,
- and one more leap of mind (maximizing confidence is equivalent to minimizing variance),
- the task seems really simple.
- The simplest greedy heuristic: compute the predictive variance at all nodes and choose the one with the largest variance to query next (often used in practice; written out below).
- It may not be globally optimal.
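Written out, the greedy rule above is simply

$$x_{\text{next}} = \arg\max_{x \in \mathcal{X}_{\text{pool}}} \; \sigma^2(x \mid \mathcal{D}),$$

where $\mathcal{X}_{\text{pool}}$ denotes the unsampled nodes and $\mathcal{D}$ the readings collected so far: query the node whose predictive variance under the current GP posterior is largest.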
15. Problems with the model (1)
- Greedy search may not be optimal.
- The authors propose a heuristic called the distribution of interest: basically, add weights to the sensors that have historically recorded unpredictable changes.
- Define a relative error and a distribution of interest.
- Weight the confidence score by the interest (an illustrative form is sketched below).
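The slide's exact definitions are not reproduced in this text; purely as an assumed, illustrative form consistent with the description above, the relative error at node $i$ could be $e_i = |\hat{y}_i - y_i| / |y_i|$, the distribution of interest proportional to the historical error, $p_i \propto \sum_t e_i^{(t)}$, and the selection score the predictive variance weighted by interest, $\mathrm{score}(x_i) = p_i \, \sigma^2(x_i)$. Consult the paper for the authors' actual definitions.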
16. Problems with the model (2)
- Gaussian Processes are not incrementally trainable:
  - each time the data set changes, the covariance matrix must be recomputed and inverted.
- Proposed solution: sparse approximation for GP regression.
  - Idea: find a set of basis vectors that spans the covariance matrix.
- The paper does not provide details on this technique.
17. Problems with the model (2)
- Express the matrix in terms of the weights and the basis vectors (a compact representation).
- Once the bases are found, the computation is almost the same (an illustrative sketch follows).
- $C_M$ is the covariance matrix of the basis vectors, $W$ is a vector of weights estimated during training, and $\mathbf{k}$ is the vector of covariances between the new node and the basis vectors.
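The paper leaves the sparse approximation unspecified (it points to Csató's work). Purely to illustrate the general idea of predicting through a small set of basis vectors, here is a minimal subset-of-regressors style sketch; this is one common sparse GP variant, not necessarily the authors' method, and the RBF kernel, noise level, and basis choice are placeholder assumptions.

```python
import numpy as np

def rbf(A, B, length_scale=1.0):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def sparse_gp_predict(X_train, y_train, X_basis, X_new, noise=1e-2):
    """Predict using M basis vectors instead of all N training points (M << N)."""
    K_mm = rbf(X_basis, X_basis)          # C_M: covariance among basis vectors
    K_mn = rbf(X_basis, X_train)          # basis vectors vs. training nodes
    k_new = rbf(X_basis, X_new)           # k: basis vectors vs. new node(s)
    # The weight vector W solves an M x M system instead of an N x N one.
    A = noise * K_mm + K_mn @ K_mn.T
    W = np.linalg.solve(A, K_mn @ y_train)
    mean = k_new.T @ W
    var = noise * np.einsum('mi,mi->i', k_new, np.linalg.solve(A, k_new))
    return mean, var
```

In this compact representation the prediction at a new node depends only on $\mathbf{k}$, $C_M$, and the learned weights $W$, matching the description above.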
18. Experiments
- Experiments on simulated data.
- Two sets of experiments: centralized data aggregation and clustered data aggregation (they differ in energy efficiency).
- Run over 100 time units; the model's history length is 5.
- Two levels of confidence threshold: 0.1 and 0.2.
- A fixed covariance function was used (an example kernel form is shown below).
- Note: w1 and w2 were NOT learned during the experiment but were tuned during the development phase. Others suggest learning the covariance parameters as well; see David MacKay's tutorial from http://www.gaussianprocess.org/.
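The paper's fixed covariance function is not reproduced in this text. Purely as an example of what a covariance function with two tunable parameters looks like (an assumption, not the paper's form), a squared-exponential kernel over the index $x = \langle \text{node}, \text{time} \rangle$ might be

$$C(x, x') = w_1 \exp\!\left( -\frac{\lVert x - x' \rVert^2}{2\, w_2^{2}} \right),$$

with $w_1$ acting as a signal variance and $w_2$ as a correlation length scale; these are the kind of covariance parameters that MacKay's tutorial discusses learning from data.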
19. Experiments
- Root mean square error, averaged over all cycles.
- The baseline method queries all nodes at each cycle (hence zero error).
- The proposed model demonstrates significant energy savings.
20. Summary of the paper
- The Gaussian Process is a nice framework to use for node selection in sensor networks.
- Is there a more principled way to search for the optimal subset? (The proposed solution looks a bit ad hoc.)
- How to choose the covariance function is not clear.
- How does the accuracy compare to competing frameworks?
21. Summary of the paper
- Many details of the Gaussian Process are missing from this paper; see the following for a more thorough discussion.
- GP regression:
  - Gaussian Processes tutorial (MacKay, 1998)
- Sparse approximation:
  - Gaussian Processes: Iterative Sparse Approximations (Csató, 2002)
- Sampling strategies using GPs:
  - Model-Driven Data Acquisition in Sensor Networks (Deshpande, Guestrin, et al., 2004)
  - Using Probabilistic Models for Data Management in Acquisitional Environments (Deshpande, Guestrin, Madden, 2005)