Title: Information Theoretic Approaches to Sensor Management
1. Information Theoretic Approaches to Sensor Management
- Presented by Daniel Sadoc Menasche and Ramin Khalili
2. References
- Foundations and Applications of Sensor Management, Hero and Kreucher (Chapter 3)
- Elements of Information Theory, Cover and Thomas (Chapter 11)
- Wireless Sensor Networks: An Information Processing Approach, Zhao and Guibas
- Utility-Based Decision Making in Wireless Sensor Networks, Byers and Nasser
3. Outline
- Overview: motivation and goals
- Background: entropy, conditional entropy, and information divergence
- Information-optimal policy search
- Near-universal proxy
- More examples:
  - Multitarget tracking
  - Terrain classification
4. Motivation
- Problem: monitoring toxicity in an area in which hazardous materials are used
- Deployment is a one-time operation
- The role of the nodes is dynamic
- Nodes can:
  - Sense data (S)
  - Relay data (R)
  - Sense and relay (S/R)
5-8. [Figure sequence: a field of idle, relay, and sense nodes reporting to a base station; toxic material appears, and an uncertainty region forms around it]
9. Bad choice: the new uncertainty region is equal to the old one
[Figure: the queried sensor's measurement does not shrink the uncertainty region]
10. Good choice
[Figure: the queried sensor's measurement shrinks the uncertainty region]
11. [Figure: the reduced uncertainty region around the toxic material]
12. A Signal Processing Perspective on Information Theory
- Data processing inequality: processing does not increase (but may decrease) the information carried by a signal
- However, processing (e.g., feature extraction from an image) may be required due to computational constraints
- "Information can't hurt": observations are never harmful
- However, the operations of sensing and transmitting data have costs, and sensor networks have energy constraints. What to sense?
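The data processing inequality can be checked numerically. The sketch below (the channel matrices are illustrative values, not from the slides) builds a Markov chain X → Y → Z and verifies that the processed signal Z carries no more information about X than Y does:

```python
# Numerical check of the data processing inequality on a Markov chain
# X -> Y -> Z with binary alphabets (illustrative channel matrices).
import numpy as np

def mutual_information(joint):
    """I(A;B) in bits for a 2-D joint probability table."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] / (pa @ pb)[mask])).sum())

px = np.array([0.3, 0.7])                    # prior on X
p_y_given_x = np.array([[0.9, 0.1],          # noisy channel X -> Y
                        [0.2, 0.8]])
p_z_given_y = np.array([[0.7, 0.3],          # further processing Y -> Z
                        [0.4, 0.6]])

joint_xy = px[:, None] * p_y_given_x         # p(x, y)
joint_xz = joint_xy @ p_z_given_y            # p(x, z): Z indep. of X given Y

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)
assert i_xz <= i_xy + 1e-12   # processing cannot increase information
print(i_xy, i_xz)
```

Any choice of stochastic matrices above yields the same conclusion; the inequality is a property of the Markov structure, not of these particular numbers.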
13. Utility
[Figure: two plots against the number of nodes — total utility and marginal utility — illustrating the law of diminishing marginal utility]
14. Canonical Problem
- What is the utility function?
- This is the key problem addressed by this chapter!
15. Goal
- Optimizing the information collection capability of a sensor network
- Two independent tasks (layers): information collection and risk/reward optimization
[Diagram: risk/reward optimization (e.g., for estimation or detection) is mission specific; information collection is mission independent]
16. Goal
- Optimizing the information collection capability of a sensor network
- Two independent tasks (layers): information collection and risk/reward optimization
[Diagram: risk/reward optimization (e.g., for estimation or detection) is layer 2; information collection is layer 1]
- Cross-layer optimization is in many cases unnecessary if we use information theory to guide layer 1!
17. Remark
- The authors propose many information collection strategies.
- But they do not discuss energy-related issues, e.g.:
  - The impact of routing: they assume that all nodes can reach the base station in one hop
  - The processing costs of implementing the different strategies: they assume that the nodes have enough memory and computational resources, and that the energy consumed by the CPU is low
18. Sensor Selection Problem
[Figure: two candidate sensors; each one's measurement intersects the current uncertainty region differently, yielding different new uncertainty regions]
- Entropy is related to the volume of the uncertainty region
- Fisher information is related to the area of the uncertainty region
19. Volume of Uncertainty Region and Entropy
- Differential entropy
- Typical set: a set with high probability
20. Volume of Uncertainty Region and Entropy
- Typical set: the smallest set that contains almost all the probability
- Entropy: the logarithm of the side length of the typical set
- Low entropy: the random variable is confined to a small volume
- High entropy: the random variable is dispersed
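The typical-set picture can be illustrated empirically: by the asymptotic equipartition property, the per-symbol log-probability of a long iid string concentrates around the entropy, so almost all probability mass sits in roughly 2^(nH) "typical" sequences. A minimal sketch, with an illustrative Bernoulli(0.2) source:

```python
# AEP demo: -(1/n) log2 p(sequence) concentrates around H(p)
# for long iid Bernoulli(p) strings.
import math
import random

p = 0.2
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # entropy of the source

random.seed(0)
n, trials = 10_000, 200
rates = []
for _ in range(trials):
    ones = sum(random.random() < p for _ in range(n))          # count of 1s
    log_prob = ones * math.log2(p) + (n - ones) * math.log2(1 - p)
    rates.append(-log_prob / n)            # empirical per-symbol rate

avg_rate = sum(rates) / trials
print(H, avg_rate)   # the two numbers should be close
```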
21. Information Utility Measures
- To quantify the information brought by a sensor, we need to define a measure of information utility
- Information content, or utility: the inverse of the size of the uncertainty region of the estimate of x
- How to measure the size of the uncertainty region? Possible answers:
  - Entropy
  - KL divergence
  - Chernoff information
  - Fisher information
22. Background: Entropy and Conditional Entropy
- H(S): entropy (prior uncertainty)
  - discrete: −Σ_s p_S(s) log p_S(s)
  - continuous: −∫ f_S(s) log f_S(s) ds
- H(S|Y): conditional entropy (posterior uncertainty)
  - discrete: −Σ_{s,y} p_{S,Y}(s,y) log p_{S|Y}(s|y)
  - continuous (could be negative): −∫∫ f_{S,Y}(s,y) log f_{S|Y}(s|y) ds dy
- ΔH = H(S) − H(S|Y): reduction in uncertainty (always nonnegative)
- I(S;Y) = H(S) − H(S|Y)
- KL(p||q): pseudo-distance between two candidate distributions p and q of S
  - Σ_s p(s) log(p(s)/q(s))
23. [Figure: Venn diagram of H(X) and H(Y); the overlap is I(X;Y), the remainders are H(X|Y) and H(Y|X)]
- X: real state of the world; Y: measurement
- H(X): initial uncertainty; H(X|Y): uncertainty after the measurement; I(X;Y): information brought by the measurement
- Given the state of the world, what to expect from the measurements is assumed to be known
24. "Information doesn't hurt": I(X;Y) is always positive or zero!
- H(X|Y) and H(Y|X): always nonnegative in the discrete case, but may be negative in the continuous case
- X: real state of the world; Y: measurement
[Figure: same Venn diagram as the previous slide]
25. Generalizing Entropy, Conditional Entropy and Divergence
- α-entropy, α-conditional entropy, and α-information divergence
- A small α stresses the tails of distributions (i.e., minor differences between distributions)
26. α-entropy
27. Volume of Support of Uncertainty Region and α-entropy
28. α-conditional entropy and α-information divergence
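For reference, the standard Rényi forms of these quantities (the chapter's exact normalization may differ) are:

```latex
H_\alpha(X) = \frac{1}{1-\alpha}\,\log \sum_x p(x)^\alpha,
\qquad
D_\alpha(p\,\|\,q) = \frac{1}{\alpha-1}\,\log \sum_x p(x)^\alpha\, q(x)^{1-\alpha},
\qquad \alpha > 0,\ \alpha \neq 1 .
```

As α → 1 these reduce to the Shannon entropy and the KL divergence, respectively; small α emphasizes the tails of the distributions, as noted on slide 25.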
29. Information Driven Sensor Querying
- Let us define the utility U of d measurements as being any function such that... (3 possibilities will be considered in the following slides)
- If we have d measurements and want to choose a sensor to gather measurement d+1, which one to pick? Choose the one that maximizes the following function.
- This seems to be a circular argument! To decide the next best sensor we need to know the measurement that it will generate?!
30. Information Driven Sensor Querying
- This seems to be a circular argument! To decide the next best sensor we need to know the measurement that it will generate?!
- Answer: for each sensor, consider all the possible values that it may generate. Each of them leads to a utility. To summarize the set of utilities into a single utility, consider either:
  - the average (used in Section 6.2),
  - the best, or
  - the worst.
31. Entropy, MI, Fisher Info, ...
Utility gain of each node under each possible measurement result:

  Node | Result 1 | Result 0
  -----+----------+---------
   A   |    10    |    1
   B   |     5    |    2
   C   |     1    |   10
   D   |     3    |    5
   E   |     3    |    5

[Figure: nodes A-E deployed around the toxic material, reporting to the base station]
32. Utility gain and summary metrics per node:

  Node | Result 1 | Result 0 | Max | Min | Average
  -----+----------+----------+-----+-----+--------
   A   |    10    |     1    | 10  |  1  |  5.5
   B   |     5    |     2    |  5  |  2  |  3.5
   C   |     1    |    11    | 11  |  1  |  6
   D   |     3    |     5    |  5  |  3  |  4
   E   |     4    |     5    |  5  |  4  |  4.5
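The summaries in this table can be reproduced directly. The sketch below assumes the two measurement outcomes are the only possibilities and, for the average, that they are equally likely:

```python
# Reproducing the slide-32 summaries: max / min / average collapse the
# per-outcome utility gains into a single score (best case, worst case,
# and expected case under equally likely outcomes).
gains = {               # node: (gain if result is 1, gain if result is 0)
    "A": (10, 1),
    "B": (5, 2),
    "C": (1, 11),
    "D": (3, 5),
    "E": (4, 5),
}

summary = {n: (max(g), min(g), sum(g) / len(g)) for n, g in gains.items()}
best_max = max(summary, key=lambda n: summary[n][0])   # optimistic choice
best_min = max(summary, key=lambda n: summary[n][1])   # worst-case choice
best_avg = max(summary, key=lambda n: summary[n][2])   # expected-utility choice
print(best_max, best_min, best_avg)   # -> C E C
```

Note that the three summarization rules disagree: the optimistic and average criteria select node C, while the worst-case criterion selects node E.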
33. Initialization
[Flowchart: after initialization, the leader checks whether the belief is good; if yes, end; if no, it selects a sensor, waits for the reply, updates the belief, and repeats. A queried sensor waits for the request, senses, and replies.]
34. Information Utility Gain Measure I: Mutual Information
- Captures the usefulness of a given measurement
- It can be interpreted as the KL divergence between the belief after and before applying the new measurement
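That interpretation can be checked numerically: I(X;Y) equals the expected KL divergence between the posterior belief p(x|y) and the prior p(x). The joint table below is an arbitrary illustrative choice:

```python
# Verify I(X;Y) = E_Y[ KL( p(x|y) || p(x) ) ]: the mutual information is
# the expected divergence between belief after and before a measurement.
import numpy as np

p_xy = np.array([[0.20, 0.05],
                 [0.15, 0.60]])          # p(x, y)
p_x = p_xy.sum(axis=1)                   # prior belief p(x)
p_y = p_xy.sum(axis=0)                   # measurement marginal p(y)

# direct mutual information
mi = (p_xy * np.log2(p_xy / np.outer(p_x, p_y))).sum()

# expected KL divergence between posterior and prior
exp_kl = 0.0
for y in range(2):
    post = p_xy[:, y] / p_y[y]           # posterior belief p(x|y)
    exp_kl += p_y[y] * (post * np.log2(post / p_x)).sum()

assert abs(mi - exp_kl) < 1e-12
print(mi)
```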
35. Information Utility Gain Measure II: Chernoff Information (Information Divergence)
- Captures the usefulness of measurements for detection purposes
- Example:
  - Hypothesis H0: target detected (S = S0)
  - Hypothesis H1: target not detected (S = S1)
  - The probability of error is Pe = p0 P(decide H1 | S0) + p1 P(decide H0 | S1), where p0 and p1 are the priors
- As new measurements come in, Pe goes to 0 exponentially; the exponential rate of decay is the Chernoff information.
36-38. Chernoff Information (derivation)
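A standard statement of the quantity derived on these slides (following Cover and Thomas, Chapter 11) is:

```latex
C(p_0, p_1) \;=\; -\min_{0 \le \lambda \le 1} \,\log \sum_x p_0(x)^{\lambda}\, p_1(x)^{1-\lambda},
```

and the best achievable error exponent satisfies Pe ≈ 2^(−n C(p0, p1)): asymptotically, the exponent is independent of the priors p0 and p1.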
39. Information Utility Gain Measure III: Fisher Information
- Captures the usefulness of a given measurement for estimation purposes
- Its inverse is a lower bound on the mean squared error of any unbiased estimator (the Cramér-Rao bound).
40. Fisher Information
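The standard definition and the Cramér-Rao bound referenced on the previous slide, for a scalar parameter θ:

```latex
F(\theta) \;=\; \mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta}\,\log f(X;\theta)\right)^{\!2}\right],
\qquad
\operatorname{MSE}(\hat\theta) \;\ge\; \frac{1}{F(\theta)}
```

for any unbiased estimator θ̂ of θ.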
41. Relating Information Utility Measures II and III

           | Chernoff Info              | Fisher Info
  ---------+----------------------------+-------------------------------------------
  Used for | Detection                  | Estimation
  Is the   | Rate at which Pe goes to 0 | Inverse of the Cramér-Rao bound on the MSE
42. Relating Information Utility Measures II and III
- If the signal is weak, i.e.:
  - we have to detect the value of a signal that switches between 0 and delta, or
  - we have to estimate the value of a signal that ranges between 0 and delta,
- then maximizing Fisher information is equivalent to maximizing Chernoff information!
43. Relating Information Utility Measures II and III
- If delta is small, maximizing D maximizes F(s)!
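The standard local expansion behind this claim is that, for a smooth family f(·;θ), the divergence between nearby parameter values is governed by the Fisher information:

```latex
D\big(f(\cdot;\theta)\,\big\|\,f(\cdot;\theta+\delta)\big)
\;=\; \frac{\delta^2}{2}\,F(\theta) \;+\; O(\delta^3),
```

so for small δ, ranking sensors by divergence and ranking them by Fisher information give the same order.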
44. To be continued...