Title: Graphical Models in Data Association Problems
1. Graphical Models in Data Association Problems
Alexander Ihler, UC Irvine
ihler_at_ics.uci.edu
Collaborators: Sergey Kirshner, Andrew Robertson, Padhraic Smyth
2. Outline
- Graphical models
  - Convenient description of structure among random variables
  - Use this structure to
    - Organize inference computations
      - Finding optimal (ML, etc.) estimates
      - Calculating data likelihood
      - Simulation / drawing samples
    - Suggest sub-optimal (approximate) inference computations, e.g. when the optimal computations are too expensive
- Some examples from data association
  - Markov chains, Kalman filtering
  - Rainfall models
    - Mixtures of trees
    - Loopy graphs
  - Image analysis (de-noising, smoothing, etc.)
3. Graphical Models
An undirected graph G = (V, E) is defined by
- a set of nodes V
- a set of edges E connecting pairs of nodes
Nodes are associated with random variables x_s, s ∈ V.
4. Graphical Models: Factorization
- Sufficient condition: the distribution factors into a product of potential functions defined on cliques of G
- The condition is also necessary if the distribution is strictly positive (the Hammersley-Clifford theorem)
- Examples (one is sketched below)
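A minimal statement of the factorization, using standard (assumed) notation for the cliques C of G and potentials ψ_C; a normalizing constant Z makes it a distribution:

```latex
p(x) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C),
\qquad
Z \;=\; \sum_{x} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C)
```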
5. Graphical Models: Inference
- Many possible inference goals
- Given a few observed RVs, compute
  - Marginal distributions
  - Joint maximum a-posteriori (MAP) values
  - Data likelihood of the observed variables
  - Samples from the posterior
- Use the graph structure to do these computations efficiently
- Example: compute the posterior marginal p(x2 | x5 = x̄5), sketched below
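One way to sketch this computation; the graph here is assumed, for concreteness, to be the chain x1 – x2 – x3 – x4 – x5 with pairwise potentials (the slide's actual graph is not recoverable from this text). The point is that the graph structure lets the sums distribute over the factors:

```latex
p(x_2 \mid x_5 = \bar{x}_5)
\;\propto\;
\sum_{x_1, x_3, x_4} p(x_1, \dots, x_4, \bar{x}_5)
\;=\;
\Big[\sum_{x_1} \psi_{12}(x_1, x_2)\Big]
\sum_{x_3} \psi_{23}(x_2, x_3)
\sum_{x_4} \psi_{34}(x_3, x_4)\, \psi_{45}(x_4, \bar{x}_5)
```

Each sum touches only a small set of variables, so the cost is linear in the number of nodes rather than exponential.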
6. Finding Marginals via Belief Propagation
(a.k.a. sum-product; other goals have similar algorithms)
Combine the observations from all nodes in the graph through a series of local message-passing operations:
- Γ(s): the neighborhood of node s (its adjacent nodes)
- m_ts(x_s): the message sent from node t to node s (a sufficient statistic of t's knowledge about s)
7. BP Message Updates
I. Message product: multiply the incoming messages at node t (from all of its neighbors except s) with the local observation potential to form a distribution over x_t.
II. Message propagation: transform this distribution from node t to node s using the pairwise interaction potential between them.
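Putting the two steps together gives the standard sum-product update; the symbols ψ_t (local observation potential), ψ_st (pairwise potential), and Γ(t) (neighbors of t) follow the assumed notation above:

```latex
m_{ts}(x_s)
\;\propto\;
\sum_{x_t} \psi_{st}(x_s, x_t)\, \psi_t(x_t)
\prod_{u \in \Gamma(t) \setminus s} m_{ut}(x_t),
\qquad
p(x_s \mid \text{observations})
\;\propto\;
\psi_s(x_s) \prod_{t \in \Gamma(s)} m_{ts}(x_s)
```

On a tree, running these updates from the leaves inward and back out yields every marginal exactly.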
8. Example: Sequential Estimation
- Well-known example
  - Markov chain
  - Jointly Gaussian uncertainty, which gives the integrals a simple, closed form
- Optimal inference (in many senses) is given by the Kalman filter (sketched after this list)
  - Converts one large problem (over all T time steps) into a collection of smaller problems
  - Exact, non-Gaussian, particle, and ensemble filtering extensions
- The same general results hold for any tree-structured graph
  - Partial elimination ordering of the nodes
  - Complexity limited by the dimension of each variable
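A minimal Kalman-filter sketch for the linear-Gaussian Markov chain above; the model matrices (A, C, Q, R), variable names, and the toy 1-D example are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, mu0, P0):
    """Filtered means/covariances p(x_t | y_1..t) for the assumed model
    x_t = A x_{t-1} + w_t, w_t ~ N(0, Q);  y_t = C x_t + v_t, v_t ~ N(0, R)."""
    mu, P = mu0, P0
    means, covs = [], []
    for yt in y:
        # Predict: push the previous posterior through the dynamics
        mu_pred = A @ mu
        P_pred = A @ P @ A.T + Q
        # Update: condition on the new observation y_t
        S = C @ P_pred @ C.T + R              # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)   # Kalman gain
        mu = mu_pred + K @ (yt - C @ mu_pred)
        P = P_pred - K @ C @ P_pred
        means.append(mu)
        covs.append(P)
    return means, covs

# Toy 1-D random walk observed with noise (purely illustrative)
A = np.array([[1.0]]); C = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[1.0]])
y = [np.array([0.9]), np.array([1.4]), np.array([2.1])]
means, covs = kalman_filter(y, A, C, Q, R, np.zeros(1), np.eye(1))
```

Each time step costs a fixed amount of matrix algebra, so the T-step problem is handled as T small problems rather than one joint problem over all time steps.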
9. Exact Estimation in Non-Trees
- Often our variables aren't so well-behaved
- May be able to convert to a tree using variable augmentation
  - Often the case in Bayesian parameter estimation
  - Treat parameters as variables and include them in the graph (this increases the nonlinearities!)
- But there is a dimensionality problem
  - Computation increases with variable dimension d (maybe a lot!)
    - Jointly Gaussian: O(d^3)
    - Otherwise, often exponential in d
- Can trade off graph complexity against variable dimensionality
10. Example: Rainfall Data
- 41 stations in India
  - Rainfall occurrence and amounts for 30 years
  - Some stations/days missing
- Tasks
  - Impute missing entries
  - Simulate realistic rainfall
  - Short-term predictions
- Can't deal with the full joint distribution; it is too large to even manipulate
  - Conditional independence structure?
  - Unlikely to be tree-structured
11. Example: Rainfall Data
- The true relationships are not tree-like at all
  - High tree-width
- Need some approximations
  - Approximate model, exact inference
  - Correct model, approximate inference
- Even harder: may get multiple observation modalities (satellite data, etc.)
  - These have their own statistical structure and relationships to the stations
12. Example: Rainfall Data
- Consider a single time-slice
- Option 1: mixtures of trees (sketched below)
  - Add a hidden variable indicating which of several trees is active
  - (Generally) marginalize over this variable
- Option 2: use a loopy graph and ignore the loops during inference
- Utility depends on the task
  - Works well for filling in missing data
  - Perhaps less well for other tasks
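A minimal statement of the mixture-of-trees model in Option 1; the notation (mixture weights λ_k for the hidden indicator, edge sets E_k for the component trees) is assumed rather than taken from the slides:

```latex
p(x) \;=\; \sum_{k} \lambda_k\, T_k(x),
\qquad
T_k(x) \;=\; \prod_{s \in V} p_k(x_s)
\prod_{(s,t) \in E_k} \frac{p_k(x_s, x_t)}{p_k(x_s)\, p_k(x_t)}
```

Each component T_k is tree-structured, so inference within a component is exact and efficient; marginalizing the hidden indicator simply mixes the component answers.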
13. Multi-scale Models
- Another example of graph structure
- Efficient computation if tree-structured
- Again, we don't really believe any particular tree
  - Perhaps average over (use a mixture of) several (see e.g. Willsky 2002)
  - Also possible with loops, similar to multi-grid methods
14. Summary
- Explicit structure among variables
  - Prior knowledge / learned from data
- Structure organizes computation and suggests approximations
  - Can provide computational efficiency (often the naïve distribution is too large to represent / estimate)
- Offers some choice: where to put the complexity?
  - Simple graph structure with high-dimensional variables, or complex graph structure with more manageable variables
  - Approximate structure with exact computations, or improved structure with approximate computations