Training Conditional Random Fields using Virtual Evidence Boosting
Lin Liao, Tanzeem Choudhury, Dieter Fox, and Henry Kautz
University of Washington
Intel Research
Introduction
Goal: To develop an efficient feature selection and parameter estimation technique for Conditional Random Fields (CRFs).
Application domain: To learn human activity models from continuous, multi-modal sensory inputs.
Approaches to Training Conditional Random Fields (CRFs)
- Maximum Likelihood
  - Run numerical optimization to find the optimal weights, which requires inference at each iteration
  - Inefficient for complex structures
  - Inadequate for continuous observations and feature selection
- Maximum Pseudo-Likelihood
  - Convert a CRF into separate patches; each consists of a hidden node and the true values of its neighbors
  - Run ML learning on the separate patches
  - Efficient but may over-estimate inter-dependency
  - Inadequate for continuous observations and feature selection
- Our Approach: Virtual Evidence Boosting
  - Convert a CRF into separate patches; each consists of a hidden node and the virtual evidence of its neighbors
  - Run boosting (to select features) and belief propagation (to update virtual evidence) alternately
  - Efficient and unified approach to feature selection and parameter estimation
  - Suitable for both discrete and continuous observations
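The belief-propagation half of this alternation can be sketched for a linear chain. The following is a minimal illustration, not the authors' implementation: the function name `chain_bp`, the potential arrays `local` and `pair`, and the choice to treat a neighbor's belief (excluding the message coming back from the current node) as that node's virtual evidence are all assumptions made for this example.

```python
import numpy as np

def normalize(v):
    """Scale a nonnegative vector so it sums to one."""
    return v / v.sum()

def chain_bp(local, pair):
    """Sum-product belief propagation on a linear chain (hypothetical helper).

    local: (T, K) nonnegative local potentials, one row per hidden node.
    pair:  (K, K) pairwise potential, pair[a, b] couples y_t = a and y_{t+1} = b.
    Returns (ve_left, ve_right, beliefs). ve_left[t] is a distribution over the
    state of node t-1 as seen by node t; ve_right[t] is the same for node t+1.
    Rows with no neighbor on that side are left uniform.
    """
    T, K = local.shape
    from_left = np.full((T, K), 1.0 / K)    # message arriving at t from t-1
    from_right = np.full((T, K), 1.0 / K)   # message arriving at t from t+1
    ve_left = np.full((T, K), 1.0 / K)
    ve_right = np.full((T, K), 1.0 / K)

    # Forward pass: summarize everything to the left of each node.
    for t in range(1, T):
        ve_left[t] = normalize(local[t - 1] * from_left[t - 1])
        from_left[t] = normalize(ve_left[t] @ pair)

    # Backward pass: summarize everything to the right of each node.
    for t in range(T - 2, -1, -1):
        ve_right[t] = normalize(local[t + 1] * from_right[t + 1])
        from_right[t] = normalize(pair @ ve_right[t])

    # Node marginals combine local evidence with both incoming messages.
    beliefs = local * from_left * from_right
    beliefs /= beliefs.sum(axis=1, keepdims=True)
    return ve_left, ve_right, beliefs
```

In a full training loop the potentials would be recomputed from the current boosted model before each call, so the virtual evidence handed to the boosting step changes from iteration to iteration.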
Extension of LogitBoost with Virtual Evidence
- Traditional boosting algorithms assume feature values are deterministic
- We extend the LogitBoost algorithm to handle virtual evidence, i.e., a feature could also be a likelihood value or probability distribution

INPUTS: training samples
OUTPUT: F (linear combination of features)
FOR each iteration
    FOR each sample
        Compute likelihood
        Compute sample weight
        Compute working response
    END
    Obtain best weak learner by solving a weighted least-squares fit to the working responses
    Add the weak learner to F
END

Virtual Evidence Boosting for CRFs

INPUTS: structure of CRF and training samples
OUTPUT: F (linear combination of features)
FOR each iteration
    Run BP using current F to get virtual evidence ve(xi, n(yi))
    FOR each sample
        Compute likelihood
        Compute sample weight
        Compute working response
    END
    Obtain best weak learner by solving a weighted least-squares fit to the working responses
    Add the weak learner to F
END
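The per-iteration steps listed above (likelihood, sample weight, working response, weak-learner fit) can be sketched for the binary case. This is a hedged illustration rather than the poster's exact formulas, which did not survive in the text: it uses the standard two-class LogitBoost quantities, and it assumes that virtual evidence enters by minimizing the expected weighted squared error over a binary feature whose value is known only as a probability. The names `logitboost_stats`, `fit_soft_stump`, `boost`, and the matrix `VE` are hypothetical.

```python
import numpy as np

def logitboost_stats(F, y):
    """Likelihood, sample weight and working response for labels y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-2.0 * F))        # p(y = 1) under the current F
    w = np.clip(p * (1.0 - p), 1e-10, None)   # sample weight, clipped for stability
    z = (y - p) / w                           # working response
    return p, w, z

def fit_soft_stump(ve1, w, z):
    """Fit f(x) = a1 if x == 1 else a0 when x is only known as ve1[i] = P(x_i = 1).

    Assumption of this sketch: minimize the expected weighted squared error
    sum_i w_i * [ve1_i * (a1 - z_i)^2 + (1 - ve1_i) * (a0 - z_i)^2].
    """
    ve0 = 1.0 - ve1
    a1 = np.sum(w * z * ve1) / max(np.sum(w * ve1), 1e-12)
    a0 = np.sum(w * z * ve0) / max(np.sum(w * ve0), 1e-12)
    err = np.sum(w * (ve1 * (a1 - z) ** 2 + ve0 * (a0 - z) ** 2))
    return a0, a1, err

def boost(VE, y, n_iter=10):
    """VE: (n, d) matrix, VE[i, j] = virtual evidence that feature j is 1 for sample i."""
    n, d = VE.shape
    F = np.zeros(n)
    chosen = []
    for _ in range(n_iter):
        _, w, z = logitboost_stats(F, y)
        fits = [fit_soft_stump(VE[:, j], w, z) for j in range(d)]
        j = int(np.argmin([fit[2] for fit in fits]))   # feature selection step
        a0, a1, _ = fits[j]
        # Simplification: add the expected weak-learner output under the evidence.
        F += 0.5 * (VE[:, j] * a1 + (1.0 - VE[:, j]) * a0)
        chosen.append(j)
    return F, chosen
```

With hard evidence (every entry of `ve1` equal to 0 or 1) the fit reduces to the ordinary weighted mean of the working responses in each bin, which is the deterministic-feature case that traditional boosting assumes.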
- Boosted Random Fields versus VEB
  - Closest related work to VEB is Boosted Random Fields (Torralba 2004)
  - BRFs combine boosting and belief propagation but assume dense graph structure and weak pair-wise influence
  - We compare the two as the pair-wise influence changes
  - VEB performs significantly better when pair-wise relations are strong
Application: Human Activity Recognition
Model human activities and select discriminatory features from multimodal sensor data. Sensors include accelerometer, audio, light, temperature, etc.
- Indoor Activities
  - Activities: computer usage, meal, TV, meeting, and sleeping
  - Linear chain CRF with 315 continuous input features
  - 1100 minutes of data over 12 days
- Physical Activities and Spatial Contexts
  - Context: indoors, outdoors, and vehicles
  - Activities: stationary, walking, running, driving, and going up/down stairs
  - Approximately 650 continuous input features
  - 400 minutes of data over 12 episodes
Feature Selection
VEB can be used to extract sparse structure from complex models. In this experiment it is able to find the exact order in a high-order HMM, and thus outperforms other learning alternatives.