Title: What Did We See
1 What Did We See? WikiGIS
- Chris Pal
- University of Massachusetts
- A Talk for Memex Day
- MSR Redmond, July 19, 2006
2 Research Questions
- How do personal and community photo-journals and blogs interact? Spectrum from personal blogs, through community portals (blikis), to Wiki articles (most public). (User Interface, Social Computing Research)
- Can we mine information in blogs? Find blog entries that look like Wiki entries, extract information, encourage contributions? (Document and Text Processing Research)
- What is the role of computer vision for location and object recognition? Can we use these methods to provide the user with relevant information?
3 Search Blogs and Wiki Entries
4 Questions About Observations
5 Search and Social Computing
I discover that my friend Justin also found an interesting mushroom.
Have I been here as well?
6 Object and Location Recognition
1. Object Recognition: From Images and Text
2. Location Recognition: From Images and Text
7 Conditional Random Fields
- Information Extraction Example
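[Figure: linear-chain CRF. Named entities (SFSM states) y_{t-1}, y_t, y_{t+1}, y_{t+2}, y_{t+3}, ... are connected through binary features to the input sequence x_{t-1}, x_t, x_{t+1}, x_{t+2}, x_{t+3}.]
Named entities: OTHER PERSON OTHER ORG TITLE
Input sequence: said Ling a Microsoft VP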
- Widely applicable, many positive results, e.g. speech recognition
- Fact extraction (from blogs and Wikis)
- Address extraction?
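A minimal sketch of the information extraction example above, assuming hand-set weights and a small illustrative set of binary features. The feature names, the weight values, and the `viterbi` helper are hypothetical and not taken from the talk; a trained CRF would learn the weights from data. The sketch scores label sequences as a sum of weighted binary features and decodes the best sequence with the Viterbi algorithm.

```python
LABELS = ["OTHER", "PERSON", "ORG", "TITLE"]

def binary_features(tokens, t, prev_label, label):
    """Names of the binary features that fire at position t (illustrative set)."""
    word = tokens[t]
    feats = [f"word={word.lower()}|{label}"]
    if word[0].isupper():
        feats.append(f"capitalized|{label}")
    if word.isupper():
        feats.append(f"all_caps|{label}")
    if prev_label is not None:
        feats.append(f"{prev_label}->{label}")
    return feats

# Hypothetical hand-set weights, chosen only for this example.
WEIGHTS = {
    "capitalized|PERSON": 1.0,
    "capitalized|ORG": 1.0,
    "all_caps|TITLE": 3.0,
    "word=microsoft|ORG": 2.0,
    "word=said|OTHER": 2.0,
    "word=a|OTHER": 2.0,
    "OTHER->PERSON": 1.0,
    "ORG->TITLE": 1.0,
}

def local_score(tokens, t, prev_label, label):
    """Sum of the weights of the features that fire at position t."""
    return sum(WEIGHTS.get(f, 0.0)
               for f in binary_features(tokens, t, prev_label, label))

def viterbi(tokens):
    """Highest-scoring label sequence under the linear-chain model."""
    # best[t][y] = (score of the best path ending in label y at t, previous label)
    best = [{y: (local_score(tokens, 0, None, y), None) for y in LABELS}]
    for t in range(1, len(tokens)):
        column = {}
        for y in LABELS:
            column[y] = max(
                (best[t - 1][p][0] + local_score(tokens, t, p, y), p)
                for p in LABELS
            )
        best.append(column)
    # Follow the back-pointers from the best final label.
    label = max(LABELS, key=lambda y: best[-1][y][0])
    path = [label]
    for t in range(len(tokens) - 1, 0, -1):
        label = best[t][label][1]
        path.append(label)
    return path[::-1]

tokens = "said Ling a Microsoft VP".split()
decoded = viterbi(tokens)
print(list(zip(tokens, decoded)))
```

With these weights the decoded labels match the ones shown on the slide; different weights would of course give different labellings.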
8 Research Result - Training a CRF
- Define the vector of feature values at time t
- Define the global feature function as
- The gradient of the conditional log likelihood
Model expectation, i.e.
Empirical expectation
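The equations on this slide did not survive extraction. For reference, the standard linear-chain CRF quantities that the bullets describe can be written as follows. The notation (local features f, global features F, parameters θ, training-example index i) is assumed here and may differ from what the slide used.

```latex
% Vector of feature values at time t (assumed notation)
\mathbf{f}(y_{t-1}, y_t, \mathbf{x}, t)

% Global feature function: local features summed over the sequence
\mathbf{F}(\mathbf{x}, \mathbf{y}) = \sum_{t} \mathbf{f}(y_{t-1}, y_t, \mathbf{x}, t)

% Conditional model
p_\theta(\mathbf{y} \mid \mathbf{x}) =
  \frac{\exp\big(\theta^\top \mathbf{F}(\mathbf{x}, \mathbf{y})\big)}
       {\sum_{\mathbf{y}'} \exp\big(\theta^\top \mathbf{F}(\mathbf{x}, \mathbf{y}')\big)}

% Gradient of the conditional log likelihood:
% empirical expectation minus model expectation
\nabla_\theta \mathcal{L}(\theta) =
  \sum_{i} \mathbf{F}(\mathbf{x}^{(i)}, \mathbf{y}^{(i)})
  - \sum_{i} \mathbb{E}_{p_\theta(\mathbf{y} \mid \mathbf{x}^{(i)})}
      \big[\mathbf{F}(\mathbf{x}^{(i)}, \mathbf{y})\big]
```

The model expectation is the term that requires forward-backward inference, which is why faster inference translates directly into faster training.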
9 Results CRF Training
NetTalk text-to-speech: linear-chain CRF training using sparse inference
Method  Accuracy
Fixed   85.7
KL      91.6
Exact   91.6
75% less training time than exact training, with no loss in accuracy
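One way to read the "KL" row is as a beam whose width adapts at each time step of forward-backward: keep the most probable states, choosing the smallest set whose truncated and renormalized distribution q stays within a KL budget of the full distribution p. For such a q, KL(q || p) equals minus the log of the kept probability mass, so the rule reduces to a mass threshold. The sketch below illustrates only this pruning step; the function name `kl_beam` and the threshold `eps` are illustrative assumptions, not details from the talk.

```python
import math

def kl_beam(probs, eps):
    """Smallest set of state indices whose truncated, renormalized
    distribution q satisfies KL(q || p) = -log(kept mass) <= eps."""
    # Visit states from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if -math.log(mass) <= eps:
            break
    return kept, mass

# A peaked distribution needs few states; a flat one needs many.
peaked = [0.70, 0.20, 0.05, 0.03, 0.02]
flat = [0.25, 0.25, 0.25, 0.25]

kept_peaked, mass_peaked = kl_beam(peaked, eps=0.15)
kept_flat, mass_flat = kl_beam(flat, eps=0.15)
print(kept_peaked, kept_flat)
```

A fixed-width beam would keep the same number of states in both cases, which is either wasteful for the peaked distribution or too aggressive for the flat one.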
10 SenseCam Enhanced Blogs
Produce Lots of Data for Location Recognition
11 Multi-Conditional Learning
- Motivation: Simple GMM Example
[Figure panels: Joint, Conditional, Multi-Conditional]
12 Multi-Conditional Learning
- One motivation: Conditional Random Fields can be derived from a traditional joint model
- But there are many other conditional distributions that could be defined
- What do we gain if we model those as well?
- Other combinations possible
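As a sketch of what modelling more than one conditional could mean, the snippet below evaluates three objectives on a toy two-class, one-dimensional Gaussian model: the joint log likelihood log p(x, y), the conditional log likelihood log p(y | x), and a weighted sum of log p(y | x) and log p(x | y). The weighted-sum form, the weights `alpha` and `beta`, and the toy parameters and data are assumptions for illustration, not details taken from the slide.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def log_joint(x, y, params):
    """log p(x, y) = log p(y) + log p(x | y)."""
    prior, mean, var = params[y]
    return math.log(prior) + log_gauss(x, mean, var)

def log_cond_y_given_x(x, y, params):
    """log p(y | x), obtained by normalizing the joint over classes."""
    joints = [log_joint(x, c, params) for c in params]
    m = max(joints)
    log_px = m + math.log(sum(math.exp(j - m) for j in joints))
    return log_joint(x, y, params) - log_px

def log_cond_x_given_y(x, y, params):
    """log p(x | y), the class-conditional density."""
    _, mean, var = params[y]
    return log_gauss(x, mean, var)

def multi_conditional(data, params, alpha=1.0, beta=1.0):
    """Weighted sum of the two conditional log likelihoods (assumed form)."""
    return sum(
        alpha * log_cond_y_given_x(x, y, params)
        + beta * log_cond_x_given_y(x, y, params)
        for x, y in data
    )

# Toy model: class -> (prior, mean, variance). Values are made up.
params = {0: (0.5, -1.0, 1.0), 1: (0.5, 1.0, 1.0)}
data = [(-1.2, 0), (-0.7, 0), (0.9, 1), (1.4, 1)]

joint = sum(log_joint(x, y, params) for x, y in data)
conditional = sum(log_cond_y_given_x(x, y, params) for x, y in data)
mcl = multi_conditional(data, params, alpha=1.0, beta=1.0)
print(joint, conditional, mcl)
```

Setting `beta=0` recovers the purely conditional objective, while `alpha=0` leaves only the class-conditional term of the joint objective, so the two weights interpolate between the familiar training criteria.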
13 Image Segmentation/Pixel Classification
MSR Cambridge / Berkeley Data
14 Mixtures of Factor Analyzers
- Generative model for simultaneous dimensionality reduction and clustering
- We wish to obtain a discriminative version of this type of model
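For reference, a standard way to write the mixture-of-factor-analyzers generative model is given below. The symbols (mixing weights π_k, component means μ_k, loading matrices Λ_k, diagonal noise covariance Ψ, latent dimension d) follow common usage and are assumed rather than taken from the slide.

```latex
% Cluster assignment
k \sim \mathrm{Categorical}(\pi_1, \dots, \pi_K)

% Low-dimensional factors, with d < D
\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_d)

% Observation in R^D
\mathbf{x} \mid \mathbf{z}, k \sim
  \mathcal{N}\big(\boldsymbol{\mu}_k + \boldsymbol{\Lambda}_k \mathbf{z},\ \boldsymbol{\Psi}\big)

% Marginalizing z gives a Gaussian mixture with
% low-rank-plus-diagonal covariances
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}\big(\mathbf{x};\ \boldsymbol{\mu}_k,\
  \boldsymbol{\Lambda}_k \boldsymbol{\Lambda}_k^\top + \boldsymbol{\Psi}\big)
```

The discrete variable k provides the clustering and the factors z provide the dimensionality reduction, which is the combination the first bullet refers to.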
15 Performance vs. Model Complexity
Interesting?
Joint Optimization benefits more substantially
from additional data.
16 Performance with More Data
[Figure panels: Training Set Accuracy, Test Set Accuracy; label: hmm]
17 Search Blogs of Friends
18 Detect and Find Expert Knowledge
19 Simple Exponential Family Models for Documents
20 Results Document Classification
21 New Graphical Models for Email and Blogs
- Scenario: Predict which friends might be interested in your new blog entry
[Figure: plated factor graph. Legend: function, random variable, N replications. Variables: predicted recipient y; x_b in a plate of size N_b; x_s in a plate of size N_s; x_r in a plate of size N_r - 1. Node labels: Body Words, Title Words, Friends discussed (N_r).]
The graph describes the joint distribution of random variables in terms of the product of local functions.
Email model: N_b words in the body, N_s words in the subject, N_r recipients
- New idea: plated factor graphs
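The statement that the graph describes a distribution as a product of local functions can be sketched as follows: each candidate recipient y is scored by multiplying one local function per body word, per subject word, and per already-listed recipient, and the scores are then normalized. The factor tables, names, and the `recipient_posterior` helper are hypothetical toy values for illustration and do not come from the talk.

```python
import math

def make_factor(table, default=1.0):
    """A local function phi(y, value); unseen pairs get a neutral default of 1."""
    return lambda y, v: table.get((y, v), default)

def recipient_posterior(candidates, body, subject, others, phi_b, phi_s, phi_r):
    """p(y | x_b, x_s, x_r), proportional to a product of local functions."""
    log_scores = {}
    for y in candidates:
        s = 0.0
        s += sum(math.log(phi_b(y, w)) for w in body)     # N_b body-word factors
        s += sum(math.log(phi_s(y, w)) for w in subject)  # N_s subject-word factors
        s += sum(math.log(phi_r(y, r)) for r in others)   # N_r - 1 recipient factors
        log_scores[y] = s
    # Normalize in log space for numerical stability.
    m = max(log_scores.values())
    z = sum(math.exp(s - m) for s in log_scores.values())
    return {y: math.exp(s - m) / z for y, s in log_scores.items()}

# Hypothetical factor tables: values above 1 mean "this pairing is likely".
phi_b = make_factor({("justin", "mushroom"): 4.0})
phi_s = make_factor({("justin", "hike"): 2.0})
phi_r = make_factor({("justin", "anna"): 1.5})

posterior = recipient_posterior(
    candidates=["justin", "sam"],
    body=["found", "mushroom"],
    subject=["hike"],
    others=["anna"],
    phi_b=phi_b, phi_s=phi_s, phi_r=phi_r,
)
print(posterior)
```

Each plate in the figure corresponds to one of the three sums: the same local function is replicated once per word or per recipient.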
22 Detect Quality Content and Encourage Knowledge Contributions
23 Conclusions, Present and Future Work
- WikiGIS: Merged Blogs, Blikis and Wikis with Microsoft Virtual Earth
- Merge the SenseCam with a smart phone
- Enable Intelligent Digital Assistants
- Output to the television
- Next steps: Location and object recognition enabling information retrieval
- Other uses: Assistive technology for the elderly
24 References: Results so Far
- with Charles Sutton and Andrew McCallum. Sparse Forward-Backward using Minimum Divergence Beams for Fast Training of Conditional Random Fields. In proceedings of ICASSP 2006.
- with Michael Kelm and Andrew McCallum. Combining Generative and Discriminative Methods for Pixel Classification with Multi-Conditional Learning. To appear in the proceedings of ICPR 2006.
- with Andrew McCallum, Greg Druck and Xuerui Wang. Multi-Conditional Learning: Generative/Discriminative Training for Clustering and Classification. To appear in the proceedings of AAAI 2006.
- CC Prediction with graphical models. To appear in the proceedings of CEAS 2006.