Title: Kim, Seitz, Agrawala
1 Video-Based Document Tracking: Unifying Your Physical and Electronic Desktops
- Jiwon Kim, Steven M. Seitz, Maneesh Agrawala
- University of Washington, Microsoft Research
2 Motivation
6 Unifying physical and electronic desktops
- Record video of paper on physical desktop
- Tracking
- Recognition
- Linking
(Diagram labels: Video camera, Desktop)
11 Applications
- Find lost document
- Browse remote desk
- Find electronic version
- History-based queries
(Diagram labels: Video camera, Desktop)
12 Example Input Video
13 Demo: Remote Desktop
15 Related Work
Self-Organizing Desk [Rus et al. 97]
DigitalDesk [Wellner 93]
17 Related Work
- Interactive desktops
- Augmented paper
CyberCode [Rekimoto et al. 00]
PADD [Guimbretière 03]
18 Related Work
- Interactive desktops
- Alternative media
- Object tracking & recognition
SIFT [Lowe 04]
19 System Overview
(Diagram labels: Video camera, Computer, User, Desk)
24 System Overview
(Diagram labels: Images from PDF; Video of desk; Track & recognize; Internal representation; Scene Graph; T, T1)
26 System Overview
(Diagram labels: Where is my W-2?; Answer; Images from PDF; Video of desk; Track & recognize; Internal representation; Desk, Desk; T, T1)
29 Tracking & Recognition
(Diagram label: Event)
32 Event Types
- Move
- Entry
- Exit
(Figure labels: before, after)
35 Tracking & Recognition
(Diagram labels: Event; Desk, Desk; sanders01.pdf, lowe04sift.pdf, tut-article.pdf, objectspaces.pdf, kidd94.pdf)
37 Assumptions
- Document
  - Corresponding electronic copy exists
  - No duplicates of same document
- Motion
  - 3 event types: move/entry/exit
  - One document at a time
  - Only topmost document can move
39 Non-Assumptions
- Desk need not be initially empty
- Stacks may overlap
45 Algorithm Overview
- Input Frames
- Event Detection: before and after frames
- Event Interpretation: "A document moved from (x1,y1) to (x2,y2)"
- Document Recognition: File1.pdf, File2.pdf, File3.pdf
- Scene Graph Update: Desk, Desk
51 Event Detection
(Plot: Frame differences vs. time)
52 Event Detection
(Plot labels: Image motion vs. time, with Threshold; Motion Frames and Event Frames vs. time)
53 Event Detection
(Figure labels: Motion Frames; before, after)
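The event detection slides can be read as: compute a difference between consecutive frames, threshold it to find motion frames, and take the still frames on either side of each run of motion as the before and after images. Below is a minimal sketch of that reading. The mean-absolute-difference measure, the `detect_events` name, and the threshold value are all assumptions for illustration, not details taken from the slides.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale image as rows of pixel intensities


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


def detect_events(frames: List[Frame], threshold: float) -> List[Tuple[int, int]]:
    """Return one (before_index, after_index) pair per detected event.

    The transition from frame i to frame i + 1 counts as motion when the
    frame difference exceeds the threshold. Each maximal run of motion is
    one event: 'before' is the frame just ahead of the run and 'after' is
    the frame just past it.
    """
    events = []
    start = None  # index of the 'before' frame of the event in progress
    for i in range(len(frames) - 1):
        moving = frame_difference(frames[i], frames[i + 1]) > threshold
        if moving and start is None:
            start = i
        elif not moving and start is not None:
            events.append((start, i))
            start = None
    if start is not None:  # video ended while motion was still under way
        events.append((start, len(frames) - 1))
    return events


# Toy 2x2 "video": a still desk, a hand passing over it, then a new still state.
still = [[0, 0], [0, 0]]
hand = [[255, 255], [0, 0]]
moved = [[0, 255], [0, 0]]
events = detect_events([still, still, hand, moved, moved, moved], threshold=10.0)
print(events)  # [(1, 3)]
```

Frame 1 is the last frame before the motion and frame 3 is the first frame of the new still state, so those two would be passed on to event interpretation.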
56 Event Interpretation
1. Move vs. Entry/Exit
(Figure labels: before, after; Move, Entry, Exit)
57 Event Interpretation
2. Entry vs. Exit
(Figure labels: before, after; Move, Entry, Exit)
62 Move vs. Entry/Exit
(Figure labels: before, after)
63 Event Interpretation
- Use SIFT [Lowe 04]
  - Rotation- and scale-invariant
  - Highly distinctive (128-dimensional vector)
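To make the "highly distinctive" point concrete, here is a sketch of how descriptors from the before and after frames might be paired up. It uses the nearest-neighbor ratio test described in Lowe 04; that the system applies this exact test, the 0.8 ratio, and the `match_descriptors` name are assumptions. The toy vectors are 2-D for readability, but the same code works on 128-dimensional descriptors.

```python
import math
from typing import List, Sequence, Tuple


def match_descriptors(
    query: Sequence[Sequence[float]],
    reference: Sequence[Sequence[float]],
    ratio: float = 0.8,
) -> List[Tuple[int, int]]:
    """Pair each query descriptor with its nearest reference descriptor.

    A pair is kept only if the nearest neighbor is clearly closer than the
    second nearest (distance ratio below `ratio`), which discards ambiguous
    matches.
    """
    matches = []
    if len(reference) < 2:
        return matches  # the ratio test needs two candidates
    for qi, q in enumerate(query):
        dists = sorted((math.dist(q, r), ri) for ri, r in enumerate(reference))
        (d1, best), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


reference = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
query = [(0.1, 0.0), (5.0, 5.0)]  # second point is equidistant from all three
print(match_descriptors(query, reference))  # [(0, 0)]
```

The second query descriptor is rejected because no single reference descriptor stands out as its match.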
70 Move vs. Entry/Exit
Motion (x,y,θ)
(Figure labels: before, after)
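Once features are matched between the before and after frames, a document's motion can be summarized as a translation plus a rotation angle (written θ here). The sketch below recovers those three numbers from matched point pairs with a closed-form least-squares fit. This is a standard 2-D rigid alignment and not necessarily the estimator the system uses; the `estimate_rigid_motion` name is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_rigid_motion(
    before_pts: List[Point], after_pts: List[Point]
) -> Tuple[float, float, float]:
    """Least-squares (tx, ty, theta) such that after ~= R(theta) * before + t."""
    n = len(before_pts)
    if n < 2 or n != len(after_pts):
        raise ValueError("need at least two matched point pairs")

    # Centroids of both point sets.
    bcx = sum(x for x, _ in before_pts) / n
    bcy = sum(y for _, y in before_pts) / n
    acx = sum(x for x, _ in after_pts) / n
    acy = sum(y for _, y in after_pts) / n

    # Accumulate dot and cross products of the centered point pairs.
    s_cos = 0.0
    s_sin = 0.0
    for (bx, by), (ax, ay) in zip(before_pts, after_pts):
        px, py = bx - bcx, by - bcy
        qx, qy = ax - acx, ay - acy
        s_cos += px * qx + py * qy
        s_sin += px * qy - py * qx

    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated 'before' centroid onto the 'after' centroid.
    tx = acx - (c * bcx - s * bcy)
    ty = acy - (s * bcx + c * bcy)
    return tx, ty, theta


# Three corners of a page rotated by 90 degrees and shifted by (2, 3).
before_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
after_pts = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]
tx, ty, theta = estimate_rigid_motion(before_pts, after_pts)
```

With real matches some pairs will be wrong, so a robust wrapper (for example, fitting on random subsets and keeping the fit with the most agreeing pairs) would likely be needed around this step.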
72 Document Recognition
- Match against PDF image database
(Figure labels: File1.pdf, File2.pdf, File3.pdf, File4.pdf, File5.pdf, File6.pdf)
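One simple way to picture matching against the PDF image database is as a vote: each feature seen on the desk votes for the database page that holds its nearest descriptor, and the page with the most votes wins. The sketch below illustrates that idea only. The voting scheme, the `recognize` function, and the `min_votes` cutoff are assumptions, and the tiny 2-D descriptors stand in for real feature vectors.

```python
import math
from collections import Counter
from typing import Dict, Optional, Sequence

Descriptor = Sequence[float]


def recognize(
    query: Sequence[Descriptor],
    database: Dict[str, Sequence[Descriptor]],
    min_votes: int = 2,
) -> Optional[str]:
    """Return the database file whose descriptors attract the most votes.

    Returns None when no file collects at least `min_votes` votes, which
    guards against declaring a match on thin evidence.
    """
    votes: Counter = Counter()
    for q in query:
        best_name, best_dist = None, math.inf
        for name, descriptors in database.items():
            for d in descriptors:
                dist = math.dist(q, d)
                if dist < best_dist:
                    best_name, best_dist = name, dist
        if best_name is not None:
            votes[best_name] += 1
    if not votes:
        return None
    name, count = votes.most_common(1)[0]
    return name if count >= min_votes else None


database = {
    "File1.pdf": [(0.0, 0.0), (1.0, 1.0)],
    "File2.pdf": [(10.0, 10.0), (11.0, 11.0)],
}
observed = [(0.1, 0.0), (1.0, 0.9), (10.0, 10.2)]
print(recognize(observed, database))  # File1.pdf
```

A linear scan like this is fine for a database of a few hundred pages; a larger collection would call for an approximate nearest-neighbor index.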
75 Document Recognition
- Performance analysis
  - Tested 20 pages against database of 162 pages
  - 200x300 pixels per document for reliable match
(Plot: Recognition Rate vs. Document Resolution; marked values 0.9 and 300)
79 Scene Graph Update
Motion (x,y,θ)
(Figure labels: before, after; Desk, Desk)
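The scene graph update can be illustrated with a small data structure that records where each document lies and which documents sit on top of which. The sketch below is a simplification under stated assumptions: documents are axis-aligned rectangles (rotation is ignored), stacking is a single bottom-to-top list, and the `SceneGraph` class and its method names are hypothetical. It does enforce two assumptions from the slides: no duplicate documents, and only a topmost document can move or exit.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlaps(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


class SceneGraph:
    def __init__(self) -> None:
        self.rects: Dict[str, Rect] = {}
        self.order: List[str] = []  # bottom-to-top stacking order

    def covered_by(self, name: str) -> List[str]:
        """Documents that lie above `name` and overlap it."""
        idx = self.order.index(name)
        return [o for o in self.order[idx + 1:]
                if overlaps(self.rects[o], self.rects[name])]

    def _require_topmost(self, name: str) -> None:
        if self.covered_by(name):
            raise ValueError(f"{name} is covered and cannot move or exit")

    def apply_entry(self, name: str, rect: Rect) -> None:
        if name in self.rects:
            raise ValueError(f"{name} is already on the desk")
        self.rects[name] = rect
        self.order.append(name)  # a new document lands on top

    def apply_exit(self, name: str) -> None:
        self._require_topmost(name)
        self.order.remove(name)
        del self.rects[name]

    def apply_move(self, name: str, new_rect: Rect) -> None:
        self._require_topmost(name)
        self.order.remove(name)
        self.order.append(name)  # a moved document ends up on top
        self.rects[name] = new_rect


desk = SceneGraph()
desk.apply_entry("File1.pdf", (0, 0, 2, 3))
desk.apply_entry("File2.pdf", (1, 1, 3, 4))      # partly covers File1.pdf
desk.apply_move("File2.pdf", (5, 5, 7, 8))       # uncovers File1.pdf
desk.apply_move("File1.pdf", (4, 4, 6, 7))       # now lies on top of File2.pdf
```

A query such as "Where is my W-2?" would then reduce to looking up the document's rectangle and calling `covered_by` to list what is stacked on top of it.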
80 Results
- Input video
  - 40 minutes
  - 1024x768 at 15 fps
  - 22 documents, 49 events
- Running time
  - Video processed offline
  - No optimization
  - A few hours for entire video
81 Demo: Paper tracking
83 Photo Sorting Example
84 Demo: Photo Sorting
85 Future Work
- Enhance realism
- More applications
86 Future Work
- Enhance realism
  - Handle more realistic desktops
87 Moving a stack of documents
88 Documents with no electronic versions
89 Future Work
- Enhance realism
  - Handle more realistic desktops
  - Real-time performance
94 Future Work
- More applications
  - Support other document tasks
    - E.g., attach reminder, cluster documents
  - Beyond documents
95 Acknowledgments