Title: Local Affine Feature Tracking in Films/Sitcoms
1Local Affine Feature Tracking in Films/Sitcoms
- Chunhui Gu
- CS 294-6
- Final Presentation
- Dec. 13, 2006
2Objective
- Automatically detect and track local affine
features in film/sitcom frame sequences.
- Current dataset: Sex and the City
- Why sitcom?
- Simple daily environment
- Few or no special effects
- Repeated scenes
3Outline
- Preprocessing
- Tracking Algorithm
- Pairwise local matching
- Robust features
- Feature Matching across Shots
- Results
- Feature matching vs baseline color histogram
- Time complexity
- When does tracking fail
4Preprocessing
Frame Extraction
Shot Detection
MSER Interest Point Detection
SIFT Feature Extraction
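The slides do not say how shot detection is done beyond its use of a color histogram with B = 16 bins. A minimal sketch of one common approach follows, assuming a cut is declared when the L1 distance between the per-channel histograms of consecutive frames exceeds a threshold. The function names, the frame representation (a list of (r, g, b) tuples), and the threshold value are illustrative assumptions, not the author's implementation.

```python
def color_histogram(pixels, bins=16):
    """Per-channel histogram of (r, g, b) pixels with values in 0..255.

    Each channel's counts are normalized to sum to 1.
    """
    hist = [0.0] * (3 * bins)
    width = 256 // bins
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // width, bins - 1)] += 1.0
    total = float(len(pixels)) or 1.0
    return [count / total for count in hist]


def detect_shot_boundaries(frames, bins=16, threshold=0.5):
    """Return the indices of frames that start a new shot.

    The threshold is an assumed value; it would need tuning on real footage.
    """
    boundaries = []
    previous = None
    for index, pixels in enumerate(frames):
        current = color_histogram(pixels, bins)
        if previous is not None:
            # L1 distance between consecutive histograms.
            distance = sum(abs(a - b) for a, b in zip(current, previous))
            if distance > threshold:
                boundaries.append(index)
        previous = current
    return boundaries
```

Each frame is visited once and compared only with its predecessor, which fits the linear dependence on the number of frames N.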
5Tracking Algorithm
Frame i
Frame j = i+1
6Tracking Algorithm
Frame i
Frame j = i+1
7Tracking Algorithm
Frame i
Frame j = i+1
Thresholding on both minimum distance and ratio
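The pairwise matching step with both thresholds might look like the sketch below. It assumes each feature is a tuple (x, y, descriptor), that the local window is a spatial search region around the feature's position in frame i, and that distances are Euclidean. The name `match_feature` and the threshold values are hypothetical; the slides do not give them.

```python
import math


def match_feature(feature, candidates, window=15.0, max_dist=0.4, max_ratio=0.8):
    """Return the index of the best match in the next frame, or None.

    A candidate is accepted only if (1) it lies inside the local window,
    (2) its descriptor distance is at most max_dist, and (3) that distance
    is clearly smaller than the second-best distance (ratio test).
    """
    x, y, desc = feature
    scored = []
    for index, (cx, cy, cdesc) in enumerate(candidates):
        if abs(cx - x) > window or abs(cy - y) > window:
            continue  # outside the local window
        scored.append((math.dist(desc, cdesc), index))
    if not scored:
        return None
    scored.sort()
    best_dist, best_index = scored[0]
    if best_dist > max_dist:
        return None  # fails the minimum-distance threshold
    if len(scored) > 1 and best_dist >= max_ratio * scored[1][0]:
        return None  # ambiguous: second-best candidate is almost as close
    return best_index
```

Using both tests rejects matches that are either too far in descriptor space or not distinctive enough among the candidates in the window.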
8Tracking Algorithm
Frame i
Frame j = i+1
9Tracking Algorithm
Frame i
Frame j = i+1
10Tracking Algorithm
- Problem of Pairwise Matching
- Sensitive to occlusion and feature misdetection
- Solutions
- Use multiple overlapping windows
- Backward Matching
- Match features in current frame to features in
all previous frames within the shot
- Pruning process (reduce computation time)
- Select a proportion of features that have longer
tracking length as robust features
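Backward matching and robust-feature selection can be sketched as follows. This is a simplified illustration under stated assumptions: each frame is a list of descriptors, a feature is compared with the latest descriptor of every track started earlier in the shot, and the pruning step is omitted because the slides do not describe it. The names `track_shot` and `select_robust`, the distance threshold, and the default proportion are hypothetical.

```python
import math


def track_shot(frames, max_dist=0.4):
    """Build feature tracks for one shot.

    frames: list of frames, each a list of descriptors.
    Returns a list of tracks; each track is a list of (frame_index, descriptor).
    Because every existing track is a candidate, a track can resume after
    the feature was occluded or missed in intermediate frames.
    """
    tracks = []
    for frame_index, descriptors in enumerate(frames):
        used = set()  # tracks already extended (or created) in this frame
        for desc in descriptors:
            best = None
            for track_id, track in enumerate(tracks):
                if track_id in used:
                    continue
                dist = math.dist(desc, track[-1][1])
                if dist <= max_dist and (best is None or dist < best[0]):
                    best = (dist, track_id)
            if best is None:
                tracks.append([(frame_index, desc)])
                used.add(len(tracks) - 1)
            else:
                tracks[best[1]].append((frame_index, desc))
                used.add(best[1])
    return tracks


def select_robust(tracks, proportion=0.25):
    """Keep the given proportion of tracks with the longest tracking length."""
    ranked = sorted(tracks, key=len, reverse=True)
    keep = max(1, round(proportion * len(ranked))) if ranked else 0
    return ranked[:keep]
```

Comparing against every earlier track is what makes the unpruned version expensive, which is presumably why the slides add a pruning step.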
11Shot grouping/Scene Retrieval
12Inter-Shot Matching
Shot I
Shot J
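One way to score a pair of shots from their robust features is sketched below. It assumes each shot is represented by a list of robust descriptors and that similarity is the fraction of descriptors in shot I with a close match in shot J. The scoring rule, the threshold, and the function names are illustrative assumptions; the slides do not specify how the shot-pair score is computed.

```python
import math


def shot_similarity(features_i, features_j, max_dist=0.4):
    """Fraction of robust descriptors in shot I with a close match in shot J."""
    if not features_i or not features_j:
        return 0.0
    matched = 0
    for desc_i in features_i:
        nearest = min(math.dist(desc_i, desc_j) for desc_j in features_j)
        if nearest <= max_dist:
            matched += 1
    return matched / len(features_i)


def similarity_matrix(shots, max_dist=0.4):
    """Score every ordered pair of shots; entry [i][j] compares shot i to shot j."""
    count = len(shots)
    return [
        [shot_similarity(shots[i], shots[j], max_dist) for j in range(count)]
        for i in range(count)
    ]
```

Each pair compares every descriptor in one shot with every descriptor in the other, and the matrix covers all shot pairs, so the cost grows with both the number of shots and the number of robust features per shot.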
13Confusion Table
14ROC
15When Does Tracking Fail?
- Tracking feature outside local window
- Rare during continuous tracking
- Happens when occlusion occurs
- Same feature splitting into two or more groups
- Long occlusion
- Multiple matching in a single frame
Frame i
Frame j = i+1
16Computation Complexity
- Everything except the MSER and SIFT algorithms
is implemented in Matlab (slow)
Stage                   Complexity    Time
Frame Extraction        O(N)          0.3 s/frame
Shot Detection          O(N f(B))     0.07 s/frame (B = 16)
MSER Detection          O(N)          0.3 s/frame
SIFT Detection          O(N)          0.9 s/frame
Feature Tracking        O(NFWL)       0.5 s/frame
Matching across shots   O(S^2 T^2)    1 s/shot pair
N: number of frames (30,000)
B: number of bins for color histogram (16)
F: average number of features per frame (400)
W: local window size (15)
L: tracking length (20)
T: average number of robust trackers per shot (300)
S: number of shots (35)
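The timings above can be combined into a rough end-to-end estimate. The sketch below assumes the per-frame stages run one after another on every frame and that each unordered pair of shots is matched once; both are assumptions, since the slides give only per-stage figures. The names are illustrative.

```python
# Per-stage timings taken from the table (seconds per frame).
PER_FRAME_SECONDS = {
    "frame_extraction": 0.3,
    "shot_detection": 0.07,
    "mser_detection": 0.3,
    "sift_detection": 0.9,
    "feature_tracking": 0.5,
}
SECONDS_PER_SHOT_PAIR = 1.0


def estimate_runtime_hours(num_frames=30000, num_shots=35):
    """Estimated total processing time in hours under the stated assumptions."""
    per_frame_total = num_frames * sum(PER_FRAME_SECONDS.values())
    shot_pairs = num_shots * (num_shots - 1) // 2  # unordered pairs (assumed)
    pair_total = shot_pairs * SECONDS_PER_SHOT_PAIR
    return (per_frame_total + pair_total) / 3600.0
```

With N = 30,000 and S = 35 this comes to a little over 17 hours, almost all of it spent in the per-frame stages; matching across shots contributes only about ten minutes.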
17Conclusion
- We successfully implemented local affine feature
tracking for the sitcom Sex and the City. The
tracking method is robust to occlusion and
feature misdetection.
- Although there is no quantitative precision/recall curve
(ground truth is hard to obtain), the demonstration
suggests that precision is almost perfect, with good
recall.
- We show one successful application: using
robust features to associate similar shots
for scene retrieval.
18Future Work
- Implement the algorithm in real time (C/C++)
- Search unique shots in films/sitcoms
- Separate indoor scenes from outdoor scenes
- Determine context of the scene
19Acknowledgement