Title: Anthony Santella, Maneesh Agrawala, Doug DeCarlo, David Salesin, Michael Cohen
1 Gaze-Based Interaction for Semi-Automatic Photo Cropping
- Anthony Santella, Maneesh Agrawala, Doug DeCarlo, David Salesin, Michael Cohen
2 Problem
- Crop images with minimal effort
[Figure: original image and its crop]
3 Application
4 Application
- Adaptive Documents (Jacobs et al., 2003)
7 Problem
- Crop images with minimal interaction
- Sub-problems
- Identify important image content
- Eye Tracking
- Determine a good composition
8 Identifying Image Content
- What is the subject of this picture?
9 Identifying Image Content: Interactive
- Prior art: draw a box (traditional cropping)
10 Identifying Image Content: Automatic
- Prior art: have the computer guess
- Detect faces
- Suh et al. 2002
- Chen et al. 2003
- Byers et al. 2003
11 Identifying Image Content: Automatic
- Prior art: have the computer guess
- Salience: identify prominent regions (e.g. Itti et al. 1998, Itti and Koch 2000)
- Suh et al. 2002
- Chen et al. 2003
- Setlur et al. 2004
13 Identifying Image Content
- Prior art: salience (Itti and Koch, 2000)
Salience estimate
15 Eye Tracking
- Effortless
- Someday (soonish) ubiquitous
16 Our Approach: Eye Tracking
17 Our Approach: Eye Tracking
Fixation
18 Our Approach: Eye Tracking
- Record of overt attention
- For interaction and evaluation
- Vertegaal, 1999
- Duchowski, 2000
- Crowe et al., 2000
- DeCarlo and Santella, 2004
19 Our Approach
- A viewer examines the image
21 Our Approach
- The system segments the image (Christoudias et al., 2002)
22 Our Goal: Identifying Content
- Relate fixations to segmentation
- Label segments as content or background
Content
23 Our Approach: Labeling Regions
- Assign each region a value indicating how much it was examined
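One way to picture this step, as a minimal sketch rather than the authors' exact measure: sum the durations of the fixations that land in each segment. The function name `region_attention`, the `label_map` layout and the `(x, y, duration)` fixation format are all assumptions made for illustration.

```python
def region_attention(label_map, fixations):
    """Sum fixation durations per segment (illustrative assumption,
    not necessarily the measure used in the talk).

    label_map: 2D list, label_map[y][x] is the segment id of pixel (x, y)
    fixations: iterable of (x, y, duration) tuples in pixel coordinates
    """
    height, width = len(label_map), len(label_map[0])
    # Every segment starts at zero so unexamined regions still get a value.
    scores = {region: 0.0 for row in label_map for region in row}
    for x, y, duration in fixations:
        col, row = int(round(x)), int(round(y))
        # Fixations that fall outside the image are ignored.
        if 0 <= row < height and 0 <= col < width:
            scores[label_map[row][col]] += duration
    return scores
```

A real system would probably spread each fixation over a neighborhood to allow for tracker error; the point-lookup here only keeps the sketch short.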
24 Our Approach: Partial Labeling
- Most viewed 10% of regions are subject
- Least viewed 50% are background
subject
unknown
background
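The thresholding on this slide can be sketched as a ranking over per-region values. This is a hedged illustration: the function name `partial_labels`, the rounding of the cut-offs, and the rule that at least one region is always labeled subject are assumptions, not details from the talk.

```python
def partial_labels(scores, subject_frac=0.10, background_frac=0.50):
    """Label the most viewed regions 'subject', the least viewed
    'background', and everything in between 'unknown'.

    scores: dict mapping region id -> how much the region was examined
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # most viewed first
    n = len(ranked)
    n_subject = max(1, int(n * subject_frac))   # assumption: never zero subjects
    n_background = int(n * background_frac)
    labels = {region: "unknown" for region in ranked}
    for region in ranked[:n_subject]:
        labels[region] = "subject"
    for region in ranked[n - n_background:]:
        labels[region] = "background"
    return labels
```

With ten regions this yields one subject region, five background regions and four left unknown for the next step to resolve.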
25 Our Approach: Graph Cut
- Propagate partial labels to the rest of the image (Lazy Snapping, Li et al., 2004)
subject
background
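To illustrate the idea of propagating labels with a graph cut, here is a small sketch over a region adjacency graph. It is not Lazy Snapping itself: the energy terms of Li et al. are not reproduced, the edge weights are a hypothetical similarity score between neighboring regions, and the max-flow routine is a plain Edmonds-Karp written for readability.

```python
from collections import deque

def propagate_labels(seeds, edges):
    """Resolve 'unknown' regions with a minimum cut (illustrative sketch).

    seeds: dict region -> 'subject' | 'background' | 'unknown'
    edges: dict (a, b) -> positive similarity between adjacent regions
    """
    SRC, SNK = "_source", "_sink"
    hard = 1 + sum(edges.values())  # larger than any cut through soft edges
    cap = {SRC: {}, SNK: {}}

    def add(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # residual (reverse) edge

    for (a, b), w in edges.items():
        add(a, b, w)
        add(b, a, w)
    for region, label in seeds.items():
        cap.setdefault(region, {})
        if label == "subject":
            add(SRC, region, hard)   # seed is tied to the subject side
        elif label == "background":
            add(region, SNK, hard)   # seed is tied to the background side

    def reachable():
        """Breadth-first search over edges with spare capacity."""
        parent = {SRC: None}
        queue = deque([SRC])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable()
        if SNK not in parent:
            break  # no augmenting path left: parent holds the source side
        path, v = [], SNK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= flow
            cap[w][u] += flow

    return {r: "subject" if r in parent else "background" for r in seeds}
```

Regions still connected to the source after the cut are treated as subject; the cut falls across the weakest links between seeded subject and seeded background.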
26 Our Approach: Cropping
- Implement basic rules
- Keep crop tight
- Keep subject
- Avoid cutting through subject
- Avoid cutting background elements
30 Rules
- Keep crop tight: minimize fraction of area retained
- Keep subject: minimize fraction of subject area lost
- Together, these result in a tight, inclusive crop
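These two rules can be written as simple ratios. The sketch below assumes a binary subject mask and a crop given as a half-open rectangle `(x0, y0, x1, y1)`; the function names are hypothetical and the talk does not specify the exact formulas.

```python
def tightness_score(crop, width, height):
    """Fraction of the image area retained by the crop (lower is tighter)."""
    x0, y0, x1, y1 = crop
    return ((x1 - x0) * (y1 - y0)) / (width * height)

def subject_lost_score(crop, subject_mask):
    """Fraction of subject pixels that fall outside the crop.

    subject_mask: 2D list of 0/1, subject_mask[y][x] == 1 for subject pixels
    """
    x0, y0, x1, y1 = crop
    total = sum(sum(row) for row in subject_mask)
    if total == 0:
        return 0.0  # nothing to lose when no subject was found
    kept = sum(sum(row[x0:x1]) for row in subject_mask[y0:y1])
    return 1.0 - kept / total
```

Minimizing the first term alone would shrink the crop to nothing and minimizing the second alone would keep the whole image, which is why the slide pairs them.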
31 Rules
- Avoid cutting subject: minimize amount of subject on boundary
- Avoid cutting background elements: minimize crossings of segmentation boundaries
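The two boundary rules might be measured as below. This is a sketch under assumptions: crops are half-open rectangles `(x0, y0, x1, y1)` at least two pixels wide and tall, the subject is a binary mask, and the function names and normalization are illustrative rather than taken from the talk.

```python
def border_pixels(crop):
    """Set of (x, y) pixels lying on the edge of the crop rectangle."""
    x0, y0, x1, y1 = crop
    pixels = set()
    for x in range(x0, x1):
        pixels.add((x, y0))
        pixels.add((x, y1 - 1))
    for y in range(y0, y1):
        pixels.add((x0, y))
        pixels.add((x1 - 1, y))
    return pixels

def subject_on_boundary(crop, subject_mask):
    """Fraction of crop-edge pixels that belong to the subject."""
    border = border_pixels(crop)
    return sum(subject_mask[y][x] for x, y in border) / len(border)

def segmentation_crossings(crop, label_map):
    """Count segment changes met while walking along each crop edge."""
    x0, y0, x1, y1 = crop
    crossings = 0
    for y in (y0, y1 - 1):  # top and bottom edges
        row = label_map[y][x0:x1]
        crossings += sum(a != b for a, b in zip(row, row[1:]))
    for x in (x0, x1 - 1):  # left and right edges
        col = [label_map[y][x] for y in range(y0, y1)]
        crossings += sum(a != b for a, b in zip(col, col[1:]))
    return crossings
```

A crop edge that runs through the middle of an object changes segment id along its length, so a low crossing count suggests the edge sits in a uniform region.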
34 Optimization
- Weights control the relative importance of each rule
- Exhaustive search minimizes the weighted sum of rule scores to find the crop rectangle that best respects the rules
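A brute-force version of this search can be sketched as follows. For brevity it uses only two of the rule terms (area retained and subject lost); the weights `w_tight` and `w_lost`, the `step` parameter and the function name are hypothetical, and the weights used in the talk are not given.

```python
from itertools import combinations

def best_crop(subject_mask, w_tight=1.0, w_lost=4.0, step=1):
    """Try every axis-aligned rectangle on a grid and keep the cheapest.

    subject_mask: 2D list of 0/1 marking subject pixels
    Returns ((x0, y0, x1, y1), cost) with half-open crop bounds.
    """
    height, width = len(subject_mask), len(subject_mask[0])
    total_subject = sum(map(sum, subject_mask)) or 1  # avoid dividing by zero
    xs = range(0, width + 1, step)
    ys = range(0, height + 1, step)
    best, best_cost = None, float("inf")
    for x0, x1 in combinations(xs, 2):       # every pair with x0 < x1
        for y0, y1 in combinations(ys, 2):   # every pair with y0 < y1
            area = (x1 - x0) * (y1 - y0) / (width * height)
            kept = sum(sum(row[x0:x1]) for row in subject_mask[y0:y1])
            lost = 1.0 - kept / total_subject
            cost = w_tight * area + w_lost * lost  # weighted sum of rule scores
            if cost < best_cost:
                best, best_cost = (x0, y0, x1, y1), cost
    return best, best_cost
```

The number of candidate rectangles grows with the fourth power of the grid resolution, so a coarser `step` is the obvious way to keep the search affordable on full-size photos.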
35-42 Results
[Figures: original images and their crops]
43 Evaluation
44 Evaluation
[Figure panels: Original, Hand made, Salience (Suh et al. 03), Gaze-Based]
45 Experiment
- Exhaustive forced choice
- Eight subjects
- 50 images
- 350 trials per subject
Which image looks better?
49 Evaluation Results
- Preference data not significant
            Original  Salience  Gaze   Hand
Original    -         .511      .439   .266
Salience    .489      -         .416   .339
Gaze        .561      .581      -      .325
Hand        .734      .661      .675   -
- Kendall analysis: hand > gaze-based > salience > original (p < .01)
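As a quick sanity check on the reported ordering, the table can be ranked by each condition's average preference. This is not the Kendall analysis from the slide, only a simple mean, and it assumes each entry is the fraction of trials in which the row condition was preferred over the column condition.

```python
# Pairwise preference values copied from the table; the reading
# "row preferred over column" is an assumption.
preference = {
    "Original": {"Salience": 0.511, "Gaze": 0.439, "Hand": 0.266},
    "Salience": {"Original": 0.489, "Gaze": 0.416, "Hand": 0.339},
    "Gaze":     {"Original": 0.561, "Salience": 0.581, "Hand": 0.325},
    "Hand":     {"Original": 0.734, "Salience": 0.661, "Gaze": 0.675},
}

def mean_preference(table):
    """Average win rate of each condition against all the others."""
    return {name: sum(row.values()) / len(row) for name, row in table.items()}

def rank_conditions(table):
    """Conditions sorted from most to least preferred on average."""
    means = mean_preference(table)
    return sorted(means, key=means.get, reverse=True)

print(rank_conditions(preference))
# ['Hand', 'Gaze', 'Salience', 'Original']
```

The averaged ordering agrees with the one reported on the slide, although the gap between salience and the original is small.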
50 Future Work
[Figure: crops]
51 Future Work
- Eye tracking in context
- Real environment
- Range of real tasks
52 Future Work
- Limitations of eye tracking: small things
55 Future Work
- Quantitative measures of composition and design are an interesting area of research
- Strong future potential for implicit interaction with images
56 Thank You
- Deep thanks to B. Suh, H. Ling, B. Bederson and D. Jacobs for access to their salience cropping system
- Thanks also to Eileen Kowler, Mary Czerwinski, Ed Cutrell and to Phillip Greenspun for several photos
- This research is partially supported by the NSF through grant HLC 0308121