Title: Composition-Guided Image Acquisition
1 Composition-Guided Image Acquisition
Serene Banerjee Ph.D. Defense, April 28th, 2004
http://www.ece.utexas.edu/serene
Committee Members:
- Prof. Ross Baldick
- Prof. Alan C. Bovik
- Prof. Brian L. Evans (Advisor)
- Prof. Wilson S. Geisler
- Prof. Joydeep Ghosh
- Prof. Robert W. Heath, Jr.
Computer Engineering Curriculum Track, Dept. of Electrical and Computer Engineering, The University of Texas at Austin
2 One day Alice came to a fork in the road and saw a Cheshire cat in a tree. "Which road do I take?" she asked. "Where do you want to go?" was his response. "I don't know," Alice answered. "Then," said the cat, "it doesn't matter."
Lewis Carroll, Alice in Wonderland
3 Outline
- Introduction
- Motivation
- Overview of contributions
- Summary of previous research for main subject detection
- Contributions
- Online main subject detection
- Aesthetic enhancements, given main subject
- Blur background objects merging with main subject
- Conclusions
4 Motivation
- Problem: Amateur photographers take unappealing pictures (e.g. for personal and business use)
- Help users take better pictures with digital cameras
5 Enhance Picture Appeal
- Improving photograph appeal [Savakis, Etz & Loui 2000]
- Photographic composition
- Objective measures
- People/expression
- Examples of photographic composition rules
6 Enhance Acquired Picture Appeal
- Goal: Provide well-composed alternative pictures during image acquisition in digital still cameras
- Solution: Framework for in-camera automation of photographic composition rules
- Acquire picture user intended to take
- Locate main subject by combining optical and digital image processing on a supplementary picture
- Apply composition rules to user-intended picture
- Place main subject according to rule-of-thirds
- Blur entire background given main subject location
- Blur background objects that merge with main subject
- User takes intended picture and framework also returns three alternative pictures
7 Offline Main Subject Detection
- Neural network based training [Luo, Etz, Singhal & Gray 2000-2001]
- Cluster multi-level wavelet coefficients [Wang et al. 1999-2001]
- Iterative classification from variance maps [Won, Pyun & Gray 2002]
Algorithm | Training complexity | Runtime complexity
Neural network | Difficult to form widely applicable training set | High (e.g. feature extraction, grouping)
Wavelet-based | No training required | High (e.g. wavelet, k-means clustering)
Variance-based | No training required | High (e.g. iterations, watershed)
8 Automating Composition Rules
- In-camera online framework
- Provide alternatives to user during image acquisition
- One-pass low-complexity algorithms [Banerjee & Evans 2003-04]
- Independent of scene content and setting
- Amenable to fixed-point implementation
- Match processing on digital still cameras
[Block diagram. Inputs: original color image, supplementary picture. Stages: detect main subject; rule-of-thirds; background blur; mitigate merger. Outputs: generated picture with rule-of-thirds; generated picture with blur; generated picture without mergers.]
9 Digital Still Cameras
- Convert optical image to electrical signal
- Software control
- Shutter aperture and speed
- Focus
- Zoom
- White balance
- Additional hardware could control
- Camera angle
- Aspect ratio landscape or portrait
10 Outline
- Introduction
- Contributions
- Online main subject detection
- In-camera segmentation of the main subject
- Low-complexity one-pass algorithm
- Amenable to implementation in digital still cameras
- Aesthetic enhancement, given main subject
- Mitigation of mergers with background objects
- Conclusions
11 Online Main Subject Detection
Contribution 1
- Auto-focus main subject
- Take supplementary picture
- Open shutter aperture (takes 1 s) to blur objects not in focus
- In-focus edges stronger than out-of-focus edges
- Process supplementary picture to find main subject mask
- Enhance in-focus edges
- Detect strong edges
- Close boundary
[Flow diagram: Scene -> Auto-focus filter -> Open shutter for blur -> Supplementary picture -> Compute intensity -> 3x3 highpass filter -> Detect sharper edges -> Close boundary -> Binary main subject mask]
12 Main Subject Detection Formulation
Contribution 1
- Supplementary picture has intensity function, I
- I_H and I_L are highpass and lowpass versions of I
- For background image, contribution from I_L is greater
- Goal: Identify pixels contributing high frequencies
- I is modeled as mixture of I_H and I_L
- Highpass filtering of I enhances main subject edges
[Equation not preserved in the source] where k ≥ 1
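For reference, the textbook highboost identity is consistent with the bullets above. It is offered only as an assumption about the general form of the equation that this slide lost, not as a recovery of it; the symbol I_HB (highboost output) is introduced here for illustration.

```latex
I = I_L + I_H, \qquad
I_{HB} = k\,I - I_L = (k-1)\,I + I_H, \qquad k \ge 1
```

With k = 1 the highboost output reduces to the plain highpass image I_H; larger k adds back a scaled copy of the original.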
13 Step 1: Enhance In-focus Edges
Contribution 1
- Subtract smoothed image from sharpened one
- Strong edges in main subject, weak edges in background
[Block diagram: the lowpass image is subtracted (summing node Σ, minus input) from the highboost image of the supplementary image, giving an edge-enhanced image with stronger main subject edges]
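A minimal sketch of Step 1, assuming a 3x3 box filter as the lowpass stage and a highboost gain of k = 2. Both choices are illustrative (the slides specify only a 3x3 filter), and the function names are hypothetical.

```python
def box_blur3(img):
    """3x3 mean filter with edge replication; img is a list of rows of floats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # replicate border rows
                    xx = min(max(x + dx, 0), w - 1)  # replicate border columns
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def enhance_edges(img, k=2.0):
    """Highboost image (k*I - lowpass) minus the lowpass image (assumed form)."""
    low = box_blur3(img)
    h, w = len(img), len(img[0])
    return [[(k * img[y][x] - low[y][x]) - low[y][x] for x in range(w)]
            for y in range(h)]


# Flat regions give zero response; a step edge gives a strong one.
step = [[0.0, 0.0, 10.0, 10.0] for _ in range(4)]
enhanced = enhance_edges(step)
```

With k = 2 the output is twice the highpass residual, so blurred (out-of-focus) regions stay near zero while sharp transitions stand out.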
14 Step 2: Detect Strong Edges
Contribution 1
- Canny edge detector detects strong edges [Canny 1986]
- Selects weak edges only if they are connected to strong edges
- Laplacian of Gaussian detector [Burt & Adelson 1983]
- Selects edges based on zero crossings of second derivative
- Either detects weak and strong edges or eliminates weak edges from main subject (depends on threshold)
[Figure panels: Laplacian of Gaussian; Canny edge detector]
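The "weak edges only if connected to strong edges" rule is the hysteresis step of the Canny detector. The sketch below shows only that step on a precomputed gradient-magnitude map; it is not a full Canny implementation, and the thresholds and function name are illustrative assumptions.

```python
from collections import deque


def hysteresis(mag, low, high):
    """Keep pixels >= high, plus pixels >= low that are 8-connected to them."""
    h, w = len(mag), len(mag[0])
    keep = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the search with every strong pixel.
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                keep[y][x] = True
                queue.append((y, x))
    # Grow outward through weak pixels that touch an already kept pixel.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                yy, xx = y + dy, x + dx
                if (0 <= yy < h and 0 <= xx < w
                        and not keep[yy][xx] and mag[yy][xx] >= low):
                    keep[yy][xx] = True
                    queue.append((yy, xx))
    return keep


# The weak pixel next to the strong one survives; the isolated weak pixel does not.
edges = hysteresis([[9, 4, 0, 4]], low=3, high=8)
```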
15 Step 3: Generate Mask
Contribution 1
- Goal: Generate closed contour from strong edges
- Gradient vector flow [Xu, Yezzi & Prince 2001]
- Balances forces
- Internal: spline characteristics
- External: normal of gradient of detected strong edges
- Outer boundary of detected sharp edges is initial contour
- Change shape of initial contour, depending on gradient
- Approximate lower complexity method
- Select leftmost and rightmost ON pixel and make row pixels in between them ON
- Can detect convex regions but fails at concavities
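The approximate lower complexity method can be sketched as a single pass over the rows of the strong-edge map. The function name and the list-of-rows representation are assumptions made for illustration.

```python
def fill_rows(edges):
    """For each row, turn ON every pixel between the leftmost and rightmost ON pixel."""
    mask = []
    for row in edges:
        on = [x for x, v in enumerate(row) if v]
        if on:
            left, right = on[0], on[-1]
            mask.append([1 if left <= x <= right else 0 for x in range(len(row))])
        else:
            mask.append([0] * len(row))  # no edge pixels in this row
    return mask


edge_map = [
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
]
mask = fill_rows(edge_map)
```

Because every row is filled as one solid span, any concavity that opens to the left or right is filled in as well, which is the limitation noted in the last bullet.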
16 Main Subject Detection Results
Contribution 1
[Figure panels: Supplementary image; Step 1: Edge map; Step 2: Strong edge detection; Step 3a: Gradient of strong edges; Step 3b: Gradient vector flow field; Step 3c: Initial contour; Step 3d: Contour after 5 iterations (not mandatory); Main subject mask]
17 Implementation Complexity
Contribution 1
- Per-pixel complexity for algorithm [Banerjee & Evans 2003-04]
- Multi-level wavelet based [Wang, Lee, Gray & Wiederhold 1999-2001]
- Variance of multi-level wavelet coefficients: 2X increase
- k-means clustering: 2(image size)(no. of iterations)X increase
- Iterative classification from variance maps [Won et al. 2002]
- Iterative maximum a posteriori segmentation: 3X increase
- Watershed refinement: 6 passes per pixel
Process/Operation Multiply-accumulates Compares Memory accesses
Pre-filtering (3x3) 9
Edge detection 9 2 5
Close boundary 2 1
18 Comparison With Previous Methods
Contribution 1
[Figure panels: Original image; Proposed algorithm [Banerjee & Evans 2003-4]]
19 Limitations
Contribution 1
- Frequency-based features not applicable if
- Main subject does not have enough high frequencies
- Background not blurry enough
- Could incorporate region-based features
Example of an image where the proposed algorithm fails to detect the main subject, the flower
20 Outline
- Introduction
- Contributions
- Main subject detection
- Aesthetic enhancement, given main subject
- Reposition main subject to follow rule-of-thirds
- Simulate background blur for motion or clarity
- Mitigation of mergers with background objects
- Conclusions
21 Rule-of-Thirds
Contribution 2
- Better interaction of main subject with image background
- Center of mass of main subject at 1/3 or 2/3 picture width (or height) from the left (or top) edge
Main subject in center of picture
Main subject follows rule-of-thirds
Outdoor setting: the flower is the main subject
22 Rule-of-Thirds Algorithm
Contribution 2
- Compute center-of-mass of main subject
- 2 multiply-accumulates, 1 memory read per pixel
- 1 division per image
- Locate closest one-third corner
- 8 compares per image (4 comparisons of (x,y) points)
- Shift picture so center-of-mass falls at desired corner
- Mirror undefined boundary pixels
- Best case: no change to image
- Worst case: 1/3 rows/columns need to be shifted
- Average (main subject in middle): shift 1/6 rows/columns
- 0 to 2 memory accesses per pixel
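The three steps above (center of mass, closest one-third corner, shift with mirroring) can be sketched as follows. This is an illustrative reading, not the thesis code: the function names, the list-of-rows image representation, and the reflection that does not repeat the edge pixel are all assumptions.

```python
def center_of_mass(mask):
    """Return (cx, cy) of the ON pixels of a binary mask given as a list of rows."""
    n = sx = sy = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                n, sx, sy = n + 1, sx + x, sy + y
    if n == 0:
        raise ValueError("mask has no ON pixels")
    return sx / n, sy / n  # one division per coordinate per image


def nearest_third_point(cx, cy, width, height):
    """Pick the closest of the four one-third intersection points."""
    corners = [(width * i / 3.0, height * j / 3.0) for i in (1, 2) for j in (1, 2)]
    return min(corners, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


def reflect(i, n):
    """Mirror an out-of-range index back into [0, n)."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def apply_rule_of_thirds(img, mask):
    """Shift img so the mask's center of mass lands on the nearest third point."""
    h, w = len(img), len(img[0])
    cx, cy = center_of_mass(mask)
    tx, ty = nearest_third_point(cx, cy, w, h)
    dx, dy = round(tx - cx), round(ty - cy)
    shifted = [[img[reflect(y - dy, h)][reflect(x - dx, w)] for x in range(w)]
               for y in range(h)]
    return shifted, (dx, dy)


# Subject in the middle of a 12x6 picture: the shift is 1/6 of each dimension.
img = [[y * 12 + x for x in range(12)] for y in range(6)]
mask = [[1 if (x, y) == (6, 3) else 0 for x in range(12)] for y in range(6)]
shifted, offset = apply_rule_of_thirds(img, mask)
```

In this example all four third points are equally distant from the center, so the first candidate is chosen; a real implementation might break such ties differently.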
23 Ideal Background Blur Example
Contribution 2
Background blur emphasizes main subject, the shell, and aids in constrained image communication
Indoor setting: no humans in picture
24 Simulated Background Blur
Contribution 2
- Possible camera blurs
- Background blur: shutter aperture
- Linear blur: subject or camera motion
- Radial blur: camera rotation
- Zoom: change in zoom
- Digital alternatives
- Original image masked with detected main subject mask
- Region of interest filtering performed on non-masked pixels
- Complexity: 9 multiply-accumulates and 4 memory accesses per pixel for convolution with symmetric 3x3 filter
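A minimal sketch of region-of-interest filtering for the simulated background blur, assuming a 3x3 mean filter and edge replication. The filter weights are not given on the slide, and the function name is hypothetical. As a simplification, background pixels next to the subject still read subject pixels in their 3x3 window.

```python
def blur_background(img, mask):
    """Apply a 3x3 mean blur where mask is 0; copy main-subject pixels unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                continue  # main subject stays sharp
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


picture = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
subject = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
blurred = blur_background(picture, subject)
```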
25 Results (1)
Contribution 2
Supplementary image with main subject(s) in focus
Detected main subject mask
Rule-of-thirds: main subject repositioned
Simulated background blur
Outdoor setting: human main subject
26 Results (2)
Contribution 2
Supplementary image with main subject(s) in focus
Detected main subject mask
Rule-of-thirds: main subject repositioned
Simulated background blur
Outdoor setting: human main subject
27 Results (3)
Contribution 2
Supplementary image with main subject(s) in focus
Detected main subject mask
Rule-of-thirds: main subject repositioned
Simulated background blur
Indoor setting: no human subjects
28 Outline
- Introduction
- Contributions
- Main subject detection
- Aesthetic enhancement, given main subject
- Mitigation of mergers with background objects
- Framework for background analysis and merger detection
- Low-complexity one-pass algorithm for merger mitigation
- Conclusions
29 Ideal Merger Mitigation Example
Contribution 3
Unwanted mergers avoided
Background bar merges with gymnast's hand
30 Mitigation of Mergers: Overview
Contribution 3
- Goal: Identify background objects merging with main subject
- In-focus background object
- Connected to main subject mask
- Large area relative to image size
- Merger detection
- Color segmentation based on hue
- Identify distracting background object based on distance to main subject and frequency content
- Blur merging background objects to induce a sense of distance
Merging background objects: trees and bush over right shoulder
31 Segmentation of Background Objects
Contribution 3
- Hues above histogram average are dominant hues
- Background is a mixture of dominant hues
- Thresholds: average of two consecutive dominant hues
[Figure: histogram of background hues and identified objects; thresholds at 87 and 151]
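One way to read the thresholding rule is sketched below. It assumes that contiguous above-average histogram bins form one dominant hue, represented by the tallest bin of the run, and it ignores the circular wrap-around of hue; neither detail is stated on the slide. The histogram is synthetic, with made-up peak positions chosen so that the thresholds land on the slide's values of 87 and 151.

```python
def dominant_hue_thresholds(hist):
    """Return (dominant hues, thresholds midway between consecutive dominant hues)."""
    avg = sum(hist) / len(hist)
    peaks, run = [], []
    for hue, count in enumerate(hist):
        if count > avg:
            run.append(hue)  # still inside an above-average run
        elif run:
            peaks.append(max(run, key=lambda h: hist[h]))
            run = []
    if run:  # run that reaches the last bin
        peaks.append(max(run, key=lambda h: hist[h]))
    thresholds = [(a + b) / 2.0 for a, b in zip(peaks, peaks[1:])]
    return peaks, thresholds


# Synthetic 256-bin hue histogram (illustrative data only).
hist = [0] * 256
hist[40], hist[41], hist[134], hist[168] = 100, 50, 80, 90
peaks, thresholds = dominant_hue_thresholds(hist)
```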
32 Merger Object Detection
Contribution 3
- Define Frequency Inverse Distance Measure for each disjoint background object O_i
- Decreases with nearest distance (d_i) from main subject
- Increases with high spatial frequency coefficients (symbol garbled in source: ?iH)
- Merged object: object with highest transform value
33 Measure Selection
Contribution 3
- Linear, division, and exponential forms to combine
- High frequencies computed with residual in Gaussian pyramid decomposition
- Euclidean distance measured from main subject mask
Attribute | Linear | Divisional | Exponential
Computational complexity | Low | High | High
Merged object size | Large | Small | Small
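The slides name three ways of combining frequency content and distance but do not give the formulas. The sketch below uses hypothetical forms that merely satisfy the stated properties (the measure increases with high-frequency energy and decreases with distance); the weights, the offset `eps`, the scale, and all function names are assumptions.

```python
import math


def fidm_linear(h, d, alpha=1.0, beta=1.0):
    """Hypothetical linear form: weighted frequency minus weighted distance."""
    return alpha * h - beta * d


def fidm_division(h, d, eps=1.0):
    """Hypothetical divisional form: frequency divided by (distance + eps)."""
    return h / (d + eps)


def fidm_exponential(h, d, scale=1.0):
    """Hypothetical exponential form: frequency damped exponentially with distance."""
    return h * math.exp(-d / scale)


def pick_merged_object(objects, measure):
    """objects maps a name to (high-frequency energy, nearest distance to subject)."""
    return max(objects, key=lambda name: measure(*objects[name]))


# Illustrative values: a textured nearby tree, a smooth sky, a sharp distant sign.
objects = {"tree": (8.0, 1.0), "sky": (1.0, 0.0), "sign": (9.0, 20.0)}
chosen = [pick_merged_object(objects, f)
          for f in (fidm_linear, fidm_division, fidm_exponential)]
```

The linear form needs only a multiply-accumulate per object, which matches the "Low" complexity entry in the table; division and exponentiation are costlier on fixed-point hardware.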
34 Merger Mitigation Results
Contribution 3
Background tree and bush merging with main subject
Blurred tree and bush appear to be farther away
High frequency and inv. distance values for
background
35 Per-pixel Implementation Complexity
Contribution 3
Process/Operation | Multiply-accumulates | Compares | Memory accesses
RGB to hue | 3 | 6 | 4
Histogram and thresholding | 1 | 2 | -
RGB to intensity | 2 | - | -
Gaussian pyramid | 9 | - | 4
Approx. inv. distance measure | 2 | 1 | 2
Detect merged object | 1 | 1 | -
Gaussian pyramid reconstruction | 9 | 1 | 5
TOTAL | 27 | 11 | 15
For comparison, JPEG compression takes 60 operations/pixel
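The column totals can be checked with a few lines of arithmetic. Rows with fewer than three numbers leave their column placement open in the flattened table; the placement used below is an inference, chosen because it is the one that reproduces the stated totals of 27, 11, and 15.

```python
# (multiply-accumulates, compares, memory accesses) per pixel; 0 marks an empty cell.
ROWS = {
    "RGB to hue": (3, 6, 4),
    "Histogram and thresholding": (1, 2, 0),
    "RGB to intensity": (2, 0, 0),
    "Gaussian pyramid": (9, 0, 4),
    "Approx. inv. distance measure": (2, 1, 2),
    "Detect merged object": (1, 1, 0),
    "Gaussian pyramid reconstruction": (9, 1, 5),
}


def column_totals(rows):
    """Sum each column across all processing steps."""
    return tuple(sum(col) for col in zip(*rows.values()))


totals = column_totals(ROWS)
operations_per_pixel = sum(totals)
```

Under this reading the three columns add up to 53 operations per pixel, below the 60 operations per pixel quoted for JPEG compression.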
36 System Prototype
[Block diagram. Main subject detection path: Scene -> Supplementary image -> Compute intensity -> Grayscale image -> 3x3 highpass filter -> Detect sharper edges -> Close boundary -> Binary main subject mask. Other blocks: Original color image; Color Gaussian pyramid; Intensity Gaussian pyramid; Transform coefficients; Inverse distance transform; Background segmentation; multiplier (X); Detect merging object; Measure how close rule-of-thirds followed; Automate rule-of-thirds; Simulate background blur; Reconstruct color pyramid. Outputs: Generated picture with rule-of-thirds; Generated picture with blur; Merger mitigated picture.]
37 Conclusion
- Contributions
- Combined optical/digital image acquisition
- Provide online feedback to amateur photographers
- Low-complexity one-pass method for main subject detection
- Rule-of-thirds placement of the main subject on the canvas
- Simulated background blur: motion and depth-of-field
- Mitigation of mergers with background objects
- Deliverables
- Prototype development for digital still image acquisition
- Copies of MATLAB code, slides, and papers, available at http://www.ece.utexas.edu/bevans/projects/dsc/index.html
38 Future Work
- Automate other photographic composition rules
- Best zoom
- Available frames, lines of interest, best angle, balanced picture
- Extension for video acquisition
- Frame-by-frame basis
- Compressed domain
- Digital image stabilization: subject mask as feature
- Potential research impact
- Video cameras, surveillance, image/video retrieval, constrained image/video communication, main subject detection for specific applications