Title: Evaluating the Quality of Image Synthesis and Analysis Techniques
1Evaluating the Quality of Image Synthesis and
Analysis Techniques
- Matthew O. Ward
- Computer Science Department
- Worcester Polytechnic Institute
2Evaluation is Important
- Has my modification improved the results?
- Which method works best for my data?
- What are limitations of my technique?
- Is my method better than XXX?
3Evaluation is Difficult
- What aspects to test?
- How to measure?
- What are limitations of evaluation procedure?
- How to recruit evaluators?
4Evaluation Often Avoided
- Majority of papers show no substantive evaluation
- Most common approach is subjective, by authors,
on 1-3 test cases
- Quantitative measures exist for computational
performance, but not quality of results
- Not much glory in evaluation
5Case Study 1 Data Visualization
6The Problem.....
- Lack of rigorous assessment of visualization
techniques
- Lack of good test cases
- Limited comparison with other techniques
- Lack of guidelines for selection of appropriate
techniques
7A Possible Solution.....
- Create a list of goals of visualization
  - what is the overall task?
  - what is desired/acceptable level of accuracy?
  - what are we looking for?
- Locate/create data sets which contain desired
features
- Test users on a wide range of tasks using
different visualization methods
8Goals of Visualization
- Identification - is there some interesting
feature in the image?
- Classification - what is it?
- Quantification - how many? how big? how close?
- Understanding - are there correlations conveyed
by the image?
- Comparison - does the image have characteristics
similar to one generated with a different set of
data?
9Advantages of Synthetic Data
- Easy to adjust characteristics
- Less ambiguous than real data
- Easy to create data which contains a single
structure or phenomenon
- Real data can be noisy
- Hard to find real data with desired
characteristics
10Advantages of Real Data
- Results using real data are more believable
- Reality is hard to simulate accurately
- Real data has context which can help justify
usefulness of tasks
11Our Experiments
- Select two data characteristics of interest
(outliers and clusters)
- Locate real data sets containing these features
(validate with statistical analysis)
- Create synthetic data sets containing these
features (also validate)
- Select three visualization techniques to test
(scatterplots, parallel coordinates, principal
components analysis with glyphs)
12Our Experiments (continued)
- Train subjects on interpreting different display
techniques
- Train subjects on the desired data
characteristics
- Test subjects on each characteristic, varying
  - number of outliers/clusters
  - degree or size
  - amount of noise in synthetic sets
  - location of outliers/clusters
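The synthetic test sets described above could be generated along the following lines. This is a minimal sketch, not the generator used in the experiments: the function name `make_synthetic`, its parameters, and the choice of Gaussian clusters with outliers pushed outside the cluster range are all assumptions made for illustration. It covers only exterior outliers; interior outliers would need a different placement rule.

```python
import random

def make_synthetic(n_clusters=2, points_per_cluster=50, n_outliers=3,
                   dims=4, spread=0.05, outlier_offset=1.0,
                   n_noise=0, seed=0):
    """Return (points, labels) for a hypothetical synthetic test set.

    Labels are 'cluster<k>', 'outlier', or 'noise', so the ground truth
    is known when subject responses are scored later.
    """
    rng = random.Random(seed)  # seeded so a test set can be regenerated
    points, labels = [], []
    for k in range(n_clusters):
        # cluster centers fall inside the unit hypercube
        center = [rng.uniform(0.2, 0.8) for _ in range(dims)]
        for _ in range(points_per_cluster):
            points.append([rng.gauss(c, spread) for c in center])
            labels.append(f"cluster{k}")
    for _ in range(n_outliers):
        # outlier "degree" is controlled by how far past 1.0 the point sits
        points.append([1.0 + outlier_offset + rng.uniform(0.0, 0.1)
                       for _ in range(dims)])
        labels.append("outlier")
    for _ in range(n_noise):
        # uniform background noise over the same range as the clusters
        points.append([rng.uniform(0.0, 1.0) for _ in range(dims)])
        labels.append("noise")
    return points, labels

points, labels = make_synthetic(n_clusters=3, n_outliers=2, n_noise=20)
```

Varying `n_clusters`, `n_outliers`, `spread`, `outlier_offset`, and `n_noise` corresponds to the experimental factors listed on the slide (number, degree or size, and amount of noise).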
13Visualization Techniques Tested
Glyphs
Scatterplot Matrix
Parallel Coordinates
14Outlier Example
15Cluster Example
Original
Added Noise
16Assessing the Results
- Detection - did subjects identify some structure
in the image?
- Classification - did subjects correctly classify
structure?
- Measurement
  - number of clusters or outliers
  - outlier and cluster degree of separation
  - size of cluster
- Errors - false positives, missed structure,
measurement accuracy
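One way to turn a subject's response into the error categories listed above is sketched below. The function `score_trial` and its response format (a set of point ids the subject marked, plus an optional reported count) are hypothetical; the slides do not specify how responses were recorded.

```python
def score_trial(truth_ids, reported_ids, reported_count=None):
    """Score one trial against ground truth (hypothetical response format).

    truth_ids:      ids of points that really are outliers (or cluster members)
    reported_ids:   ids the subject marked
    reported_count: the number the subject stated, if asked separately
    """
    truth, reported = set(truth_ids), set(reported_ids)
    if reported_count is None:
        reported_count = len(reported)
    return {
        "detected": bool(truth & reported),          # any true structure found
        "false_positives": len(reported - truth),    # marked but not real
        "missed": len(truth - reported),             # real but not marked
        "count_error": abs(reported_count - len(truth)),  # measurement accuracy
    }

result = score_trial(truth_ids={3, 7, 9}, reported_ids={3, 9, 12})
# result == {"detected": True, "false_positives": 1, "missed": 1, "count_error": 0}
```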
17Summary of Experiments
- Scatterplot matrix
  - best overall
  - weak on overlapping clusters, size estimation for
  large clusters, interior outliers
- Glyphs
  - best for interior outliers
  - good for conveying outlier separation,
  overlapping clusters, measuring cluster size
  - poor for differentiating non-outliers
- Parallel coordinates
  - generally worse than others
  - good for differentiating non-outliers
18Future Work
- Test alternate data characteristics (e.g.
repeated patterns)
- Test alternate perceptual tasks (e.g.
correlation)
- Test other visualization techniques (e.g.
alternate glyphs, VisDB, dimensional stacking, ...)
- Create publicly available benchmark suite for
data sets and analysis tools (submissions from
other researchers always welcome)
- Compare with other multivariate visualization
assessment methods as they arise
19Case Study 2 Image Segmentation
20Problem Statement
- Image segmentation algorithms traditionally
classified as model-based or context-free
- Model-based methods highly effective, but
expensive to design and execute
- Context-free methods are fast, but quality of
results often poor
- Is there some way to improve the results of
context-free systems without incurring costs of
model-based methods?
21Conjecture
- In most image analysis domains, expectations can
be placed on the likely occurrence of certain
shapes, colors, and region/segment sizes.
  - Objects in an office scene mostly planar and
  non-specular
  - In medical images, boundaries are mostly smooth,
  and regions are usually small or moderate in size
  - Outdoor scenes contain a lot of fine texture
- We should be able to use high-level domain
constraint knowledge to improve the segmentation
process by
  - Selecting a segmentation method likely to produce
  good results
  - Setting the segmentation parameters to their most
  effective values
22Defining a Good Segmentation
- All physical object boundaries should be isolated
- False boundaries should be minimized
- Boundary shape should be comparable to internal
model of object in scene
- Precision in shape and position needed varies
based on application and importance of individual
objects to task at hand
23Defining a Good Evaluation Procedure
- Should be based on real images
- Influence of human subjectivity minimized
- Errors categorized by type, severity, and
significance
- Magnitude of error should accurately reflect
difference from ideal
- Tolerance must be permitted
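The criteria above (errors separated by type, weighted by significance, with tolerance permitted) can be illustrated with a small boundary comparison. This is a sketch under assumptions, not the error measure used in the study: the function `boundary_errors`, the per-pixel `(x, y, tolerance, priority)` representation, and the use of Chebyshev distance are all hypothetical.

```python
def boundary_errors(ideal, computed):
    """Compare computed boundary pixels with an ideal boundary.

    ideal:    list of (x, y, tolerance, priority) for each ideal boundary pixel
    computed: set of (x, y) boundary pixels produced by an algorithm
    Returns two error types: missed boundary, weighted by priority, and
    false boundary, counted per stray pixel.
    """
    def near(px, py, qx, qy, tol):
        # Chebyshev distance: within `tol` pixels in both x and y
        return max(abs(px - qx), abs(py - qy)) <= tol

    missed_weighted = sum(
        prio for (x, y, tol, prio) in ideal
        if not any(near(x, y, cx, cy, tol) for (cx, cy) in computed)
    )
    false_boundary = sum(
        1 for (cx, cy) in computed
        if not any(near(x, y, cx, cy, tol) for (x, y, tol, _) in ideal)
    )
    return {"missed_weighted": missed_weighted, "false_boundary": false_boundary}

ideal = [(0, 0, 1, 2), (5, 5, 0, 1)]   # second pixel allows no displacement
computed = {(1, 0), (9, 9)}
errors = boundary_errors(ideal, computed)
# errors == {"missed_weighted": 1, "false_boundary": 1}
```

Here the computed pixel at (1, 0) is accepted because it falls within the one-pixel tolerance of the ideal pixel at (0, 0), while the zero-tolerance pixel at (5, 5) is counted as missed.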
24Problems with Pixel Counting
Two images with similar error counts: uniform
dilation (left) and bad merge (right).
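The point of this slide can be reproduced on a toy label image. The sketch below is illustrative only: the image size, the two rectangular objects, and the helper names (`make_image`, `pixel_errors`, `labels_present`) are assumptions, not the images shown on the slide.

```python
def make_image(rows, cols, boxes):
    """Build a label image; boxes are (label, r0, r1, c0, c1), bounds inclusive."""
    img = [[0] * cols for _ in range(rows)]
    for label, r0, r1, c0, c1 in boxes:
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                img[r][c] = label
    return img

def pixel_errors(a, b):
    """Count pixels whose labels differ between two images of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def labels_present(img):
    """Set of region labels that survive in a segmentation."""
    return {p for row in img for p in row}

# Ideal: a 10x10 object (label 1) and a 6x7 object (label 2) on background 0.
ideal = make_image(22, 14, [(1, 2, 11, 2, 11), (2, 14, 19, 2, 8)])
# Uniform dilation: object 1 grows by one pixel on every side.
dilated = make_image(22, 14, [(1, 1, 12, 1, 12), (2, 14, 19, 2, 8)])
# Bad merge: object 2 is absorbed into the background.
merged = make_image(22, 14, [(1, 2, 11, 2, 11)])

dilation_count = pixel_errors(ideal, dilated)   # 44 mislabeled pixels
merge_count = pixel_errors(ideal, merged)       # 42 mislabeled pixels
lost_by_dilation = labels_present(ideal) - labels_present(dilated)  # set()
lost_by_merge = labels_present(ideal) - labels_present(merged)      # {2}
```

The two pixel counts are nearly equal, yet the dilated result keeps both objects while the merged result loses one entirely, which is why a raw pixel count is a poor indicator of severity.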
25Procedure
- Acquire representative set of images for multiple
domains
- Approximate constraints on edge/region features
in domain
- Interactively segment and label edges/regions with
tolerance and priority to create ideal
segmentation
- Compute errors between ideal and algorithmically
generated segmentation
- If error > acceptable, adjust parameters (simplex
algorithm) and recompute errors
- Associate segmentation parameters with domain
constraints
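The "adjust parameters and recompute errors" loop can be sketched with a simplified downhill simplex (Nelder-Mead style) search. Everything here is an assumption for illustration: the slides say only "simplex algorithm", the function `simplex_tune` is a reduced variant (reflection, expansion, inside contraction, shrink), and `segmentation_error` is a smooth stand-in for running a segmenter and comparing its output with the ideal segmentation.

```python
def simplex_tune(error, start, step=0.5, acceptable=1e-4, max_iters=300):
    """Adjust parameters until error(params) <= acceptable or iterations run out."""
    n = len(start)
    # initial simplex: the start point plus one point offset along each axis
    simplex = [list(start)] + [
        [x + (step if i == j else 0.0) for j, x in enumerate(start)]
        for i in range(n)
    ]
    for _ in range(max_iters):
        simplex.sort(key=error)
        best, worst = simplex[0], simplex[-1]
        if error(best) <= acceptable:      # "if error > acceptable, adjust"
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if error(reflect) < error(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if error(expand) < error(reflect) else reflect
        elif error(reflect) < error(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if error(contract) < error(worst):
                simplex[-1] = contract
            else:
                # shrink every other vertex halfway toward the best one
                simplex = [best] + [
                    [b + 0.5 * (p - b) for b, p in zip(best, pt)]
                    for pt in simplex[1:]
                ]
    return min(simplex, key=error)

def segmentation_error(params):
    """Stand-in error surface (hypothetical): minimum at threshold=0.4, min_size=3.0."""
    threshold, min_size = params
    return (threshold - 0.4) ** 2 + (min_size - 3.0) ** 2

tuned = simplex_tune(segmentation_error, start=[1.0, 1.0])
```

In practice the error function would be non-smooth and expensive to evaluate, so caching error values and choosing the starting simplex from the domain constraints would matter more than in this toy setting.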
26Creating the Ideal Segmentation
- Start with initial region-based segmentation
- Click on a region of interest
- Merge, split, set tolerance level, set priority
level
- Iterate until all significant regions labeled
- Results are domain and task specific
27Comparing Ideal to Computed Results
- Edge Detection: 78 detected essentials,
209 oversegmentation
- Region Growing: 67 detected essentials,
120 oversegmentation
28More Results
- Split and Merge: 79 detected essentials,
73 oversegmentation
- Rule-based System: 88 detected essentials,
93 oversegmentation
29Summary and Future Work
- Domain constraints produced better segmentations
than context-free methods (after training)
- Future work includes investigating other types of
constraints (e.g., texture) and improving the
tolerance specification and error calculation
30General Procedure for Assessing Image-Based
Algorithms
- Determine the task to be performed by user of
image
- Determine image features most relevant to this
task, and ascertain level of accuracy needed in
detection, classification, and measuring
- Create benchmark suite of data containing these
features in varying degrees (real and synthetic
data)
- Create and administer user tests to evaluate
effectiveness of algorithm in accurately and
reliably conveying desired data features, or
- Develop image processing algorithms to identify
desired data features and calculate error types
and severities in images generated by algorithm
being assessed
31Summary of Presentation
- Formal assessment has proven useful in both
visualization and image processing applications
- Results can be used to guide algorithm
development and selection
- Quantitative and qualitative approaches can
provide many insights into effectiveness of image
analysis and synthesis tasks