ECSE-6963, BMED 6961: Cell & Tissue Image Analysis — Lecture Transcript
1
ECSE-6963, BMED 6961: Cell & Tissue Image Analysis
  • Lecture 16: Feature Selection & Validation
  • Badri Roysam
  • Rensselaer Polytechnic Institute, Troy, New York 12180

2
Recap: Blob Segmentation
3
Recap
  • Four basic ideas in blob segmentation
  • When we encounter a new application, a combination of these ideas can be used
  • For the highest performance, object modeling is essential
  • This topic continues to evolve, and new ideas continue to emerge
  • E.g., use multiple object models to handle the diversity of cell types
  • Today:
  • Good and bad features
  • Performance evaluation & validation

4
The Feature Selection Problem
  • The features that we have studied are only a subset of the many that can be defined
  • It's fun to invent new features, but there's a caveat to consider
  • If we consider too many features:
  • We get a high-dimensional feature space
  • We need too many examples to estimate the model parameters
  • How much accuracy do we really need?
  • This is the curse of dimensionality
  • New features may not have enough additional discriminatory value
  • They are computationally expensive

Cover's Inequality
5
Example: Cervical Smears
6
Features of Nuclei in Cervical Smears
[Figure: feature diagram grouped into Absorption, Size, Texture, and Shape features. Labels include OD (mean, var, norm, max), I (mean, int), R (mean, var), Energy, Corr, Homog, Entropy, and Contrast (each for channels 1 and 2), Area, Clump, Bclump, CHomog, Elong, Fit, and Tort.]
7
Good Features
  • A good feature:
  • Is significantly different for each class, on average
  • Has a small variance/spread within each class
  • I.e., it maximizes the Fisher discriminant ratio
  • Is not correlated with another feature already in use
  • Correlated features are redundant, and increase dimensionality without adding value

[Figure: example distributions of a bad (overlapping classes) and a good (well-separated classes) feature.]
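The Fisher discriminant ratio mentioned above can be computed per feature; here is a minimal Python sketch (the feature values below are invented for illustration):

```python
import statistics

def fisher_ratio(class_a, class_b):
    """Fisher discriminant ratio for one feature:
    (difference of class means)^2 / (sum of class variances).
    A good feature has a large ratio: well-separated means, small spread."""
    mean_a, mean_b = statistics.mean(class_a), statistics.mean(class_b)
    var_a, var_b = statistics.variance(class_a), statistics.variance(class_b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b)

# A well-separated, low-spread feature vs. an overlapping one (toy values):
good = fisher_ratio([1.0, 1.1, 0.9, 1.05], [3.0, 3.1, 2.9, 3.05])
bad = fisher_ratio([1.0, 2.0, 0.5, 1.5], [1.2, 2.2, 0.8, 1.8])
```

Ranking the candidate features by this ratio gives a first, single-feature view of their discriminating power.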
8
Discriminating Ability
  • We start by examining the discriminating power of each feature independently
  • Qualitative method:
  • Clear separation of classes on a scatter plot or histogram
  • Quantitative method:
  • Start with a LABELED scatter plot
  • Define two hypotheses:
  • H0: The values of the feature do not differ significantly between classes (null hypothesis)
  • H1: The values of the feature differ significantly between classes (alternative hypothesis)
  • The term "significantly" is quantified by a significance level α.

9
Gaussian Basics
If we gather up enough numbers together, their average will tend to be Gaussian distributed (the central limit theorem).
The density falls off rapidly: about 95% of the samples fall within two standard deviations of the mean!
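The two-standard-deviation figure can be checked numerically; a small Python sketch using the error function (note that ±2σ actually covers ≈95.4%, and the exact 95% interval is ±1.96σ):

```python
import math

def phi(z):
    """Standard normal CDF, N(0,1), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probability mass within +/- 2 standard deviations of the mean:
within_2sd = phi(2.0) - phi(-2.0)   # approximately 0.9545
```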
10
Probabilities & Tables
To calculate the probability that x falls in a certain interval, we need to integrate the Gaussian. This is needed frequently in statistics. Normalize, and look up a table of integrals for N(0,1); the calculation boils down to

P(a ≤ x ≤ b) = Φ((b − μ)/σ) − Φ((a − μ)/σ)

[Figure: significance level α, acceptance interval, and an old-fashioned standard-normal table.]
11
The Sample Mean & Variance are Random Variables
Suppose that they are Gaussian distributed and mutually independent.
12
Hypothesis Testing
Suppose that we have just two classes for now.
13
Hypothesis Testing
The test statistic (for known variances) is

q = (x̄ − ȳ) / √(σx²/Nx + σy²/Ny)
14
Significance
Suppose we choose a 95% confidence level (α = 0.05); then the acceptance interval for the normalized statistic is −1.96 ≤ q ≤ 1.96.
If q falls in this range, decide H0; else decide H1.
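The known-variance decision rule can be sketched directly; the sample summaries below are invented for illustration:

```python
import math

def z_statistic(xbar, ybar, var_x, var_y, n_x, n_y):
    """Normalized difference of two sample means (known variances)."""
    return (xbar - ybar) / math.sqrt(var_x / n_x + var_y / n_y)

# Invented summaries of one feature over two labeled classes:
q = z_statistic(5.2, 5.0, 1.0, 1.0, 100, 100)
decision = "H0" if -1.96 <= q <= 1.96 else "H1"   # 95% acceptance interval
```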
15
Case when the variances are unknown
We can no longer use the Gaussian table; we need to use the t-distribution table instead. The lookup needs two numbers:
  • Degrees of freedom (DOF)
  • Equal/unequal variance assumption
In MATLAB: H = ttest2(x, y, alpha, tail, vartype), where alpha is the significance level (typically 5%).
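For the unequal-variance case, a self-contained Python sketch of Welch's t statistic and its degrees of freedom (the decision still requires comparing against a t-table at the chosen α):

```python
import math
import statistics

def welch_t(x, y):
    """Welch's two-sample t statistic and degrees of freedom
    (unequal, unknown variances); compare |t| against a t-table entry."""
    n_x, n_y = len(x), len(y)
    v_x, v_y = statistics.variance(x), statistics.variance(y)
    se2 = v_x / n_x + v_y / n_y
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    dof = se2 ** 2 / ((v_x / n_x) ** 2 / (n_x - 1) +
                      (v_y / n_y) ** 2 / (n_y - 1))
    return t, dof

t, dof = welch_t([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```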
16
Discriminant Functions
  • A function of the features that allows us to discriminate between classes
  • A generalization of the likelihood ratios and thresholds

Linear discriminant: g(x) = wᵀx + w₀

The sign of the discriminant tells us the decision.
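A linear discriminant of this form is trivial to evaluate; a sketch with made-up weights for a two-feature space:

```python
def discriminant_decision(w, w0, x):
    """Evaluate g(x) = w . x + w0 and decide by its sign."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if g >= 0 else -1

# Hypothetical weights; the decision boundary is the line x1 = x2.
label_a = discriminant_decision([1.0, -1.0], 0.0, [2.0, 1.0])   # g = +1
label_b = discriminant_decision([1.0, -1.0], 0.0, [1.0, 2.0])   # g = -1
```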
17
The Next Step
  • The features that pass the individual hypothesis tests could still have correlations among themselves
  • Correlation implies redundancy, and wasted dimensions
  • Procedure:
  • Pick the single best feature
  • Try all remaining features one at a time, and add the one that gives the best improvement
  • Repeat until:
  • The last added feature does not add enough improvement to justify an extra dimension
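The greedy procedure above can be sketched in a few lines; `score` is a hypothetical stand-in for whatever figure of merit (e.g., classification accuracy) evaluates a feature subset, and the toy weights below are invented:

```python
def forward_select(features, score, min_gain=0.01):
    """Greedy forward selection: repeatedly add the feature that most
    improves `score`, stopping when the gain no longer justifies
    an extra dimension."""
    chosen, remaining, best = [], list(features), float("-inf")
    while remaining:
        new_best, pick = max((score(chosen + [f]), f) for f in remaining)
        if chosen and new_best - best < min_gain:
            break   # last candidate does not add enough improvement
        chosen.append(pick)
        remaining.remove(pick)
        best = new_best
    return chosen

# Toy additive score: feature "c" adds almost nothing and is rejected.
weights = {"a": 0.5, "b": 0.3, "c": 0.001}
chosen = forward_select(["a", "b", "c"], lambda s: sum(weights[f] for f in s))
```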

18
Stepwise Discriminant Analysis
  • We can also come up with a selection method that goes the other way:
  • Start off with all features
  • Remove one feature at a time
  • Continue as long as performance is still acceptable
  • Stepwise discriminant analysis is a method that combines the top-down and bottom-up approaches
  • Generally not worth writing our own code; better to use commercial packages
  • The above approaches are still sub-optimal
  • THE VERY BEST approach is to exhaustively consider all subsets of features and pick the best one
  • This is very expensive: for example, d features give 2^d − 1 non-empty subsets to evaluate

19
How do we Test our Model?
Why bother? Because our model should hold up over images that we haven't processed yet!
[Diagram: from a batch of images, some are selected for feature computation and labeling; the rest are to be processed automatically. Select a subset of features, build a discriminant based on them, and evaluate its effectiveness over the remaining images.]
20
Features of Nuclei in Cervical Smears
[Figure: the feature diagram from Slide 6, repeated.]
21
Feature Sets Compared
[Diagram: six feature sets, the combinations of three data versions (raw data, nearest-neighbor deblurred data, Wiener-filter deblurred data) with 2-D and 3-D features.]
22
Classification Results with Linear Discriminant Classifier
[Figure: percent correct (78–86%) vs. features used, for 2-D and 3-D features on raw, nearest-neighbor-deblurred, and Wiener-filtered data.]
23
Stepwise Linear Discriminant Analysis Results

Rank | 2-D    | 2-D Nearest Neighbor | 2-D Wiener | 3-D     | 3-D Nearest Neighbor | 3-D Wiener
1    | R mean | R mean               | I norm     | R mean  | R mean               | I norm
2    | Corr 2 | Corr 2               | OD var     | Corr 2  | Corr 2               | OD int
3    | Clump  | Clump                | R mean     | Homog 1 | CHomog               | OD mean
4    | CHomog | Homog 2              | Entropy 1  | CHomog  | Clump                | CHomog

Moral: the relative importance of features can be affected by pre-processing.
24
Validation and Performance Assessment
  • Validation: Is the software system's output valid?
  • Performance assessment: Exactly how well is the software working?
  • A surprisingly tricky issue, given the subjectivity and variability of people:
  • Inter-subject variability
  • Intra-subject variability

25
Testing Against a Consensus
  • Ask multiple human observers to manually analyze
    the image
  • From scratch, or
  • By editing the machine output
  • Convene a meeting of the human observers
  • Discuss differences of opinion on each cell
  • Develop a single consensus opinion
  • This becomes the Gold Standard
  • Compare the software output against the gold
    standard, and measure concordance

26
Things that commonly go wrong
  • Poor data quality
  • Damaged specimen
  • Misshapen objects
  • Fragments
  • Poor image quality
  • Noise
  • Spectral bleed-through
  • Partially-imaged nuclei
  • Types of segmentation errors
  • Miss
  • Inaccurate boundary
  • False segmentation
  • Under-segmentation
  • Over-segmentation
  • Separation errors

Go back to the microscope if at all possible
27
Handling Partial Objects
  • Usually, they need to be deleted based on their features:
  • Location (close to the image border)
  • Size (smaller than the modeled value)
  • The Brick Rule:
  • Define an interior sub-volume in the image
  • Only accept cells that are wholly contained in it
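A minimal sketch of the brick rule in 2-D, assuming each detected cell is represented by an axis-aligned bounding box (the corner coordinates below are invented):

```python
def brick_rule(cells, image_shape, margin):
    """Keep only cells whose bounding boxes lie wholly inside the interior
    sub-region (the 'brick') obtained by shrinking the image by `margin`
    on every side; border-clipped partial objects are discarded.
    Each cell is ((min_row, min_col), (max_row, max_col))."""
    kept = []
    for lo, hi in cells:
        inside = all(margin <= l and h <= s - margin
                     for l, h, s in zip(lo, hi, image_shape))
        if inside:
            kept.append((lo, hi))
    return kept

cells = [((10, 10), (20, 20)),   # fully interior: kept
         ((0, 50), (8, 60))]     # touches the top border: discarded
kept = brick_rule(cells, image_shape=(100, 100), margin=5)
```

The same test extends to 3-D volumes by adding a z-extent to the boxes and the shape.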

28
Outlier Detection
Outliers are good candidates for further
inspection
29
Color-codes for highlighting errors
Any measure of the quality of fit to the object model p(X) can serve as a tool for highlighting errors: Red = potentially awful; Yellow = questionable; Green = okay.
30
Explanatory Display Coding
  • Make it easy for the user to separate unhandled errors from handled errors
  • One idea is to display mini explanation codes
  • Display detailed explanations when the user clicks on a cell or rests the mouse over it
  • Keep a record trail of all operations that led to each object

31
Object Separation Error Example
[Figure: gallery view indicating 3 objects; the split error and the split cells are highlighted.]
32
Editing the Output
  • Add Object
  • Split/Merge
  • Dilate/Shrink
33
Algorithm to add an object
  • Seeded Region Growing:
  • The user clicks on a point on the object to be added
  • Initialize a connected component with this point
  • Examine each neighboring pixel:
  • If its intensity is within X of the starting point, include it in the connected component
  • X is a tolerance set by the user
  • Other criteria can be included
  • Stop when there are no more points to add
  • There is flexibility in designing the stopping criteria!
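The steps above amount to a tolerance-based flood fill; a minimal Python sketch on a toy intensity grid, assuming 4-connectivity and an intensity-difference criterion relative to the seed:

```python
from collections import deque

def region_grow(image, seed, tol):
    """Seeded region growing: flood-fill from the clicked point, accepting
    4-connected neighbors whose intensity is within `tol` of the seed."""
    rows, cols = len(image), len(image[0])
    base = image[seed[0]][seed[1]]
    region, frontier = {seed}, deque([seed])
    while frontier:                      # stop when no more points to add
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - base) <= tol):
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

img = [[9, 9, 0],
       [9, 0, 0],
       [0, 0, 9]]
blob = region_grow(img, (0, 0), tol=2)   # grows over the connected 9s only
```

Note that the bright pixel at the opposite corner is excluded: it matches the intensity criterion but is not connected to the seed.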

34
The need to record edits
  • Often, cell segmentation is performed in a pharmaceutical and/or legal setting
  • Much is at stake; we need protection from cheating and carelessness!
  • The Food and Drug Administration has regulations and guidelines, generally called Good Laboratory Practices (GLP)
  • Bottom line:
  • When an edit is made, save the original data, and allow rollback (undo)
  • Record the time stamp, the identity of the person making the edit, and an explanatory note for the inspector

35
Edits are Valuable!
  • The edit rate is a direct indicator of software
    performance
  • Basis for edit-based validation
  • The types of edits indicate the most common types
    of errors being made by the software
  • Basis for software revision
  • They also indicate the kinds of images and
    objects for which errors are occurring
  • Sometimes, a good basis for improving specimen
    preparation and imaging steps

36
Edit-Based Batch Processing
[Flowchart: model parameters are estimated from examples. For each image N in the batch: segment the image; if segmentation confidence is high, ACCEPT; otherwise send it to the visualization & editing system and update the table of edits. Check the edit rate: if too high, refine/replace the model/parameters; if low enough, fetch the next image (N + 1) from the batch.]
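The flow above can be sketched as a loop; every callable here (`segment`, `confidence`, `edit_tool`) is a hypothetical stand-in for the corresponding pipeline stage:

```python
def process_batch(images, segment, confidence, edit_tool, conf_threshold=0.9):
    """Edit-based batch processing: accept high-confidence segmentations
    automatically, route the rest through the human editing tool,
    and report the edit rate (a direct indicator of software performance)."""
    results, edits = [], 0
    for image in images:
        seg = segment(image)
        if confidence(seg) < conf_threshold:   # low confidence: human edits it
            seg = edit_tool(seg)
            edits += 1
        results.append(seg)
    edit_rate = edits / len(images)
    # If edit_rate is too high, refine/replace the model parameters and re-run.
    return results, edit_rate

# Dummy stand-ins, just to exercise the loop:
results, rate = process_batch(
    [1, 2, 3],
    segment=lambda im: im * 10,
    confidence=lambda seg: 1.0 if seg > 15 else 0.0,
    edit_tool=lambda seg: -seg,
)
```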
37
Summary
  • The feature selection problem
  • Need to select a few really good ones
  • The curse of dimensionality
  • Multiple-Observer Validation and performance
    assessment
  • Technical dimension
  • Human dimension
  • Legal dimension

38
Instructor Contact Information
  • Badri Roysam
  • Professor of Electrical, Computer, & Systems Engineering
  • Office: JEC 7010
  • Rensselaer Polytechnic Institute
  • 110 8th Street, Troy, New York 12180
  • Phone: (518) 276-8067
  • Fax: (518) 276-8715
  • Email: roysam@ecse.rpi.edu
  • Website: http://www.ecse.rpi.edu/roysam
  • Course website: http://www.ecse.rpi.edu/roysam/CTIA
  • Secretary: Laraine Michaelides, JEC 7012, (518) 276-8525, michal_at_.rpi.edu
  • Grader: Ying Chen (cheny9@rpi.edu, Office JEC 6308, 518-276-8207)

Center for Sub-Surface Imaging Sensing