1
A New Approach To The Multiclass Classification Problem
  • Category Vector Space

2
Agenda
  • Problem
  • Motivation
  • Discussion
  • Preliminary Results

3
Problem
Classification Problem
  • Multi-class classification through binary classification
  • One-vs-All
  • One-vs-One
  • Multi-class classification can often be constructed as a generalization of binary classification
  • In practice, multi-class classification is done by combining binary classifiers

4
Multiclass Applications: Large Category Space
Problem
  • Object recognition (http://www.glue.umd.edu/zhelin/recog.html)
  • Automated protein classification
  • Digit recognition
  • Phoneme recognition (Waibel, Hanazawa, Hinton, Shikano, Lang 1989)
  • With large category spaces (300–600 classes), the multi-class algorithm becomes computationally expensive
5
Problem
Other Multiclass Applications
  • Handwriting recognition (e.g., USPS)
  • Text classification
  • Face detection
  • Facial expression recognition

6
Problem
Classification Setup
Training and test data are drawn i.i.d. from a fixed but unknown probability distribution D
  • Data: (x_i, y_i), i = 1, …, n (the labeled training set)

Question: design a classification rule y = f(x) such that, given a new x, it predicts y with minimal probability of error.
7
Support Vector Machines (SVMs)
Problem

  • Training examples are mapped to a (usually high-dimensional) feature space by a feature map Φ(x) = (Φ_1(x), …, Φ_d(x))
  • Learn a linear decision boundary in feature space
  • Trade-off between maximizing the geometric margin of the training data and minimizing margin violations




(Figure: maximum-margin linear boundary separating positive and negative training examples.)
8
Problem
Definition Of SVM Classifiers
  • Linear classifier defined in feature space by f(x) = ⟨w, Φ(x)⟩ + b
  • The SVM solution gives w as a linear combination of support vectors, a subset of the training vectors






(Figure: separating hyperplane with weight vector w and bias b.)
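To make the decision rule concrete, here is a minimal sketch (not from the slides) that recomputes f(x) = Σ_i α_i y_i K(x_i, x) + b from a fitted model's support vectors; scikit-learn and the toy two-class data are assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
clf = SVC(kernel="rbf", gamma=0.5).fit(X, y)

def decision(x):
    # K(x_i, x) for every support vector x_i (Gaussian kernel, gamma = 0.5)
    k = np.exp(-0.5 * np.sum((clf.support_vectors_ - x) ** 2, axis=1))
    # f(x) = sum_i (alpha_i y_i) K(x_i, x) + b
    return clf.dual_coef_ @ k + clf.intercept_

# Matches the library's own decision values
assert np.allclose(decision(X[0]), clf.decision_function(X[:1]))
```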
9
Problem
Definition Of A Margin
  • History (Vapnik, 1965): if the data are linearly separable, place the hyperplane far from the data for a large margin

10
Problem
Maximize The Margin
  • History (Vapnik, 1965): if the data are linearly separable, place the hyperplane far from the data for a large margin
  • A large margin classifier leads to good generalization (performance on test sets)

11
Problem
Combining Binary Classifiers
  • One-vs-All (OVA)
  • For each class, build a classifier for that class vs. the rest
  • Constructs k SVM models
  • Often yields very imbalanced classifiers (asymmetry in the amount of training data)
  • The earliest implementation for SVM multiclass
  • One-vs-One (OVO)
  • Constructs k(k-1)/2 classifiers
  • Arranged as a rooted binary tree of SVMs with k leaves
  • Traverse the tree to reach a leaf node (the predicted class)
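As a concrete illustration of the two reductions, here is a sketch assuming scikit-learn and its digits dataset (not part of the original slides):

```python
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)                # k = 10 classes
ova = OneVsRestClassifier(LinearSVC()).fit(X, y)   # k binary models
ovo = OneVsOneClassifier(LinearSVC()).fit(X, y)    # k(k-1)/2 binary models
print(len(ova.estimators_), len(ovo.estimators_))  # 10 and 45
```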

12
Example 1
Motivation
  • Race categories: White, Black, Asian
  • Task: map the image training set to the race labels
  • Training (learning)
  • Test (generalization)
  • Scenario: an ambiguous test image is presented
  • A mixed-race person
  • A person drawn from a race which is not represented by the system (e.g., Hispanic, Native American, etc.)
  • There is no way of assigning a mixed label
  • The system cannot represent the mixed-race person using a combination of categories
  • There is no way of representing an unknown race
  • Possible solution: indicate that the incoming image is outside the margin of each learned category

13
Example 2
Motivation
  • Musical samples generated by a single instrument
  • Electric guitar: a set of note categories {C, C♯, D, D♯, etc.}
  • Task: map the training set of musical notes to the labels
  • Reasonable learning and generalization properties
  • Scenario: given musical sequences containing
  • Intervals (two notes simultaneously struck, such as {C, F})
  • Chords (containing three or more notes)
  • Ambiguity arises at the training set level
  • We are forced to assign new labels to intervals and chords even though they contain the same features (single notes) as the note categories
  • In the music sequence case, suppose we learned a conditional probability distribution p(L | x), where x is a music sequence and L ∈ {C, C♯, D, …, B} is a set of note labels
  • When x is an interval, say a tritone, there is no way of assigning high probability to the tritone
  • Possible solution: accommodate the tritone by assigning it a new label
  • This yields a large label space, which must be truncated because of exponential size considerations

14
Problems With Combining Binary Classifiers
Motivation
  • Categories are conceived as nominal labels
  • There is no underlying geometry for the categories
  • The conditional distribution cannot give us a measure (value) for interpolated categories
  • Non-represented interpolated categories are left out
  • It is not easy to distinguish basic categories from compound categories

15
Category Vector Spaces: A Solution
Motivation
  • Invoke the notion of a category vector space
  • Categories are defined with a geometric structure
  • Assume that the set of categories (labels) forms a vector space
  • A music sequence would correspond to a label vector in a twelve-dimensional space {C, C♯, D, D♯, E, F, F♯, G, G♯, A, A♯, B}
  • Each basic note (C, C♯, D, etc.) would have its own coordinate axis in the category vector space
  • Learning problem
  • Map the training set music sequences to vectors in the 12-dimensional space such that the training and test set errors are small
  • Map the training musical sequences to the 12-dimensional vector space and then (if a support vector machine approach is used) maximize the margin of the mapped vectors in the category space
  • The race classification example is analogous
  • Depends on how many races we wish to explicitly represent
  • Map the training set to the race category vector space and maximize the margin
  • Generalization problem
  • Map a test set musical sequence or image into the category space and then ask if it lies within the margin of a note (or chord) or race category

Note: extensions to other multi-category learning applications are straightforward, assuming we can map category labels to coordinates.
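For instance, a hypothetical encoding of basic and compound note categories as vectors in the 12-dimensional category space might look like the sketch below (the note ordering and 0/1 superposition are illustrative assumptions):

```python
import numpy as np

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
AXIS = {note: i for i, note in enumerate(NOTES)}

def category_vector(notes):
    """A single note maps to a basis vector; an interval or chord maps to
    the superposition of its note axes (a compound category)."""
    v = np.zeros(len(NOTES))
    for note in notes:
        v[AXIS[note]] = 1.0
    return v

print(category_vector(["C"]))            # basic category: one coordinate axis
print(category_vector(["C", "E", "G"]))  # C major chord: compound category
```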
16
Multiclass Fisher: A Related Idea
Discussion
Given feature vectors x_i ∈ R^M, D categories, and a projected set of features z = W^T x, the multiclass Fisher linear discriminant (MC-FLD) maximizes

J(W) = tr[ (W^T S_W W)^(-1) (W^T S_B W) ],

where S_W and S_B are the within-class and between-class scatter matrices.
Solution: the columns of W are the top D eigenvectors (corresponding to the largest eigenvalues) of S_W^(-1) S_B.
  • The eigenvectors are orthonormal
  • The columns of W constitute a category vector space
  • Interpret z = W^T x as a category space projection
  • The optimal solution is a set of orthogonal weight vectors
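A minimal numerical sketch of the MC-FLD solution above, assuming numpy/scipy, a data matrix X with one row per pattern, and integer labels y (the small ridge added to S_W is a numerical-stability assumption):

```python
import numpy as np
from scipy.linalg import eigh

def mcfld(X, y, D):
    """Return W whose columns are the top D discriminant directions."""
    d = X.shape[1]
    mu = X.mean(axis=0)
    Sw = np.zeros((d, d))                       # within-class scatter
    Sb = np.zeros((d, d))                       # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    # Generalized eigenproblem Sb w = lambda Sw w; keep the top D eigenvectors
    vals, vecs = eigh(Sb, Sw + 1e-8 * np.eye(d))
    order = np.argsort(vals)[::-1]
    return vecs[:, order[:D]]
```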

17
Disadvantage Of Multiclass Fisher
Discussion
  • We avoided this approach since margins are not maximized in category space
  • We have not seen a classifier take a three-class problem with labels {0, 1, 2}, map the input features into a vector space with basis vectors (1,0,0), (0,1,0), and (0,0,1), and attempt to maximize the margin in the category vector space
  • We have not seen any previous work where a pattern from a compound category (say a combination of labels 1 and 2) is also used in training, with a conversion of the compound category to a vector

18
Description of Category Vector Spaces
Discussion
  • Input feature vectors are mapped to the category vector space using a kernel-based approach
  • In the category vector space, maximizing the margin is equivalent to forming hypercones
  • Mapped feature vectors that lie inside a hypercone have a distinct class label
  • Mapped vectors that lie in between hypercones are ambiguous
  • Hypercones are not allowed to intersect

(Figure: hypercones depicting the basic categories.)
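A hedged illustration of this membership rule: a mapped vector is assigned to a category when its angle to that category's axis falls inside a hypercone of half-angle θ (the cosine test and the threshold value are assumptions for illustration, not the paper's formulation):

```python
import numpy as np

def cone_label(z, category_vectors, theta=np.pi / 8):
    """Return the index of the hypercone containing z, or None if ambiguous."""
    z = z / np.linalg.norm(z)
    for k, c in enumerate(category_vectors):
        cos_angle = z @ (c / np.linalg.norm(c))
        if cos_angle >= np.cos(theta):
            return k            # inside category k's hypercone
    return None                 # between hypercones: ambiguous

axes = np.eye(3)                # three basic category axes
print(cone_label(np.array([0.9, 0.1, 0.05]), axes))  # -> 0
print(cone_label(np.array([0.7, 0.7, 0.0]), axes))   # -> None (ambiguous)
```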
19
Advantages Of Category Vector Space
Discussion
  • Each pattern now exists as a linear superposition of category vectors in the category space
  • This ensures that ambiguity is handled at a fundamental level
  • Compound categories can be directly represented in the category space
  • We can maximize the compound category margin as well as the margins for the basic categories

20
Technical Challenges
Discussion
  • Regression
  • Each input training set feature vector x ∈ R^M must be mapped to a corresponding point y ∈ R^D, where M is the number of feature dimensions and D is the cardinality of the set of basic categories
  • Classification
  • Each mapped feature vector must maximize its margin relative to its own category vector against the other category vectors; here the category vector y is known

21
Regression In Category Space
Discussion
  • ε controls the width of the interval for which there is no penalty
  • The slack variable vectors ξ_i and ξ_i* are non-negative component-wise
  • The weight matrix W and bias b map the feature vector to its category-space counterpart
  • The choice of kernel K (GRBF or otherwise) is hidden in the operator φ, which implements inner products by projecting vectors in R^M into a suitable space
  • The regularization parameter C weighs the norm of W against the data-fitting error; the larger the value of C, the greater the emphasis on the data-fitting error

The mapping minimizes

(1/2)||W||^2 + C Σ_i 1^T (ξ_i + ξ_i*)

subject to the constraints

y_i − W^T φ(x_i) − b ≤ ε1 + ξ_i,  W^T φ(x_i) + b − y_i ≤ ε1 + ξ_i*,  ξ_i ≥ 0, ξ_i* ≥ 0.
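As a stand-in for this regression step (not the authors' exact optimization), one ε-insensitive SVR per category-space coordinate gives the flavor; scikit-learn and the random toy data are assumptions:

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))             # M = 5 feature dimensions
Y = np.eye(3)[rng.integers(0, 3, 60)]    # D = 3 one-hot category targets

# epsilon sets the penalty-free tube width; C weighs the data-fitting error
reg = MultiOutputRegressor(SVR(kernel="rbf", epsilon=0.1, C=10.0)).fit(X, Y)
Z = reg.predict(X)                       # mapped points in the category space
```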
22
Classification In Category Space
Discussion
  • Associate each mapped vector with a category vector
  • Category vectors can be basis vectors (axes corresponding to basic categories) in the category space, or ordinary vectors (corresponding to compound categories)
  • In this definition of membership, no distinction is made between basic and compound categories
  • We seek to maximize the margin in the category space
  • Minimizing the norm of the mapped vectors is equivalent to maximizing the margin, provided the margin inequalities can be satisfied, subject to the constraints that each mapped vector aligns more strongly with its own category vector than with any other
23
Discussion
Integrated Classification and Regression
Objective Function
  • The objective function is designed so that we obtain an integrated dual objective for classification and regression

24
Multi-Category GRBF
Preliminary Results
  • A Gaussian radial basis function (GRBF) classifier with multiple outputs, one for each basic category
  • Given a training set of registered and cropped face images with labels White, Black, Asian
  • The GRBF classifier maps the input feature vectors into the category space
  • Since we know the label of each training set pattern, we can approximate the mapped category-space targets

Solution: the GRBF weights solve a regularized linear system (a standard closed form is sketched below).
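A minimal sketch of such a multi-output GRBF, assuming the usual regularized closed form A = (K + λI)^(-1) Y with one-hot targets Y (the σ and λ values here are placeholders, not the slides' settings):

```python
import numpy as np

def grbf_fit(X, Y, sigma=1.0, lam=1e-3):
    """Weights A for a GRBF with one output per basic category (one-hot Y)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * sigma**2))            # Gaussian kernel matrix
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def grbf_map(Xtrain, A, Xnew, sigma=1.0):
    """Map new feature vectors into the category space."""
    sq = np.sum((Xnew[:, None, :] - Xtrain[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma**2)) @ A
```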
25
Experimental Setup
Preliminary Results
  • 45 training images from the Labeled Faces in the Wild (LFW) image database
  • The database contains over 13,000 face images that were detected using the Viola-Jones face detector
  • Each face has been labeled with the name of the person pictured
  • Of the 5,749 people featured in the database, 1,680 individuals have multiple images, with each image being unique
  • Of the 45 training images, 15 were from each of the three races considered
  • The 45 images were registered to one standard image (after first converting them to grayscale) using a landmark-based thin-plate spline (TPS)
  • The landmarks used were
  • three (3) for each eye
  • two (2) for the nose
  • two (2) for the two ears (very approximate, since the ears are often not visible)
  • After registration, the images were cropped and resized to 130×90, with the intensity scale adjusted to [0,1]
  • The free parameters were carefully but qualitatively chosen to obtain a good training set separation in category space

White basis: y1 = (1,0,0)^T; Black basis: y2 = (0,1,0)^T; Asian basis: y3 = (0,0,1)^T
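A preprocessing sketch matching the setup above (grayscale, 130×90, intensities in [0,1]); the TPS registration step is omitted, and Pillow/numpy plus the placeholder filename are assumptions:

```python
import numpy as np
from PIL import Image

def preprocess(path):
    img = Image.open(path).convert("L")   # convert to grayscale
    img = img.resize((90, 130))           # width x height, i.e. 130x90 pixels
    return np.asarray(img, dtype=np.float64) / 255.0  # intensities in [0,1]

face = preprocess("face.jpg")             # placeholder filename
```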
26
Race Classification Training Images
Preliminary Results
Training set images. Top row: Asian; middle row: Black; bottom row: White.
27
Category Space For Training Images
Preliminary Results
Training set images mapped into the category
vector space
28
Race Classification Testing Images
Preliminary Results
  • Test set images. Top row: Asian; middle row: Black; bottom row: White
  • 51 test set images (17 Asian, 16 Black, 18 White)
  • We used the weights discovered by the GRBF classifier to map the input test set images into the category space

29
Category Space Testing Images
Preliminary Results
  • In the graph above, we can see the separation in the category space

30
Preliminary Results
Pairwise Projection Of Category Space Testing Images
  • The pairwise separations in the category space provide an improved visualization
  • One could in fact draw separating boundaries in the three pairwise comparisons and obtain an overall decision boundary in 3D
  • Pairwise classifications: roughly separate each pair by drawing lines through the origin
  • Each pairwise view removes the orthogonal subspace that is not being compared against (see the sketch below)
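A sketch of these pairwise views, assuming matplotlib and an n×3 array Z of category-space points (the stand-in data here replaces the mapped test images):

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in category-space points; in practice Z comes from the GRBF mapping
Z = np.random.default_rng(1).dirichlet(np.ones(3) * 0.3, size=51)
names = ["White", "Black", "Asian"]

fig, axes = plt.subplots(1, 3, figsize=(9, 3))
for ax, (i, j) in zip(axes, [(0, 1), (0, 2), (1, 2)]):
    ax.scatter(Z[:, i], Z[:, j])          # drop the third, orthogonal axis
    ax.set_xlabel(names[i])
    ax.set_ylabel(names[j])
plt.tight_layout()
plt.show()
```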

31
Ambiguity Testing
Preliminary Results
  • Nine ambiguous (from our perspective) faces
  • We wanted to exhibit the tolerance of ambiguity that is a hallmark of category spaces
  • The conclusion drawn from the result is a subjective one
Ambiguous faces mapped into the category space.
Note how they cluster together.
32
Experiment With MPEG-7 Database
Preliminary Results
Butterfly
Bat
Bird
33
Experiment With MPEG-7 Database
Preliminary Results
Fly
Chicken
Batbird
34
3 Class Training
Preliminary Results
35
3 Class Testing
Preliminary Results
36
4 Class Training
Preliminary Results
37
4 Class Testing
Preliminary Results
38
Summary
  • The fundamental contribution is the learning of category spaces from patterns
  • Ensures that ambiguity is handled at a fundamental level
  • Compound categories can be directly represented in the category space
  • The specific approach integrates regression and classification (iCAR)
  • Combines a regression objective function (map the patterns) with a maximum-margin objective function (perform multicategory classification in category space)

39
Questions / Discussion
Thank You
40
References
[1] H. Guo. Diffeomorphic point matching with applications in medical image analysis. PhD thesis, University of Florida, Gainesville, FL, 2005.
[2] J. Zhang. New information theoretic distance measures and algorithms for multimodality image registration. PhD thesis, University of Florida, Gainesville, FL, 2005.
[3] A. A. Kumthekar. Affine image registration using minimum spanning tree entropies. Master's thesis, University of Florida, Gainesville, FL, 2004.
[4] A. Rajwade, A. Banerjee, and A. Rangarajan. A new method of probability density estimation with application to mutual information-based image registration. In IEEE Computer Vision and Pattern Recognition (CVPR), volume 2, pages 1769–1776, 2006.
[5] A. Peter and A. Rangarajan. A new closed form information metric for shape analysis. In Medical Image Computing and Computer Assisted Intervention (MICCAI part 1), Springer LNCS 4190, pages 249–256, 2006.
[6] A. S. Roy, A. Gopinath, and A. Rangarajan. Deformable density matching for 3D non-rigid registration of shapes. In Medical Image Computing and Computer Assisted Intervention (MICCAI part 1), Springer LNCS 4791, pages 942–949, 2007.
[7] F. Wang, B. Vemuri, and A. Rangarajan. Groupwise point pattern registration using a novel CDF-based Jensen-Shannon divergence. In IEEE Computer Vision and Pattern Recognition (CVPR), volume 1, pages 1283–1288, 2006.
[8] L. Garcin, A. Rangarajan, and L. Younes. Non-rigid registration of shapes via diffeomorphic point matching and clustering. In IEEE Conf. on Image Processing, volume 5, pages 3299–3302, 2004.
[9] F. Wang, B.C. Vemuri, A. Rangarajan, I.M. Schmalfuss, and S.J. Eisenschenk. Simultaneous nonrigid registration of multiple point sets and atlas construction. In European Conference on Computer Vision (ECCV), pages 551–563, 2006.
[10] H. Guo, A. Rangarajan, and S. Joshi. 3D diffeomorphic shape registration on hippocampal datasets. In James S. Duncan and Guido Gerig, editors, Medical Image Computing and Computer Assisted Intervention (MICCAI), pages 984–991, 2005.
41
References
[11] A. Rangarajan, J. Coughlan, and A. L. Yuille. A Bayesian network framework for relational shape matching. In IEEE Intl. Conf. Computer Vision (ICCV), volume 1, pages 671–678, 2003.
[12] J. Zhang and A. Rangarajan. Multimodality image registration using an extensible information metric and high dimensional histogramming. In Information Processing in Medical Imaging, pages 725–737, 2005.
[13] J. Zhang and A. Rangarajan. Affine image registration using a new information metric. In IEEE Computer Vision and Pattern Recognition (CVPR), volume 1, pages 848–855, 2004.
[14] J. Zhang and A. Rangarajan. A unified feature based registration method for multimodality images. In IEEE International Symposium on Biomedical Imaging (ISBI), pages 724–727, 2004.
[15] A. Peter and A. Rangarajan. Shape matching using the Fisher-Rao Riemannian metric: Unifying shape representation and deformation. In IEEE International Symposium on Biomedical Imaging (ISBI), pages 1164–1167, 2006.
[16] A. Rajwade, A. Banerjee, and A. Rangarajan. Continuous image representations avoid the histogram binning problem in mutual information-based registration. In IEEE International Symposium on Biomedical Imaging (ISBI), pages 840–844, 2006.
[17] H. Guo, A. Rangarajan, S. Joshi, and L. Younes. A new joint clustering and diffeomorphism estimation algorithm for non-rigid shape matching. In Chandra Khambametteu, editor, IEEE CVPR Workshop on Articulated and Non-rigid Motion (ANM), pages 16–22, 2004.
[18] H. Guo, A. Rangarajan, S. Joshi, and L. Younes. Non-rigid registration of shapes via diffeomorphic point matching. In IEEE Intl. Symposium on Biomedical Imaging (ISBI), volume 1, pages 924–927, 2004.
[19] H. Guo, A. Rangarajan, and S. Joshi. Diffeomorphic point matching. In N. Paragios, Y. Chen, and O. Faugeras, editors, The Handbook of Mathematical Models in Computer Vision, pages 205–220, 2005.
[20] A. Peter and A. Rangarajan. Maximum likelihood wavelet density estimation with applications to image and shape matching. IEEE Trans. Image Processing, 2007. (Accepted subject to minor revision.)
42
References
[21] F. Wang, B.C. Vemuri, A. Rangarajan, and S.J. Eisenschenk. Simultaneous nonrigid registration of multiple point sets and atlas construction. IEEE Trans. Pattern Analysis and Machine Intelligence, 2007. (In press.)
[22] A. Peter and A. Rangarajan. Information geometry for landmark shape analysis: Unifying shape representation and deformation. IEEE Trans. Pattern Analysis and Machine Intelligence, 2007. (Revised and resubmitted.)
[23] A. Rajwade, A. Banerjee, and A. Rangarajan. Probability density estimation using isocontours and isosurfaces: Applications to information theoretic image registration. IEEE Trans. Pattern Analysis and Machine Intelligence, 2007. (Under revision.)
[24] A. Peter and A. Rangarajan. Shape L'Âne Rouge: Sliding wavelets for indexing and retrieval. In IEEE Computer Vision and Pattern Recognition (CVPR), 2008. (Submitted.)
[25] A. Rajwade, A. Banerjee, and A. Rangarajan. New image-based density estimators for 3D intermodality image registration. In IEEE Computer Vision and Pattern Recognition (CVPR), 2008. (Submitted.)
[26] A. Rangarajan and H. Chui. Applications of optimizing neural networks in medical image registration. In Artificial Neural Networks in Medicine and Biology (ANNIMAB), Perspectives in Neural Computing, pages 99–104. Springer, 2000.
[27] A. Rangarajan and H. Chui. A mixed variable optimization approach to non-rigid image registration. In Discrete Mathematical Problems with Medical Applications, volume 55 of DIMACS Series in Discrete Mathematics and Computer Science, pages 105–123. American Mathematical Society, 2000.
[28] H. Chui and A. Rangarajan. A new algorithm for non-rigid point matching. In Proceedings of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 2000), volume 2, pages 44–51. IEEE Press, 2000.
[29] H. Chui and A. Rangarajan. A feature registration framework using mixture models. In IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA), pages 190–197. IEEE Press, 2000.
[30] H. Chui, L. Win, J. Duncan, R. Schultz, and A. Rangarajan. A unified feature registration method for brain mapping. In Information Processing in Medical Imaging (IPMI), pages 300–314. Springer, 2001.
43
References
[31] A. Rangarajan. Learning matrix space image representations. In Energy Minimization Methods for Computer Vision and Pattern Recognition (EMMCVPR), Lecture Notes in Computer Science, LNCS 2134, pages 153–168. Springer, New York, 2001.
[32] A. Rangarajan, H. Chui, and E. Mjolsness. A relationship between spline-based deformable models and weighted graphs in non-rigid matching. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages I:897–904. IEEE Press, 2001.
[33] H. Chui and A. Rangarajan. Learning an atlas from unlabeled point-sets. In IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA), pages 58–65. IEEE Press, 2001.
[34] H. Chui and A. Rangarajan. A new joint point clustering and matching algorithm for estimating nonrigid deformations. In Intl. Conf. on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS), pages I:309–315. CSREA Press, 2002.
[35] A. Rangarajan and A. L. Yuille. MIME: Mutual information minimization and entropy maximization for Bayesian belief propagation. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 873–880, Cambridge, MA, 2002. MIT Press.
[36] A. L. Yuille and A. Rangarajan. The Concave Convex procedure (CCCP). In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 1033–1040, Cambridge, MA, 2002. MIT Press.
[37] H. Chui, L. Win, J. Duncan, R. Schultz, and A. Rangarajan. A unified non-rigid feature registration method for brain mapping. Medical Image Analysis, 7(2):113–130, 2003.
[38] H. Chui and A. Rangarajan. A new point matching algorithm for non-rigid registration. Computer Vision and Image Understanding, 89(2-3):114–141, 2003.
[39] A. L. Yuille and A. Rangarajan. The Concave-Convex procedure (CCCP). Neural Computation, 15:915–936, 2003.
[40] H. Chui, A. Rangarajan, J. Zhang, and C.M. Leonard. Unsupervised learning of an atlas from unlabeled point-sets. IEEE Trans. Pattern Analysis and Machine Intelligence, 26(2):160–172, 2004.
[41] P. Gardenfors. Conceptual spaces: The geometry of thought. MIT Press, 2000.
[42] J. C. Platt, N. Cristianini, and J. Shawe-Taylor. Large margin DAGs for multiclass classification. In Advances in Neural Information Processing Systems (NIPS), volume 12, pages 547–553. MIT Press, 2000.
44
References
[43] Y. Lee, Y. Lin, and G. Wahba. Multicategory support vector machines, theory, and application to the classification of microarray data and satellite radiance data. Journal of the American Statistical Association, 99:67–81, 2004.
[44] C.-W. Hsu and C.-J. Lin. A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Networks, 13(2):415–425, 2002.
[45] T. Kolb. Music theory for guitarists: Everything you ever wanted to know but were afraid to ask. Hal Leonard, 2005.
[46] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press (second edition), 1990.
[47] S. Mika, G. Ratsch, and K.-R. Muller. A mathematical programming approach to the kernel fisher algorithm. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 591–597. MIT Press, 2001.
[48] D. Widdows. Geometry and Meaning. Center for the Study of Language and Information, 2004.
[49] T. Jebara. Machine Learning: Discriminative and Generative. Kluwer Academic Publishers, 2003.
[50] V. Vapnik. Statistical Learning Theory. Wiley Interscience, 1998.
[51] B. Scholkopf, A. Smola, R. C. Williamson, and P. L. Bartlett. New support vector algorithms. Neural Computation, 12(5):1207–1245, 2000.
[52] M. E. Tipping. Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research, 1:211–244, 2001.
[53] U. Kressel. Pairwise classification and support vector machines. In Advances in Kernel Methods - Support Vector Learning, pages 255–268. MIT Press, 1999.
[54] C. M. Bishop. Pattern recognition and machine learning. Springer, 2006.
[55] J. Weston and C. Watkins. Multi-class support vector machines. Technical Report CSD-TR-98-04, Department of Computer Science, Royal Holloway, University of London, 1998.
[56] E. L. Allwein, R. E. Schapire, and Y. Singer. Reducing multiclass to binary: a unifying approach for margin classifiers. Journal of Machine Learning Research, 1:113–141, 2001.
[57] J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods - Support Vector Learning, pages 185–208. MIT Press, 1999.
45
References
[58] L. Kaufman. Solving the quadratic programming problem arising in support vector classification. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, pages 147–168. MIT Press, 1999.
[59] O. L. Mangasarian and D. R. Musicant. Lagrangian support vector machines. Journal of Machine Learning Research, 1(3):161–177, 2001.
[60] G. M. Fung and O. L. Mangasarian. A feature selection Newton method for support vector machine classification. Computational Optimization and Applications, 28:185–202, 2004.
[61] T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning, pages 169–184. MIT Press, 1999.
[62] K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2(2):265–292, 2002.
[63] J. A. K. Suykens and J. Vandewalle. Multiclass least squares support vector machines. In International Joint Conference on Neural Networks, volume 2, pages 900–903, 1999.
[64] T. Joachims. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, volume 12, pages 217–226, 2006.
[65] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, October 2007. Available at http://vis-www.cs.umass.edu/lfw.
[66] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In IEEE Computer Vision and Pattern Recognition (CVPR), volume 1, pages 511–518, 2001.
[67] G. Wahba. Spline models for observational data. SIAM, Philadelphia, PA, 1990.
[68] F. L. Bookstein. Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Trans. Patt. Anal. Mach. Intell., 11(6):567–585, June 1989.
[69] S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, C.-H. Yeang, M. Angelo, C. Ladd, M. Reich, E. Latulippe, J. P. Mesirov, T. Poggio, W. Gerald, M. Loda, E. S. Lander, and T. R. Golub. Multiclass cancer diagnosis using tumor gene expression signatures. Proceedings of the National Academy of Sciences (PNAS), 98(26):15149–15154, 2001.
46
References
[70] D. Lowe. Object recognition from local scale-invariant features. In IEEE International Conference on Computer Vision (ICCV), volume 2, pages 1150–1157, 1999.
[71] M. E. Tipping and C. M. Bishop. Mixtures of probabilistic principal component analyzers. Neural Computation, 11(2):443–482, 1999.
[72] M. A. O. Vasilescu and D. Terzopoulos. Multilinear image analysis for facial recognition. In ICPR (2), pages 511–514, 2002.
[73] X. He, D. Cai, H. Liu, and J. Han. Image clustering with tensor representation. In Zhang H., Chua T., Steinmetz R., Kankanhalli M. S., and Wilcox L., editors, ACM Multimedia, pages 132–140. ACM, 2005.
[74] J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. University of California Press, 1967.
[75] D. Titterington, A. Smith, and U. Makov. Statistical Analysis of Finite Mixture Distributions. John Wiley & Sons, 1985.
[76] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, September 1988.
[77] X. He, D. Cai, and P. Niyogi. Tensor subspace analysis. In Weiss Y., Schölkopf B., and Platt J., editors, Advances in Neural Information Processing Systems 18, pages 499–506. MIT Press, Cambridge, MA, 2006.
[78] R. J. Hathaway. Another interpretation of the EM algorithm for mixture distributions. Statistics and Probability Letters, 4:53–56, 1986.
[79] R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Jordan M. I., editor, Learning in Graphical Models, pages 355–370. Kluwer, 1998.
[80] A. L. Yuille and J. J. Kosowsky. Statistical physics algorithms that converge. Neural Computation, 6(3):341–356, May 1994.
[81] A. L. Yuille, P. Stolorz, and J. Utans. Statistical physics, mixtures of distributions, and the EM algorithm. Neural Computation, 6(2):334–340, March 1994.
47
References
[82] B. Leibe and B. Schiele. Analyzing appearance and contour based methods for object categorization. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages 409–415, Madison, WI, June 2003.
[83] G. Griffin, A. Holub, and P. Perona. CalTech 256 object category dataset. Technical Report CNS-TR-2007-001, Calif. Inst. of Tech., 2007.