A Generic Approach for Image Classification Based on Decision Tree Ensembles and Local Sub-windows

Raphaël Marée, Pierre Geurts, Justus Piater, Louis Wehenkel
University of Liège, Belgium
maree_at_montefiore.ulg.ac.be
Abstract
- Problem
  - Many application domains require classification of characters, symbols, faces, 3D objects, textures, ...
  - Specific feature extraction methods must be manually adapted when considering a new application
- Approach
  - Recent and generic machine learning algorithm based on decision tree ensembles, working directly on pixel values
  - Extension with local sub-window extraction
- Results
  - Competitive with the state of the art on four well-known datasets: MNIST, ORL, COIL-100, OUTEX
  - Encouraging robustness results (generalisation, rotation, scaling, occlusion)
Image classification
- Many different kinds of problems
- Usually tackled using
  - Problem-specific feature extraction, i.e. extracting a reduced set of interesting features from the initially huge number of pixels
  - A learning or matching algorithm
- Our generic approach
  - Works directly on pixel values, i.e. without any feature extraction: images are described by the integer values (grey or RGB intensities) of all their pixels
  - Ensemble of decision trees
3Global generic approach
- Ensemble of extremely randomized trees
(extra-trees) - Learning
- Top-down induction algorithm like classical
decision tree (with tests at the internal nodes
of the form ak,l lt ath that compare the value
of the pixel at position (k,l) to a threshold
ath) but - Test attributes and thresholds in internal nodes
are chosen randomly, - Each tree is fully developed until it perfectly
classifies images in the learning sample, - Several extra-trees are built from the same
learning sample. - Testing
- Propagate the entire test image successively into
all the trees (involves comparing pixel values to
thresholds in test nodes) and assign to the image
the majority class among the classes given by the
trees.
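A minimal sketch of this global scheme is given below, using scikit-learn's ExtraTreesClassifier as a stand-in for the extra-trees described above; the 8x8 digits dataset, parameter values, and variable names are illustrative assumptions, not the settings of the experiments. With max_features=1, scikit-learn picks the test attribute and its cut-point essentially at random, which is close to the description above.

```python
# Sketch of the global approach: an ensemble of randomized trees learned
# directly on raw pixel intensities, aggregated over all trees at test time.
from sklearn.datasets import load_digits
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

digits = load_digits()                       # small 8x8 grey-level images (illustrative dataset)
X, y = digits.data, digits.target            # each attribute is one raw pixel value
X_ls, X_ts, y_ls, y_ts = train_test_split(X, y, test_size=0.25, random_state=0)

# Fully grown trees (min_samples_split=2) with a randomly chosen test attribute
# and random threshold at each internal node (max_features=1).
model = ExtraTreesClassifier(n_estimators=100, max_features=1, min_samples_split=2)
model.fit(X_ls, y_ls)                        # learning: build several extra-trees
print("error rate:", 1.0 - model.score(X_ts, y_ts))   # testing: aggregate the trees' predictions
```

Note that scikit-learn aggregates the trees by averaging class probabilities rather than by a strict per-tree majority vote, so this is only a rough stand-in for the method described in the slides.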
Local generic approach
- Extra-trees and sub-windows
- Learning
  - Given a window size w1 x w2 and a large number Nw:
    - extract Nw sub-windows at random from the learning-set images and assign to each sub-window the class of its parent image,
    - build a model to classify these Nw sub-windows using the w1 x w2 pixel values that characterize them.
- Testing
  - Given the window size w1 x w2:
    - extract all possible sub-windows of size w1 x w2 from the test image,
    - apply the model to each sub-window,
    - assign to the image the majority class among the classes assigned to the sub-windows by the model.
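Below is a hedged sketch of this sub-window scheme under the same illustrative assumptions (8x8 digits, scikit-learn's ExtraTreesClassifier, arbitrary choices of w1, w2 and Nw); the helper functions are hypothetical names, not part of any library.

```python
# Sketch of the local approach: learn on random sub-windows labelled with their
# parent image's class, then classify a test image by voting over all of its sub-windows.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

W1 = W2 = 5        # sub-window size w1 x w2 (arbitrary choice for 8x8 images)
N_W = 20000        # number of random training sub-windows Nw (arbitrary choice)
rng = np.random.default_rng(0)

def random_subwindows(images, labels, n):
    """Extract n random w1 x w2 sub-windows; each inherits the class of its parent image."""
    Xs, ys = [], []
    for _ in range(n):
        i = rng.integers(len(images))
        r = rng.integers(images[i].shape[0] - W1 + 1)
        c = rng.integers(images[i].shape[1] - W2 + 1)
        Xs.append(images[i][r:r + W1, c:c + W2].ravel())
        ys.append(labels[i])
    return np.array(Xs), np.array(ys)

def all_subwindows(img):
    """Extract every possible w1 x w2 sub-window of a test image."""
    return np.array([img[r:r + W1, c:c + W2].ravel()
                     for r in range(img.shape[0] - W1 + 1)
                     for c in range(img.shape[1] - W2 + 1)])

digits = load_digits()
ls_idx, ts_idx = train_test_split(np.arange(len(digits.images)), test_size=0.25, random_state=0)

# Learning: build an extra-trees model on the Nw sub-windows.
X_w, y_w = random_subwindows(digits.images[ls_idx], digits.target[ls_idx], N_W)
model = ExtraTreesClassifier(n_estimators=50, max_features=1).fit(X_w, y_w)

# Testing: majority vote over all sub-windows of each test image.
errors = 0
for i in ts_idx:
    votes = model.predict(all_subwindows(digits.images[i]))
    if np.bincount(votes).argmax() != digits.target[i]:
        errors += 1
print("error rate:", errors / len(ts_idx))
```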
Experiments description
- Database specification: every image in each database is described by all its pixel values and belongs to one class.

  DBs       images  features           classes
  MNIST     70000   784 (28x28x1)      10
  ORL       400     10304 (92x112x1)   40
  COIL-100  7200    3072 (32x32x3)     100
  OUTEX     864     49152 (128x128x3)  54
- Database protocols: separation of each database into two independent sets, the learning set (LS) of pre-classified images used to build a model and the test set (TS) used to evaluate the model.
  - MNIST
    - LS: first 60000 images
    - TS: remaining 10000 images
  - ORL
    - 100 random runs
    - LS: 200 images
    - TS: 200 remaining images
  - COIL-100
    - LS: 1800 images (view angles k·20°, k = 0..17)
    - TS: 5400 remaining images
  - OUTEX
    - LS: 432 images
    - TS: 432 remaining images
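As one concrete example of these protocols, the sketch below mimics the ORL setup (100 random LS/TS splits of 200 images each, averaged error). It uses scikit-learn's Olivetti faces, a 64x64 version of the ORL database, and the global extra-trees model as a placeholder classifier; these are assumptions for illustration, not the exact experimental setup.

```python
# Sketch of the ORL protocol: repeat a random 200/200 split 100 times and average the error.
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import StratifiedShuffleSplit

faces = fetch_olivetti_faces()                 # 400 face images, 40 classes, raw pixel features
X, y = faces.data, faces.target

splits = StratifiedShuffleSplit(n_splits=100, train_size=200, test_size=200, random_state=0)
errors = []
for ls, ts in splits.split(X, y):              # LS builds the model, TS evaluates it
    model = ExtraTreesClassifier(n_estimators=100, max_features=1).fit(X[ls], y[ls])
    errors.append(1.0 - model.score(X[ts], y[ts]))
print("mean error:", np.mean(errors), "+/-", np.std(errors))
```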
Experiments results
- Error rates (%)

  DBs       Extra-trees   Extra-trees + Sub-windows    State-of-the-art
  MNIST     3.26          2.63 (w1 = w2 = 24)          0.7 [1]
  ORL       4.56 ± 1.43   2.13 ± 1.18 (w1 = w2 = 32)   0 [2]
  COIL-100  2.04          0.39 (w1 = w2 = 16)          0.1 [3]
  OUTEX     64.35         2.78 (w1 = w2 = 4)           0.2 [4]
[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition", 1998.
[2] R. Paredes and A. Perez-Cortes, "Local representations and a direct voting scheme for face recognition", 2001.
[3] S. Obdrzalek and J. Matas, "Object recognition using local affine frames on distinguished regions", 2002.
[4] T. Mäenpää, M. Pietikäinen, and J. Viertola, "Separating color and pattern information for color texture discrimination", 2002.
- Computing times
  - Learning on OUTEX
    - Extra-trees: 5 sec
    - Extra-trees + Sub-windows: 8 min
  - Testing on OUTEX (one image)
    - Extra-trees: < 1 msec
    - Extra-trees + Sub-windows: 0.6 sec
Evaluation of robustness
- Generalisation: considering different learning sample sizes (COIL-100)
- Rotation: image-plane rotation of the test images (COIL-100)
- Scaling: scaled versions of the test images, with the model built from 32x32 images (COIL-100)
- Occlusion: erasing the right part of the test images (COIL-100)
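A minimal sketch of how such perturbed test images can be produced is given below. The specific angles, scale factors, and occlusion fractions of the COIL-100 experiments are not reproduced here; the helper names are hypothetical and assume 2-D grey-level arrays.

```python
# Sketch of the three test-image perturbations: in-plane rotation, scaling, and
# right-side occlusion, applied before the (unchanged) model is evaluated.
import numpy as np
from scipy import ndimage

def rotate_image(img, angle_deg):
    """Image-plane rotation of a test image, keeping the original size."""
    return ndimage.rotate(img, angle_deg, reshape=False, mode="nearest")

def scale_image(img, factor):
    """Scaled version of a test image (e.g. factor 0.5 or 2.0)."""
    return ndimage.zoom(img, factor, order=1)

def occlude_right(img, fraction):
    """Erase the rightmost `fraction` of the columns of a test image."""
    out = img.copy()
    out[:, int(round(out.shape[1] * (1.0 - fraction))):] = 0
    return out

# Example: error of a previously trained global model on rotated grey-level test images.
def rotated_error(model, images, labels, angle_deg):
    X = np.array([rotate_image(im, angle_deg).ravel() for im in images])
    return 1.0 - model.score(X, labels)
```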
Conclusion
- Novel, generic, and simple method
- Competitive accuracy
  - Our local generic method (Extra-trees + Sub-windows) is close to state-of-the-art methods without any problem-specific feature extraction, but still slightly inferior to the best results
  - In practice, is it necessary to develop specific methods to gain slightly better accuracy?
- Invariance
  - Robustness to small transformations in the test images
  - The local approach is more robust than the global approach (many local feature vectors are left more or less intact by a given image transformation)
Future work directions
- Improving robustness
  - Augmenting the learning sample with transformed versions of the original images
  - Normalization of sub-window sizes and orientations
- Speed/accuracy trade-off for prediction
- Combining sub-windows with other machine learning algorithms