HMM-Based Efficient Sketch Recognition
Tevfik Metin Sezgin and Randall Davis
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
mtsezgin, davis_at_csail.mit.edu
ABSTRACT Current sketch recognition systems
treat sketches as images or a collection of
strokes, rather than viewing sketching as an
interactive and incremental process. We show how
viewing sketching as an interactive process
allows us to recognize sketches using Hidden
Markov Models. We report results of a user study
indicating that in certain domains people draw
objects using consistent stroke orderings. We
show how this consistency, when present, can be
used to perform sketch recognition efficiently.
This novel approach enables us to have polynomial
time algorithms for sketch recognition and
segmentation, unlike conventional methods with
exponential complexity.
SKETCH RECOGNITION EXAMPLE OF SYSTEMS PROCESSING
System Architecture
System Evaluation
EVALUATION SETUP Two setups were evaluated:
- Fixed input length setup
  - One HMM per input length
- Variable input length setup
  - One HMM per class
  - Explicit end states in the HMMs
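As a rough illustration of the fixed input length setup, the sketch below scores an encoded stroke sequence with the standard scaled forward algorithm and picks the best class among the models whose length matches the input. The class names, the two-symbol alphabet, and all probabilities are made up for the example; the poster does not give model parameters, so this is a minimal sketch under those assumptions, not the system described here.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) using the scaled forward algorithm.

    start[i], trans[i][j] and emit[i][symbol] are plain probabilities.
    """
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

# Hypothetical toy models keyed by (class name, input length).
# Assumption: every number below is invented for illustration.
MODELS = {
    ("stick", 2): dict(start=[1.0, 0.0],
                       trans=[[0.0, 1.0], [0.0, 1.0]],
                       emit=[[0.9, 0.1], [0.1, 0.9]]),
    ("box", 2): dict(start=[1.0, 0.0],
                     trans=[[0.0, 1.0], [0.0, 1.0]],
                     emit=[[0.1, 0.9], [0.9, 0.1]]),
}

def classify_fixed_length(obs):
    """Pick the best class among models trained for len(obs) symbols."""
    scores = {
        name: forward_log_likelihood(obs, **model)
        for (name, length), model in MODELS.items()
        if length == len(obs)
    }
    return max(scores, key=scores.get) if scores else None
```

In the variable input length setup there would instead be a single model per class, so the lookup by length would disappear and the explicit end states would signal where an object finishes.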
EFFECTS of NOISE We ran a set of tests to
measure the sensitivity of our method to errors.
We considered two sources of error:
- Spurious strokes
- Low level recognition errors
For each kind of error, we measured the extent of
the misrecognition neighborhood, defined by the
number of misrecognized objects immediately
before or after the source of error.
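One way to read the misrecognition neighborhood definition is as the length of the unbroken run of misrecognized objects on each side of the object where the error was introduced. The sketch below computes that under this reading; the function name, the boolean input format, and the choice to report the two sides separately are assumptions made for illustration.

```python
def misrecognition_neighborhood(correct, source):
    """Count misrecognized objects directly adjacent to an error source.

    correct: list of booleans, one per object in drawing order
             (True means the object was recognized correctly).
    source:  index of the object where the error was injected.
    Returns (before, after): lengths of the unbroken runs of
    misrecognized objects just before and just after the source.
    """
    before = 0
    i = source - 1
    while i >= 0 and not correct[i]:
        before += 1
        i -= 1

    after = 0
    i = source + 1
    while i < len(correct) and not correct[i]:
        after += 1
        i += 1

    return before, after
```

For example, with `correct = [True, True, False, False, False, True]` and the error injected at index 3, the neighborhood extends one object in each direction.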
Figure 4 HMMs are trained using encodings of
collected data. Statistical models of users'
preferred drawing orders are used to interpret a
given scene.
ENCODING Strokes are converted into discrete
symbols using our Early Sketch Processing
Toolkit. The encoding consists of 13 discrete
symbols:
- Horizontal, vertical, positively/negatively
  sloped lines
- Circles
- Horizontal, vertical ovals
- Polylines with 2, 3, 4, 5 edges
- Complex shapes (combination of lines and curves)
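The encoding step can be pictured as a mapping from a low-level stroke description to a small integer alphabet. The sketch below covers only the twelve shape categories named in the list above (the text states 13 symbols, and the remaining one is not spelled out here, so it is left out). The tuple format for primitives, the angle convention (degrees, y axis pointing up), and both thresholds are assumptions for illustration and are not taken from the Early Sketch Processing Toolkit.

```python
from enum import IntEnum

class Symbol(IntEnum):
    LINE_HORIZONTAL = 0
    LINE_VERTICAL = 1
    LINE_POSITIVE = 2
    LINE_NEGATIVE = 3
    CIRCLE = 4
    OVAL_HORIZONTAL = 5
    OVAL_VERTICAL = 6
    POLYLINE_2 = 7
    POLYLINE_3 = 8
    POLYLINE_4 = 9
    POLYLINE_5 = 10
    COMPLEX = 11

def encode(primitive, tol_deg=10.0, round_ratio=1.2):
    """Map a hypothetical primitive tuple to a Symbol.

    Accepted forms (all assumed, not from the source):
      ("line", angle_in_degrees)
      ("oval", width, height)
      ("polyline", number_of_edges)
      ("complex",)
    """
    kind = primitive[0]
    if kind == "line":
        angle = primitive[1] % 180.0  # ignore drawing direction
        if angle <= tol_deg or angle >= 180.0 - tol_deg:
            return Symbol.LINE_HORIZONTAL
        if abs(angle - 90.0) <= tol_deg:
            return Symbol.LINE_VERTICAL
        return Symbol.LINE_POSITIVE if angle < 90.0 else Symbol.LINE_NEGATIVE
    if kind == "oval":
        _, width, height = primitive
        if max(width, height) <= round_ratio * min(width, height):
            return Symbol.CIRCLE  # nearly equal axes
        return Symbol.OVAL_HORIZONTAL if width > height else Symbol.OVAL_VERTICAL
    if kind == "polyline":
        edges = primitive[1]
        if 2 <= edges <= 5:
            return Symbol(Symbol.POLYLINE_2 + edges - 2)
        raise ValueError(f"unsupported edge count: {edges}")
    if kind == "complex":
        return Symbol.COMPLEX
    raise ValueError(f"unknown primitive kind: {kind}")
```

A drawn object then becomes a sequence of integers, for instance `[int(encode(p)) for p in strokes]`, which is the kind of observation sequence a discrete HMM consumes.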
Figure 10 Input-output pairs for a test input
using fixed input-length HMMs. Instances of the
same shape with different length encodings are
indicated with different colors.
CONTRIBUTIONS and RELATED WORK
Related work includes sketch recognition systems
and work in handwriting recognition. Our system
has the following properties that collectively
distinguish it from this related work:
- We use the natural consistencies in sketching to
  aid recognition and do not force users to sketch
  in a specific way
- Our system allows multiple-stroke object
  recognition
- Our system runs in polynomial time
- Our framework supports incremental recognition
- HMM-based recognition complements model-based
  recognition
- Our approach is robust in the face of spurious
  strokes
- Unlike chain-code representations used by
  handwriting and gesture recognition, we use
  higher level features.
Figure 8 Effects of spurious strokes on the
recognition errors for fixed and variable input
length HMMs.
Figure 5 Encodings of training data (left) and a
scene with two objects (right).
SCENE INTERPRETATION Tabulate matching scores
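Tabulating matching scores can be pictured as sliding each model over the encoded scene and recording how well it explains each window. The sketch below is one possible shape for that table. The `score` callback stands in for an HMM log-likelihood, and the toy template scorer, the class names, and the table layout are all assumptions for illustration.

```python
def tabulate_scores(obs, lengths_by_class, score):
    """Build a table {(start, end): (class_name, score)} over obs.

    lengths_by_class: {class_name: [input lengths that class can take]}
    score(name, window): higher is better (for example a log-likelihood).
    Only the best-scoring class is kept for each span.
    """
    table = {}
    for start in range(len(obs)):
        for name, lengths in lengths_by_class.items():
            for length in lengths:
                end = start + length
                if end > len(obs):
                    continue  # window would run past the end of the scene
                value = score(name, obs[start:end])
                if (start, end) not in table or value > table[(start, end)][1]:
                    table[(start, end)] = (name, value)
    return table

# Hypothetical stand-in scorer: negative number of symbol mismatches
# against a fixed template per class. A real system would use HMM scores.
TEMPLATES = {"A": [0, 1], "B": [2, 2, 2]}

def template_score(name, window):
    return -sum(x != y for x, y in zip(window, TEMPLATES[name]))

TABLE = tabulate_scores([0, 1, 2, 2, 2], {"A": [2], "B": [3]}, template_score)
```

With n symbols in the scene and a bounded set of model lengths, the table has a number of entries linear in n.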
Figure 9 Effects of low-level recognition errors
on the recognition errors for fixed and variable
input length HMMs.
SCALABILITY We also measured how the recognition
performance scales with respect to the number of
primitives in the scene.
SEGMENTATION RECOGNITION Find globally coherent
interpretation
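A globally coherent interpretation can be thought of as a split of the encoded scene into consecutive spans, each explained by one object class, with the best total score. The sketch below finds such a split by dynamic programming over span end points, which takes time polynomial in the scene length. This is an assumed formulation for illustration and the poster does not spell out the exact search procedure. The span table, labels, and scores are invented.

```python
def best_interpretation(n, span_scores):
    """Find the best-scoring segmentation of positions 0..n.

    span_scores: {(start, end): (label, log_score)} for half-open spans.
    Returns (total_score, [(start, end, label), ...]) or None if the
    available spans cannot cover the whole scene.
    """
    by_end = {}
    for (start, end), (label, score) in span_scores.items():
        if 0 <= start < end <= n:  # ignore malformed or out-of-range spans
            by_end.setdefault(end, []).append((start, label, score))

    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)  # best[k]: best score covering obs[:k]
    back = [None] * (n + 1)     # back[k]: (start, label) of the last span
    best[0] = 0.0
    for end in range(1, n + 1):
        for start, label, score in by_end.get(end, []):
            total = best[start] + score
            if total > best[end]:
                best[end] = total
                back[end] = (start, label)

    if best[n] == neg_inf:
        return None

    segments = []
    end = n
    while end > 0:  # walk the back-pointers from the end of the scene
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return best[n], segments

# Hypothetical score table for a scene of five symbols.
TOY_SPANS = {
    (0, 2): ("rectangle", -1.0),
    (0, 3): ("stick figure", -2.0),
    (2, 3): ("line", -5.0),
    (2, 5): ("arrow", -4.0),
    (3, 5): ("rectangle", -1.5),
}
```

Calling `best_interpretation(5, TOY_SPANS)` picks the stick figure over the first three symbols followed by the rectangle over the last two, since that pair has the highest total among the three ways of covering the scene.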
Figure 10 The performance of our system compared
to a baseline system that employs combinatorial
matching of scene elements to object models. As
seen here, our method scales well as the number
of objects in the scene increases.
Figure 3 Our analysis of user sketching behavior
included constructing drawing style diagrams (a),
and comparing the number of different drawing
orders used by the subjects to the maximum number
of ways each object could be sketched (b).