Title: Pattern representation
1 Pattern representation: the future of pattern
recognition
Lev Goldfarb, ETS group, Faculty of Computer
Science, UNB, Fredericton, Canada
2 Outline
- The wisdom of modern physicists (3 slides)
- The maturity of a science (1 slide)
- The currently prevailing wisdom in our field (1 slide)
- Why should this be the guiding wisdom? (4 slides)
- Are we mature enough for the task? (2 slides)
- The (social) reason for the status quo (1 slide)
- The forgotten history: syntactic pattern recognition (8 slides)
- Syntactic pattern recognition: the unrealized hopes (4 slides)
- How should we apply the wisdom of physicists? (2 slides)
- ETS formalism: its inspiration (2 slides)
- ETS formalism: temporal information (3 slides)
- ETS formalism (13 slides)
- ETS formalism: representational completeness (1 slide)
- ETS formalism: the intelligent process (1 slide)
- Learning without representation? (2 slides)
- Conclusion (3 slides)
3 The wisdom of modern physicists
From Freeman Dyson, "Innovations in Physics,"
Scientific American, September 1958:
"A few months ago two of the great
historical figures of European physics, Werner
Heisenberg and Wolfgang Pauli, believed that they
had made an essential step forward in the
direction of a theory of elementary particles.
Pauli happened to be passing through New York,
and was prevailed upon to give a lecture
explaining the new ideas to an audience that
included Niels Bohr, who had been mentor to both
. . . in their days of glory thirty years earlier
when they made their great discoveries. Pauli
spoke for an hour, and then there was a general
discussion during which he was criticized sharply
by the younger generation. Finally, Bohr was
called on to make a speech summing up the
argument. 'We are all agreed,' he said, 'that
your theory is crazy. The question which divides
us is whether it is crazy enough to have a chance
of being correct. My own feeling is that it is
not crazy enough.'"
4 The wisdom of modern physicists
(the quote continues)
"The objection that they are not crazy
enough applies to all the attempts which have so
far been launched at a radically new theory of
elementary particles. It applies especially to
crackpots. Most of the crackpot papers that are
submitted to the Physical Review are rejected,
not because it is impossible to understand them,
but because it is possible. Those that are
impossible to understand are usually published.
When the great innovation appears, it will almost
certainly be in a muddled, incomplete, and
confusing form. To the discoverer himself it
will be only half-understood. To everybody else
it will be a mystery. For any speculation that
does not at first glance look crazy, there is no
hope."
5 The wisdom of modern physicists
Why did two of the 20th century's leading
physicists behave so childishly? Their
wisdom is this: based on past experience, the
radical novelty of a proposed physical model
(of a unified field theory) is a necessary
prerequisite for it to be a serious contender.
6 The maturity of a science
Why did I begin the talk with the above quote?
I want to draw your attention to one very
important informal fact: the maturity of a
science is reflected in the ability of its
practitioners to estimate the quality of the
match between reality and its
model. This is one of the main messages I would
like you to keep in mind, and I hope we will
discuss it in this workshop. How is our field
doing in this respect?
7 The currently prevalent wisdom in our field
- The main (subconscious?) postulate: rely completely
on statistical models (and, therefore, on
the vector space formalism).
- In accordance with the above postulate, expect
that there exist some new, statistically
profound models/algorithms that would do a
satisfactory job.
- The innovative variant: take an apparently structural
representation, convert it to a numeric one (to
tame its structural elements), and reduce the
problem to the more familiar statistical setting
(in the process, misleading yourself and others
into believing that one is actually dealing with
the structural representation).
8 Why should this be the guiding wisdom?
Indeed, why should we confine ourselves to the
statistical framework? In their 1974 book,
similar (though somewhat rhetorical) doubts were
expressed by two of the present leaders in the
field of statistical pattern recognition, Vapnik
and Chervonenkis:
"In those days it appeared
that the pattern recognition problem carried
within itself the beginnings of some new idea,
which is in no way based on the system of old
concepts; researchers wanted to find new
formulations, not to reduce the problem to
already known mathematical schemes. In this
sense the reduction of the pattern recognition
problem to the scheme of average risk
minimization rouses some disappointment. True,
there are attempts to understand the problem in a
more complex formulation . . . . As yet,
however, such attempts are extremely rare."
9 Why should this be the guiding wisdom?
(the quote continues)
"Now, many years after the period of
pattern recognition romantics, it is
difficult to estimate what this problem
formalization brought. It is possible that
the desire to find a rigorous formulation
led scientists to restrict the meaningful problem,
the solution of which was attempted at the
beginning of the 60s."
10 Why should this be the guiding wisdom?
From the preface of Probability, Statistics and
Truth (1957), by one of the 20th century
pioneers of modern probability and statistics,
Richard von Mises:
"The stated purpose of
these [mentioned earlier] investigations is to
create a theory of induction or inductive
logic. According to the basic viewpoint
of this book, the theory of probability in its
application to reality is itself an inductive
science; its results and formulas cannot
serve to found the inductive processes as such
. . . ."
11 Why should this be the guiding wisdom?
From A. N. Kolmogorov (one of the founders of modern
probability theory), "Logical basis for
information theory and probability theory," IEEE
IT-14 (1968):
"The preceding rather
superficial discourse should prove two general
theses:
(1) Basic information theory
concepts must and can be founded without
recourse to the probability theory . . . .
(2) Introduced in this manner, information
theory concepts can form the basis
of the concept 'random', which would then
naturally suggest that the random is
the absence of periodicity."
12 Are we mature enough for the task?
Why do present-day physicists feel compelled
to venture (and on a big scale) into such highly
speculative theories as string theories, while
we are infatuated with the good old statistics
that simply cannot address the qualitative side
of modeling, i.e. the structure of the model
itself? Do we understand that an adequate
modeling of information processes cannot succeed
in the same manner as, for example, the modeling
of flight has succeeded (i.e. without capturing
the essence of the corresponding biological
processes)? In particular, do we understand
that without producing reasonably good models of
the information processes in nature we will not
succeed in developing satisfactory information
systems?
13 Are we mature enough for the task?
Are we mature enough for the task? God
forbid if we are not: from the very beginning
of our science, we are faced with modeling
much more abstract processes than physicists have
ever been
=> when modeling information
processes, we need even greater imagination than
physicists do (who, as I mentioned above, are
ahead of us in many ways). It seems quite
obvious to me that without some radically new
insights we are not going to get to any promised
land.
14 The (social) reason for the status quo
The above prevalent wisdom has not always been as
popular as it is today. One of the main reasons
for the status quo is the forgotten part of our
history, due to the emergence during the last
15-20 years of two new popular areas, neural
networks and machine learning. Both of them
deal with the same subject matter as pattern
recognition, however starting, basically, all over
again, and eventually rediscovering the
importance of symbolic representations. (In
contrast to pattern recognition, their professional
milieu is no longer engineering but
psychological and computational/statistical,
respectively, although both of them have attracted
many young physicists.)
15 The forgotten history: syntactic pattern
recognition
- In North America, one of the few early general
texts on pattern recognition, Pattern Recognition
Principles (1974), by Tou and Gonzalez, had the
last chapter (chapter 8) titled Syntactic
Pattern Recognition and considered structural
pattern representations.
- Among English-language books that came out in the 70s
and 80s and were devoted entirely to this topic, we
had those by Fu (Syntactic Pattern Recognition
and Applications), Grenander (Lectures in
Pattern Theory), Gonzalez and Thomason
(Syntactic Pattern Recognition), Watanabe
(Pattern Recognition) and several others.
16 The forgotten history: syntactic pattern
recognition
In the resulting excitement and during the making
of so many careers in the above two new sister
areas, some of the important lessons learned in
syntactic/structural pattern recognition were
lost, i.e. the critical role of (non-vector)
pattern representations and formalisms was
overlooked.
17 The forgotten history: syntactic pattern
recognition
- Syntactic pattern recognition
- Pioneers: Eden, Narasimhan, and Ledley
(published their initial work in the early 60s)
and others
- King-Sun Fu (of Purdue University; he was also
instrumental in the founding of IAPR and was its
first president) mounted a productive and
influential applied scientific program to shift
emphasis from the vector space based
representation to other, structural, forms of
representation, predominantly those associated
with formal grammars and their various
generalizations.
- Fu began his career in statistical pattern
recognition; later, in the 70s and early 80s,
he was largely responsible for the creation of a
burgeoning subfield of syntactic pattern
recognition, and his untimely death in 1985 had a
big impact on the vitality of this subfield.
19 The forgotten history: syntactic pattern
recognition
Narasimhan (1964): "The aim of any adequate
recognition procedure should not be merely to
arrive at a 'yes', 'no', 'don't know' decision
but to produce a structural description of the
input picture."
20 The forgotten history: syntactic pattern
recognition
There are applications of syntactic pattern
recognition to almost any field, from seismic oil
exploration to speech recognition, from face
recognition to fingerprint recognition.
21 The forgotten history: syntactic pattern
recognition
The main overlooked lesson from syntactic pattern
recognition (already noted by its pioneers) is
this: even in this incomplete form, structural
pattern and class representations have
substantial advantages over their vector space
counterparts, from both applied and theoretical
points of view. (However, see slides 23-26.)
22
Compared to the vector space representation of
the digitized image, under symbolic
representation, one moves immediately into a more
meaningful, higher level representation, with a
generative class description.
23 Syntactic pattern recognition: the unrealized
hopes
What is the problem, then? Why have these
advantages not yet materialized in a more apparent
manner?
Fundamental inadequacy of the (conventional) string
representation. Take a string
afdbaaccbdfaddbbcacbffacda
=> no temporal information is represented in it
(i.e. how the string was formed)
=> exponentially many candidate operations to
consider that could have been involved in the
generation of the string
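
A rough way to see the scale of the problem, under a simplifying assumption that is not part of the talk: suppose the only generative operation is the concatenation of two adjacent substrings. Even then, the number of distinct formation histories hidden behind a single string of length n is the number of binary bracketings of n symbols, which grows exponentially with n. The function name count_histories is hypothetical, and the sketch illustrates only this toy setting, not the ETS formalism.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_histories(n: int) -> int:
    """Count the ways a string of length n could have been assembled
    by repeatedly concatenating two adjacent pieces (toy assumption)."""
    if n <= 1:
        return 1  # a single symbol has exactly one (trivial) history
    # The last concatenation joined a left piece of length k
    # with a right piece of length n - k.
    return sum(count_histories(k) * count_histories(n - k) for k in range(1, n))


s = "afdbaaccbdfaddbbcacbffacda"  # the string from the slide
print(len(s), count_histories(len(s)))
```

For the 26-symbol string above this count is about 4.9 x 10^12, and any richer set of candidate operations only makes it larger.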
24 Syntactic pattern recognition: the unrealized
hopes
=> given a training set of strings, the
inductive learning process simply cannot reliably
recover the set of generative operations, i.e.
it cannot recover the class description
=> basic
inadequacy of the underlying formal structure of
the conventional syntactic (and, similarly, all
computational) formalisms: the link between the
class and object representations is too weak.
25 Syntactic pattern recognition: the unrealized
hopes
Thus, the conventional string is not an
adequate/reliable form of representation: there
are just too many formative object histories that
are hidden behind this representation. (A
related observation applies to graphs and various
numeric representations.)
26
A somewhat obvious problem (which is a consequence
of the above fundamental inadequacy) is the
presence of the second (spurious) alphabet of
the non-terminals.
27 We need to be wise
About 2500 years ago Democritus wrote: "Fools
can learn from their own experience; the wise
learn from the experience of others."
So, let's try to be wise and
learn as much as we can from the experience of
physicists, mathematicians, and biologists.
28 So how should we apply the wisdom of
physicists?
Going back to slide 5, since an incremental
wisdom has not really worked for our field, how
should we interpret a radical novelty for our
needs? I suggest that we should interpret it
in two (equally important) ways, both pointing
towards radically new forms of representation.
29 How should we apply the wisdom of mathematicians?
- First: why the representation?
- If we interpret correctly, from the applied point
of view, the wisdom of modern mathematics, we
would immediately accept that, from the
representational point of view, the data operations that are not
derivatives/compositions of the basic operations
(specified by the underlying axiomatic structure
of the data space) cannot be inductively
recovered/discovered.
30 How should we apply the wisdom of physicists?
Second:
- we should demand from the model a radical
explanatory novelty:
we should expect it to offer some basic insights
into the nature of information processes in the
Universe
- we should demand radical novelty in its formal
structure:
we should expect it to embody a radically new
formal structure
31 ETS formalism: its inspiration
From the very beginning, the ETS framework has
been inspired by the formal/aesthetic beauty and
power of a dynamic (and generalized) version of
the generative grammar model: to support an
evolving concept of class, one needs an evolving
set of transformations that captures the class
description and also modifies the corresponding
(evolving) mathematical structure on the
representation space. (In that sense, if
besides ETS there is another formal realization
of this vision, it should definitely be
investigated.)
32 ETS formalism: its inspiration
In mathematics, so far, we have been dealing with
various static (abstract) structures. For
example, in group theory, which does study the
subgroup lattice of a given group, there are,
quite naturally, no expectations that one
subgroup is obtained by modifying another
one. Even in a more continuous setting of a
topological space, there are again no
expectations that a topological structure itself
is evolving.
In contrast,
in the ETS formalism, some of the central building
blocks of the formal structure, the set of
transformations, are being modified on the basis
of the inductive experience.
33 ETS formalism: temporal information
Thus, it should not come as a surprise that, when
we came to the formalization stage about 5 years
ago, we had to begin literally from scratch.
The main difficulties have been (and will
continue to be) associated with the need to
introduce temporal information into a
structural representation, i.e. with the concept
of an object's formative/generative structural
history. And it is precisely this feature that
characterizes the radical departure from all
known mathematical paradigms.
34 ETS formalism: temporal information
Event environment versus object environment.
In State 1, three unbonded oxygen atoms are shown.
After the first real event has occurred, OA and
OB become bonded, and the corresponding ideal
event (primitive p1) is depicted on the right.
35 ETS formalism: temporal information
From Edward Witten, "Universe on a String,"
Astronomy, June 2002.
Note how one event (particle on the left or
string on the right) is immediately followed by
two events (two particles/strings).
36 ETS formalism: (class) primitive
transformations
[Figure labels: initial sites, time, terminal sites]
- Think of a primitive as an elementary
process that transforms the initial objects
into terminal ones; it is a symbolic notation
for a typically nontrivial process (structured
event).
- The circle and the square denote two site
types; letters a, b and x, y are names of the
variables that are allowed to vary over
non-overlapping sets of numeric labels.
- Brackets signify that we are, in fact,
dealing with a class of (original) primitives,
where each original primitive carries concrete
numeric labels.
37 ETS formalism: structs (segments of formative
history)
[Figure panels: number 3; representations of a more
general structural object]
Each p_i denotes an ETS primitive transformation
(the order in which the primitive transformations
are applied is captured in the representation).
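
To make the idea of an order-preserving representation concrete, here is a minimal sketch in Python. It is only a toy record of "which primitive was applied, in what order, to which labeled sites", and it is not the formal ETS definition of a struct: the names Primitive, Struct, apply and history, the site type names, and the numeric labels are all hypothetical, and a purely sequential list is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Primitive:
    """Toy stand-in for a class primitive: a name plus typed sites."""
    name: str
    initial: Tuple[str, ...]   # site types on the input side
    terminal: Tuple[str, ...]  # site types on the output side


@dataclass
class Struct:
    """Toy formative history: primitives in the order they were applied."""
    events: List[tuple] = field(default_factory=list)

    def apply(self, prim: Primitive,
              initial_labels: Sequence[int],
              terminal_labels: Sequence[int]) -> "Struct":
        # Each site of the primitive must receive one concrete numeric label.
        if len(initial_labels) != len(prim.initial):
            raise ValueError("wrong number of initial site labels")
        if len(terminal_labels) != len(prim.terminal):
            raise ValueError("wrong number of terminal site labels")
        self.events.append((prim, tuple(initial_labels), tuple(terminal_labels)))
        return self

    def history(self) -> List[str]:
        return [prim.name for prim, _, _ in self.events]


p1 = Primitive("p1", initial=("circle",), terminal=("circle", "square"))
p2 = Primitive("p2", initial=("square",), terminal=("square",))

s1 = Struct().apply(p1, [1], [2, 3]).apply(p2, [3], [4])
s2 = Struct().apply(p2, [3], [4]).apply(p1, [1], [2, 3])

print(s1.history())  # the order of application is part of the record
print(s1 == s2)      # same primitives, different order: not the same record
```

The point of the comparison at the end is that two records built from the same primitives in a different order are different objects, which is exactly the information a conventional string discards.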
38 ETS formalism: extructs (contexts)
- Examples of extructs: heavy lines identify the
interface sites and crosses identify detached
sites.
- Contexts should be thought of as parts of the
formative history that are necessary for the
presence of the (immediately following)
important segments of history.
39 ETS formalism: transformation
[Figure labels: context, body; the assembled
transform; formal definition]
40 ETS formalism: a supertransformation
A supertransform, τ (bold tau), is a
generalization of the concept of transformation,
and it can be thought of as an abstraction of
the set of several closely related and
inductively acquired transforms. Here, all
contexts have the same interface sites and all
bodies have the same initial and terminal sites.
41 ETS formalism: class supertransform (structural
class representation)
The class supertransform, τ, is obtained on
the basis of the supertransform, by abstracting
away the supertransform's site labels.
42 ETS formalism: (single level) class representation
- Class representation (associated with a class
supertransform τ) is defined as a pair
CLASS_τ = ( τ , CB_τ ) ,
- where CB_τ is the context-body association
strength scheme, or simply class
weight scheme,
CB_τ : (context, body) pairs from τ → R .
- (Obviously, τ is the main, structural, part
of the representation.)
43 ETS formalism: (structural) description of a
single representational level
Transformation system TS is simply a finite set
of class supertransforms
TS = { τ1, τ2, . . . , τm } .
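
The two definitions above (a class representation as a structural part paired with a context-body weight scheme, and a transformation system as a finite set of class supertransforms) can be sketched as plain data structures. This is an illustrative assumption about how one might hold them in code, not the formalism itself: contexts and bodies are reduced to opaque string names, and ClassSupertransform, ClassRepresentation, strength and the sample names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ClassSupertransform:
    """Structural part: here just named sets of contexts and bodies."""
    name: str
    contexts: FrozenSet[str]
    bodies: FrozenSet[str]


@dataclass
class ClassRepresentation:
    """The pair (structural part, context-body weight scheme)."""
    tau: ClassSupertransform
    weights: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def strength(self, context: str, body: str) -> float:
        # The weight scheme is only defined on pairs drawn from tau.
        if context not in self.tau.contexts or body not in self.tau.bodies:
            raise KeyError((context, body))
        # Assumption of this sketch: unrecorded pairs default to 0.0.
        return self.weights.get((context, body), 0.0)


tau1 = ClassSupertransform("tau1", frozenset({"c1", "c2"}), frozenset({"b1"}))
tau2 = ClassSupertransform("tau2", frozenset({"c3"}), frozenset({"b2", "b3"}))

cls1 = ClassRepresentation(tau1, {("c1", "b1"): 0.7})

# A transformation system: a finite set of class supertransforms.
transformation_system = frozenset({tau1, tau2})

print(cls1.strength("c1", "b1"), cls1.strength("c2", "b1"))
print(len(transformation_system))
```

Keeping the structural part immutable and the weights in a separate mapping mirrors the slide's remark that the structural part is the main one, with the numeric scheme attached to it.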
44 ETS formalism: transition to the next level (a
tentative form)
For each class supertransform in a transformation
system, we choose a canonical supertransform
(shown on the left) and construct the
corresponding next-level primitive (shown on the
right).
45 ETS formalism: transition to the next level
Simplified multi-level ETS representation with
different time scales for each level. Two
consecutive levels are shown. The time scale for
the higher level is measured in coarser units:
higher-level t0 corresponds to lower-level t0,
higher-level t1 corresponds to lower-level t2, and
higher-level t2 corresponds to lower-level t5.
The contexts of the
transformations are not identified.
46 ETS formalism: multi-level view
A multi-level representational tower with a
single-level sensor at level 0.
47 ETS formalism: multi-level view of class
representation
Pyramid view (partial) of a k-th level class
supertransform: the pyramid is formed by
all subordinate class supertransforms.
48 ETS model basics: the evolution of a class
Since any class is specified by a finite set of
weighted k-th level (for some k) transformations,
the class evolution is readily understood via
modification of the set of transformations
(structural change) and/or their weights
(quantitative change). And this is exactly what
you will observe in the functioning of the ETS
intelligent process, discussed in the next talk.
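
The two kinds of change named above can be illustrated with a deliberately crude sketch, in which a class is reduced to a mapping from transformation names to weights. The functions structural_change and quantitative_change, the sample names, and the choice of a starting weight of 0.0 for a newly added transformation are all assumptions made for illustration; the actual ETS intelligent process is not described in this talk.

```python
from typing import Dict, Iterable, Mapping

WeightedClass = Dict[str, float]  # transformation name -> weight (toy model)


def structural_change(cls: Mapping[str, float],
                      add: Iterable[str] = (),
                      remove: Iterable[str] = ()) -> WeightedClass:
    """Return a new class with transformations added and/or removed."""
    removed = set(remove)
    new_cls = {name: w for name, w in cls.items() if name not in removed}
    for name in add:
        new_cls.setdefault(name, 0.0)  # assumed starting weight
    return new_cls


def quantitative_change(cls: Mapping[str, float],
                        deltas: Mapping[str, float]) -> WeightedClass:
    """Return a new class with the same transformations, adjusted weights."""
    unknown = set(deltas) - set(cls)
    if unknown:
        raise KeyError(f"not in the class: {sorted(unknown)}")
    return {name: w + deltas.get(name, 0.0) for name, w in cls.items()}


c0 = {"t1": 1.0, "t2": 0.5}
c1 = structural_change(c0, add=["t3"], remove=["t2"])  # the set itself changes
c2 = quantitative_change(c1, {"t3": 0.25})             # only weights change

print(c0)
print(c1)
print(c2)
```

Both functions return new mappings rather than mutating their input, so successive stages of the evolving class remain available for comparison.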
49 The proceedings cover page
This is Metamorphosis III by Escher, which was
chosen for the cover page of the proceedings as
intimating an evolution of a class.
50 ETS formalism: representational completeness
A most distinguishing feature of this formalism
is its unprecedented representational completeness
and explicitness. This representational
completeness radically changes the formal side of
the modeling (the corresponding future
mathematics).
51 ETS formalism: the intelligent process
- In particular, in pattern recognition, the nature
of basic algorithms changes radically, as you
will see from the next talk: the processing is
now basically concerned with a careful
(algorithmic) examination of the input data
and recording of the results of such
examination in the ETS language.
- This is the job of the intelligent process
(which includes the learning and recognition
stages):
- the modification of the structural memory (at
various levels), i.e. of the class
supertransforms, and, occasionally, the introduction
of new levels
- the modification of the numeric memory (at
various levels), which is needed, at present, to
record the statistics related to various observed
recurring associations (between various
primitives as well as between the contexts and
the bodies).
52 Learning without representation?
- (See also Section 6 of my introductory paper in
the proceedings.)
- Within the vector space (VS) formalism, because
of a millennia-old tradition, it appears that
representation is easy but learning is
difficult.
- Within the ETS framework, the present
expectation is that representation is difficult
but learning is easy. (Not quite true.)
53 Learning without representation?
- However, since at the end of the VS learning
process we have
- no reliable/meaningful object or class
representations
- => no transferable knowledge about the
class objects is gained
- and, as a consequence,
the results of (carefully crafted) VS learning
algorithms can hardly be used for any information
processing needs other than classification,
- it is very misleading to call such a process
inductive learning.
- After all, induction is the only candidate we
have for the central intelligent process.
54 About our resources
Currently, the (financial) resources of our ETS
group are absolutely negligible, which is why it
is not surprising, given the scale of the
undertaking, that the more substantial
applications of the ETS formalism are still
ahead.
55 Inductive informatics
Nevertheless, we are an optimistic lot and have
big plans: we even introduced the term
"inductive informatics" for the new science
emerging around the development and various
applications of the ETS formalism (this is to
honor the ideas of the true prophet of
modern science, Francis Bacon). Moreover, on the
philosophical side, as many scientists and
non-scientists have noted, numeric formalisms
have failed to reveal the unity of nature,
so we believe that, as a general symbolic
formalism, ETS, when developed, promises to fit
the bill.
56 Conclusion
Going back to slide 6, I would like you to keep
in mind the question raised there. How mature
are we as a science: are we capable of estimating
the quality of the match between the (inductive)
information processes as they exist in nature and
our models of them? Let's come back to this
question periodically and during the workshop's
concluding open discussion.
57 Conclusion
- Radical explanatory novelty of the ETS model:
the relationship between an object and its class;
the nature of class and class description; the
temporal nature of class and object
descriptions; the nature of multilevel class
description; the nature of class evolution.
- Radical formal/structural novelty of the ETS model:
the first temporal form of structural
representation (a generalization of the Peano axioms); its
unprecedented representational completeness/transparency;
still ahead: the mathematics of such
structural representations (the concept of class,
rather than that of a set, will be the pivotal
point).
58 Conclusion
- Undoubtedly, the golden age of our field is
still ahead, and it appears that its arrival
depends on the choice of the right
representational formalism.
- The very heart of any scientific enterprise is
the construction of a fundamentally new model,
and much more rarely, of a new formalism.
- I do hope that some of you will participate in
this most exciting development.