Experience-Oriented Artificial Intelligence - PowerPoint PPT Presentation


Title: Experience-Oriented Artificial Intelligence


1
Experience-Oriented Artificial Intelligence
  • Rich Sutton
  • with special thanks to
  • Michael Littman, Doina Precup, Satinder Singh,
  • David McAllester, Peter Stone, Lawrence Saul, and
    Harry Browne

2
Experience matters!
  • Not in the obvious sense - that you have to do a
    thing many times to get good at it
  • But just in the sense that you do things,
  • that you live a life
  • that you take actions, receive sensations
  • that you pass through a trajectory of states
    over time
  • This is so obvious that it passes unnoticed
  • Like air, gravity

3
Experience is
  • the actions taken and the sensations received by the agent from its world
  • a continuing time sequence over the life of the agent
Experience is the minimal ontology
4
Experience matters, and must be respected
Experience matters because it is what life is all about.
Experience is the final common path, the only result of all that goes on in the
agent and world.
5
Experience matters computationally
  • Experience is the most prominent feature of the
    computational problem we call AI
  • It's the central data structure, revealed and
    chosen over time
  • It has a definite temporal structure
  • Order is important
  • Speed of decision is important
  • There is a continuous flow of long duration (a
    lifetime!)
  • not a sequence of isolated interactions, whose
    order is irrelevant
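
The idea of experience as the central data structure can be sketched in a few lines. This is only an illustration under stated assumptions: the names `Step` and `Experience` are hypothetical, and actions and sensations are reduced to a string and an integer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One time step of experience: the action taken, then the sensation received."""
    action: str      # hypothetical encoding of an action
    sensation: int   # hypothetical encoding of a sensation


@dataclass
class Experience:
    """An append-only, time-ordered record of an agent's life."""
    steps: List[Step] = field(default_factory=list)

    def record(self, action: str, sensation: int) -> None:
        # Experience is revealed and chosen over time, so steps are only appended.
        self.steps.append(Step(action, sensation))

    def recent(self, n: int) -> Tuple[Step, ...]:
        """The last n steps, oldest first; the order is part of the data."""
        return tuple(self.steps[-n:]) if n > 0 else ()


life = Experience()
for action, sensation in [("right", 0), ("right", 0), ("right", 1)]:
    life.record(action, sensation)
```

The record is a single continuing sequence rather than a set of isolated interactions, which is the distinction the slide draws.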

6
Experience in AI
Many, many AI systems have no experience. They don't have a life!
  • Expert Systems
  • Knowledge bases like CYC
  • Question-answering systems
  • Puzzle solvers, or any planner that is designed to receive
    problem descriptions and emit solutions
Part of the new popularity of agent-oriented AI is that it highlights experience.
Other AI systems have experience, but don't respect it.
7
Orienting around experience suggests radical changes in AI
  • Knowledge of the world should be knowledge of possible experiences
  • Planning should be about foreseeing and controlling experience
  • The state of the world should be a summary of past experience,
    relevant to future experience
Yet we rarely see these basic AI issues discussed in terms of experience.
Is it possible or plausible that they could be? Yes!
Would it matter if they were? Yes!
8
I am not claiming that knowledge comes from experience.
(I take no position on the nature/nurture controversy.)
But only that knowledge is about experience.
And that, given that, it should be predictive.
9
Key Points
  • Computational Theory vs. just making it work
  • What to compute and why
  • Experience is central to AI
  • Knowledge should be about experience
  • The minimal ontology
  • Grounding in experience from the bottom up
  • A computational theory of knowledge must support
  • Abstraction
  • Composition
  • Decomposition - Explicitness, verifiability
  • Such Modularity is the whole point of knowledge

10
Outline
  • Experience as central to AI
  • Predictive knowledge in general
  • Generalized Transition Predictions (GTPs, or
    option models)
  • Planning with GTPs (rooms-world example)
  • State as predictions (PSRs)
  • Prospects and conclusion

11
The I/O View of the World
We are used to taking an I/O view of the mind, of the agent
  • It does not matter what it is physically made of
  • What matters is what it does
So we should be willing to consider the same I/O view of the world
  • It does not matter what it is physically made of
  • What matters is what it does
The only thing that matters about the world is
the experience it generates
12
Then the only thing to know or say about the world is what experience it generates.
Thus, world knowledge must really be about future experience.
In other words, it must be a prediction.
13
AI could be about Predictions
  • Hypothesis: Knowledge is predictive
  • About what-leads-to-what, under what ways of
    behaving
  • What will I see if I go around the corner?
  • Objects: What will I see if I turn this over?
  • Active vision: What will I see if I look at my
    hand?
  • Value functions: What is the most reward I know
    how to get?
  • Such knowledge is learnable, chainable,
    verifiable
  • Hypothesis: Mental activity is working with
    predictions
  • Learning them
  • Combining them to produce new predictions
    (reasoning)
  • Converting them to action (planning,
    reinforcement learning)
  • Figuring out which are most useful

14
Philosophical and Psychological Roots
  • Like classical British empiricism (1650-1800)
  • Knowledge is about experience
  • Experience is central
  • But not anti-nativist (evolutionary experience)
  • Emphasizing sequential rather than simultaneous
    events
  • Replace association/contiguity with
    prediction/contingency
  • Close to Tolman's Expectancy Theory (1932-1950)
  • Cognitive maps, vicarious trial and error
  • Psychology struggled to make it a science
    (1890-1950)
  • Introspection
  • Behaviorism, operational definitions
  • Objectivity

15
Tolman & Honzik, 1930: Reasoning in Rats
[Figure: maze with a start box, Path 1, Path 2, Path 3, Block A, Block B, and a food box]
16
An old, simple, appealing idea
  • Mind as prediction engine!
  • Predictions are learnable, combinable
  • They represent cause and effect, and can be
    pieced together to yield plans
  • Perhaps this old idea is essentially correct.
  • Just needs
  • Development, revitalization in modern forms
  • Greater precision, formalization, mathematics
  • The computational perspective to make it
    respectable
  • Imagination, determination, patience
  • Not rushing to performance

17
Outline
  • Experience as central to AI
  • Predictive knowledge in general
  • Generalized Transition Predictions (GTPs, or
    option models)
  • Planning with GTPs (rooms-world example)
  • State as predictions (PSRs)
  • Prospects and conclusion

18
Machinery for General Transition Predictions
  • In steps of increasing expressiveness
  • Simple state-transition predictions
  • Mixtures of predictions
  • Closed-loop termination
  • Closed-loop action conditioning

19
The Simplest Transition Predictions
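[Figure: experience as a sequence of states and actions]
  • 1-step prediction: from state A, under action a, to state B
  • k-step prediction: from state A, under policy p, to state B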
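
A minimal sketch of 1-step and k-step state-transition predictions, assuming a known tabular model. The three-state world, its probabilities, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical model: P[action][state] maps next states to probabilities.
STATES = ["A", "B", "C"]
P = {
    "a": {
        "A": {"A": 0.1, "B": 0.9},  # action a usually takes A to B
        "B": {"C": 1.0},            # and always takes B to C
        "C": {"C": 1.0},            # C is absorbing
    }
}


def one_step(dist, policy):
    """Push a distribution over states one step forward under a policy."""
    nxt = {s: 0.0 for s in STATES}
    for s, prob in dist.items():
        for s2, p in P[policy[s]][s].items():
            nxt[s2] += prob * p
    return nxt


def k_step(start, policy, k):
    """Predicted distribution over states k steps after `start`."""
    dist = {s: (1.0 if s == start else 0.0) for s in STATES}
    for _ in range(k):
        dist = one_step(dist, policy)
    return dist


policy = {"A": "a", "B": "a", "C": "a"}  # a policy maps each state to an action
one = k_step("A", policy, 1)   # the 1-step prediction from A
two = k_step("A", policy, 2)   # the 2-step prediction from A
```

The 1-step prediction is the special case k = 1; the k-step case simply chains it.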
20
Mixtures of k-step Predictions: Terminating over
a period of time
  • Where will I be in 10-20 steps?
  • Where will I be in roughly k steps?
[Figure: termination profiles over the time steps of interest, from now to k=10 and k=20 steps, and around k steps; short-term, medium-term, and long-term profiles]
Arbitrary termination profiles are possible
But sometimes anything like this is too loose and
sloppy...
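
One way to read a termination profile is as a set of weights over k, so that the mixture prediction is a weighted sum of k-step predictions. The sketch below assumes that reading; the four-state chain and the name `mixture_prediction` are hypothetical.

```python
def step(dist, P):
    """Advance a distribution over states by one step of the transition matrix P."""
    n = len(P)
    return [sum(dist[s] * P[s][s2] for s in range(n)) for s2 in range(n)]


def mixture_prediction(start, P, profile):
    """Weighted mixture of k-step predictions.

    profile maps k (number of steps) to the probability of terminating
    after exactly k steps; the weights are assumed to sum to 1.
    """
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    mix = [0.0] * n
    for k in range(1, max(profile) + 1):
        dist = step(dist, P)
        weight = profile.get(k, 0.0)
        mix = [m + weight * d for m, d in zip(mix, dist)]
    return mix


# Hypothetical deterministic chain 0 -> 1 -> 2 -> 3, with state 3 absorbing.
CHAIN = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]

# "Where will I be in 1 or 2 steps?" with equal weight on each horizon.
where = mixture_prediction(0, CHAIN, {1: 0.5, 2: 0.5})
```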
21
Closed-loop Termination
  • Terminate depending on what happens
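  • E.g., instead of "Will I finish this report soon?",
    which uses a soft termination profile
  • Use "Will I be done when my boss gets here?"

[Figure: two plots of termination probability against time. Soft profile: probably in about an hour, spread around 1 hr. Closed-loop: probability goes from 0 to 1 when the boss arrives; only one precise but uncertain time matters.]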
22
Closed-loop termination allows time specification
to be both flexible and precise
  • Instead of "what will I see at t=100?"
  • Can say "what will I see when I open the box?"
  • Will we elect a black or a woman president first?
  • Where will the tennis ball be when it reaches me?
  • What time will it be when the talk starts?
    or when John arrives? when the bus comes? when I get to the store?

A substantial increase in expressiveness
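
A closed-loop termination can be sketched as a state-dependent stopping probability, here called `beta` (a name assumed for illustration, not taken from the slides). The prediction is then a distribution over where the experiment stops, plus the expected time it takes. The three-state "open the box" world below is hypothetical.

```python
def closed_loop_prediction(start, P, beta, max_steps=1000):
    """Predict the outcome of an experiment that stops depending on what happens.

    P[s][s2] is the transition probability under the experiment's policy and
    beta[s] is the probability of terminating on arrival in state s.
    Returns (distribution over terminating states, expected elapsed steps).
    """
    n = len(P)
    alive = [0.0] * n          # probability mass still running, by state
    alive[start] = 1.0
    outcome = [0.0] * n
    expected_time = 0.0
    for t in range(1, max_steps + 1):
        moved = [sum(alive[s] * P[s][s2] for s in range(n)) for s2 in range(n)]
        stopped = [m * beta[s2] for s2, m in enumerate(moved)]
        outcome = [o + x for o, x in zip(outcome, stopped)]
        expected_time += t * sum(stopped)
        alive = [m - x for m, x in zip(moved, stopped)]
        if sum(alive) < 1e-12:  # essentially everything has terminated
            break
    return outcome, expected_time


# Hypothetical world: state 0 = still fumbling with the box,
# state 1 = box open, ball inside; state 2 = box open, empty.
BOX = [
    [0.5, 0.3, 0.2],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
BETA = [0.0, 1.0, 1.0]  # terminate exactly when the box is open

seen, when = closed_loop_prediction(0, BOX, BETA)
```

The time of termination is uncertain (about two steps on average here), yet the question "what will I see when I open the box?" is precise.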
23
Closed-loop Action Conditioning
  • Each prediction has a closed-loop policy
    Policy: States -> Actions (or Probs.)
  • If you follow the policy, then you predict and
    verify
  • Otherwise not
  • If partly followed, temporal-difference methods
    can be used

24
General Transition Predictions (GTPs)
Closed-loop terminations and policies
Correspond to arbitrary experiments and the results of those experiments
  • What will I see if I go into the next room?
  • What time will it be when the talk is over?
  • Is there a dollar in the wallet in my pocket?
  • Where is my car parked?
  • Can I throw the ball into the basket?
  • Is this a chair situation?
  • What will I see if I turn this object around?
25
Anatomy of a General Transition Prediction
1. Predictor: recognizes the conditions, makes the prediction
2. Experiment:
  • policy
  • termination condition
  • measurement function(s)
[Figure labels: States, Actions, Measurement space, knowledge, verifier]
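
The parts named on this slide can be put together in a small sketch. Everything here is an assumption made for illustration: the class name `GTP`, the field names, the deterministic corridor world, and the choice of "steps until the wall" as the measurement.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GTP:
    """A prediction paired with the experiment that could verify it."""
    predictor: Callable[[int], float]      # state -> predicted measurement
    policy: Callable[[int], int]           # state -> action
    terminate: Callable[[int], bool]       # state -> should the experiment stop?
    measure: Callable[[int, int], float]   # (final state, elapsed steps) -> measurement


def run_experiment(gtp, state, env_step, max_steps=100):
    """Follow the policy until termination, then take the measurement."""
    steps = 0
    while not gtp.terminate(state) and steps < max_steps:
        state = env_step(state, gtp.policy(state))
        steps += 1
    return gtp.measure(state, steps)


def verify(gtp, state, env_step):
    """Prediction error: predicted measurement minus the measured outcome."""
    return gtp.predictor(state) - run_experiment(gtp, state, env_step)


# Hypothetical corridor with positions 0..5 and a wall at position 5.
def corridor_step(state, action):
    return min(max(state + action, 0), 5)


# "How many steps until I reach the wall if I keep moving right?"
steps_to_wall = GTP(
    predictor=lambda s: float(5 - s),
    policy=lambda s: +1,
    terminate=lambda s: s == 5,
    measure=lambda s, steps: float(steps),
)

error = verify(steps_to_wall, 2, corridor_step)
```

Because the experiment is part of the prediction, the agent can check the prediction itself by running it, without a person interpreting the result.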
26
Room-to-Room GTPs (General Transition
Predictions)
Sutton, Precup, & Singh, 1999
Options: Precup, 2000; Sutton, Precup, & Singh, 1999
[Figure: rooms gridworld showing a policy, a target (goal) hallway, and termination hallways]
  • 4 stochastic primitive actions: up, down, right, left
  • Fail 33% of the time
  • 8 multi-step GTPs (to each room's 2 hallways)
  • Predict: probability of reaching each terminal hallway
  • Goal: minimize steps; values for target and other outcome hallway
27
Example Open-the-door
  • Predictor Use visual input to estimate
  • Probabilities of succeeding in opening the door,
    and of other outcomes (door locked, no handle, no
    real door)
  • expected cumulative cost (sub-par reward) in
    trying
  • Experiment
  • Policy for walking up to the door, shaping grasp
    of handle, turning, pulling, and opening the door
  • Terminate on successful opening or various
    failure conditions
  • Measure outcome and cumulative cost

28
Example RoboCup Soccer Pass
  • Predictor uses perceived positions of ball,
    opponents, etc. to estimate probabilities of
  • Successful pass, openness of receiver
  • Interception
  • Reception failure
  • Aborted pass, in trouble
  • Aborted pass, something better to do
  • Loss of time
  • Experiment
  • Policy for maneuvering ball, or around ball, to
    set up and pass
  • Termination strategy for aborting, recognizing
    completion
  • Measurement of outcome, time

29
Outline
  • Experience as central to AI
  • Predictive knowledge in general
  • Generalized Transition Predictions (GTPs, or
    option models)
  • Planning with GTPs (rooms-world example)
  • State as predictions (PSRs)
  • Prospects and conclusion

30
Combining Predictions
  • If the mind is about predictions,
  • Then thinking is combining predictions to produce
    new ones
  • Predictions obviously compose
  • If A->B and B->C, then A->C
  • GTPs are designed to do this generally
  • Fit into Bellman equations of semi-Markov
    extensions of dynamic programming
  • Can also be used for simulation-based planning
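
The composition rule can be sketched for stochastic predictions that also carry an elapsed-time estimate. The representation below (a dictionary from start state to an outcome distribution and an expected duration) and the numbers are assumptions for illustration; discounting is left out to keep the sketch short.

```python
from typing import Dict, Tuple

# A model maps a start state to (distribution over outcome states, expected steps).
Model = Dict[str, Tuple[Dict[str, float], float]]


def compose(first: Model, second: Model, start: str) -> Tuple[Dict[str, float], float]:
    """Predict the result of running `first`, then `second` wherever it applies."""
    outcomes, elapsed = first[start]
    result: Dict[str, float] = {}
    total_time = elapsed
    for mid, p in outcomes.items():
        if mid in second:
            # The second prediction starts here: chain probabilities, add time.
            sub_outcomes, sub_time = second[mid]
            total_time += p * sub_time
            for end, q in sub_outcomes.items():
                result[end] = result.get(end, 0.0) + p * q
        else:
            # The second prediction does not apply; the outcome stands as is.
            result[mid] = result.get(mid, 0.0) + p
    return result, total_time


# Hypothetical predictions: A usually leads to B, and B usually leads to C.
a_to_b: Model = {"A": ({"B": 0.8, "A": 0.2}, 3.0)}
b_to_c: Model = {"B": ({"C": 0.8, "B": 0.2}, 5.0)}

a_to_c, a_to_c_time = compose(a_to_b, b_to_c, "A")
```

The outcome distribution plays the role of a final measurement and the elapsed time that of a transient one: probabilities multiply along the chain, while time accumulates.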

31
Composing Predictions
[Figure: a prediction from A to B and a prediction from B to C compose into a prediction from A to C]
  • Final measurement (e.g., partial distribution
    of outcome states)
  • Transient measurement (e.g., elapsed time,
    cumulative reward)
32
Composing Predictions
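[Figure: two stochastic predictions, labeled 1 (A to B) and 2 (B to C), with outcome probabilities .8 and .1 shown, composed as "1, then if B, 2" into a prediction from A to C]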
33
Room-to-Room GTPs (General Transition
Predictions)
Sutton, Precup, & Singh, 1999
Options: Precup, 2000; Sutton, Precup, & Singh, 1999
[Figure: rooms gridworld showing a policy, a target (goal) hallway, and termination hallways]
  • 4 stochastic primitive actions: up, down, right, left
  • Fail 33% of the time
  • 8 multi-step GTPs (to each room's 2 hallways)
  • Predict: probability of reaching each terminal hallway
  • Goal: minimize steps; values for target and other outcome hallway
34
Planning with GTPs
35
Learning Path-to-Goal with and without GTPs
[Figure: comparison of three conditions: primitives, GTPs and primitives, GTPs]
36
Rooms Example: Simultaneous Learning of all 8
GTPs from their Goals
[Figure: plot labeled "goal prediction", with one axis running from 0 to 0.4 and the other from 0 to 100,000]
All 8 hallway GTPs were learned accurately and
efficiently while actions are selected totally at
random
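
Learning a prediction about one policy while behaving randomly can be sketched with a temporal-difference update that is applied only on steps where the random action happens to match the prediction's policy. This is a simplified illustration, not the method or the rooms world used for the result above: the deterministic corridor, the step size, and all names are assumptions.

```python
import random

N = 6            # hypothetical corridor with states 0..5
GOAL = N - 1     # the prediction's experiment terminates at state 5
ALPHA = 0.5      # step size (an arbitrary choice)


def env_step(state, action):
    """Deterministic corridor dynamics; moves are clipped at both ends."""
    return min(max(state + action, 0), N - 1)


def gtp_policy(state):
    """The policy the prediction is about: always move right."""
    return +1


def learn_steps_to_goal(num_steps=5000, seed=0):
    """Learn 'steps to reach the goal when moving right' from random behavior."""
    rng = random.Random(seed)
    pred = [0.0] * N
    state = 0
    for _ in range(num_steps):
        action = rng.choice((-1, +1))       # behavior is totally random
        nxt = env_step(state, action)
        # Update only when the behavior agrees with the prediction's policy.
        if state != GOAL and action == gtp_policy(state):
            target = 1.0 + (0.0 if nxt == GOAL else pred[nxt])
            pred[state] += ALPHA * (target - pred[state])
        state = nxt                          # life continues; no episode resets
    return pred


predictions = learn_steps_to_goal()
```

In this toy world the learned values approach the true distances 5, 4, 3, 2, 1, 0, even though the agent never follows the go-right policy for long.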
37
Outline
  • Experience as central to AI
  • Predictive knowledge in general
  • Generalized Transition Predictions (GTPs, or
    option models)
  • Planning with GTPs (rooms-world example)
  • State as predictions (PSRs)
  • Prospects and conclusion

38
Predictive State Representations
  • Problem: So far we have assumed states, but the world
    really just gives information, observations
  • Hypothesis: What we normally think of as state is
    a set of predictions about outcomes of
    experiments
  • Wallet's contents, John's location, presence of
    objects
  • Prior work
  • Learning deterministic FSAs - Rivest & Schapire,
    1987
  • Adding stochasticity: an alternative to HMMs -
    Herbert Jaeger, 1999
  • Adding action: an alternative to POMDPs -
    Littman, Sutton, & Singh, 2001

39
Summary of Results for Predictive State Representations
(PSRs)
  • There exist compact, linear PSRs
  • # tests <= # states in minimal POMDP
  • # tests <= Rivest & Schapire's diversity
  • # tests can be exponentially fewer than diversity
    and POMDP
  • Compact simulation/update process
  • Construction algorithm from POMDP
  • Learning/discovery algorithms of Rivest and
    Schapire, and of Jaeger, do not immediately
    extend to PSRs
  • There are natural EM-like algorithms (current
    work)
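
The compact update process for a linear PSR can be written as a single equation. The notation is assumed here rather than taken from the slides: Q is the set of core tests, p(Q | h) is the vector of their predictions after history h, and m_{ao} and m_{a o q_i} are the weight vectors that express the predictions for the tests ao and a o q_i as linear functions of p(Q | h).

```latex
% After taking action a and observing o, each core-test prediction is renewed by
p(q_i \mid h a o)
  \;=\; \frac{p(a o q_i \mid h)}{p(a o \mid h)}
  \;=\; \frac{p(Q \mid h)^{\top} m_{a o q_i}}{p(Q \mid h)^{\top} m_{a o}}
```

Only the current prediction vector and the fixed weight vectors are needed, so the state is carried forward without reference to any hidden state.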

40
Empty Gridworld with Local Sensing
Four actions: Up, Down, Right, Left
And four sensory bits
41
Distance to Wall Predictions
0 R, 0 RR, 1 RRR, 1 RRRR, . . .
0 D, 1 DD, 1 DDD, . . .
[Figure: meaning of predictions]
  • 4 GTPs suffice to identify each state
  • More needed to update PSR
  • Many more are computed from PSR
Predictive State Representation (PSR)
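
The distance-to-wall predictions above can be reproduced in a small sketch. It assumes one particular reading of the slide: a test is a sequence of actions, and its prediction is the wall-sensing bit in the direction of the last action once the sequence has been taken. The grid size, the agent's position, and the function names are hypothetical.

```python
WIDTH, HEIGHT = 6, 5   # hypothetical size of the empty gridworld
MOVES = {"U": (0, -1), "D": (0, 1), "R": (1, 0), "L": (-1, 0)}


def move(pos, action):
    """Take one step; bumping into a wall leaves the position unchanged."""
    x, y = pos
    dx, dy = MOVES[action]
    return (min(max(x + dx, 0), WIDTH - 1), min(max(y + dy, 0), HEIGHT - 1))


def wall_bit(pos, direction):
    """Local sensing: 1 if a wall is immediately adjacent in `direction`, else 0."""
    return int(move(pos, direction) == pos)


def predict(pos, test):
    """Prediction for a test such as 'RRR': the wall bit seen after the actions."""
    for action in test:
        pos = move(pos, action)
    return wall_bit(pos, test[-1])


here = (2, 2)  # three cells from the right wall, two from the bottom wall
right_predictions = [predict(here, t) for t in ("R", "RR", "RRR", "RRRR")]
down_predictions = [predict(here, t) for t in ("D", "DD", "DDD")]
```

From this position the sketch yields 0, 0, 1, 1 for the rightward tests and 0, 1, 1 for the downward tests, matching the pattern on the slide.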
42
Suppose we add one non-uniformity
0 R, 0 RR, 1 RRR, 1 RRRR, . . .
0 D, 1 DD, 1 DDD, . . .
Now there is much more to know.
It would be challenging to program it all correctly.
43
Other Extension Ideas
  • Stochasticity
  • Egocentric motion
  • Multiple Rooms
  • Second agent
  • Moveable objects
  • Transient goals

It's easy to make such problems arbitrarily
challenging
44
Outline
  • Experience as central to AI
  • Predictive knowledge in general
  • Generalized Transition Predictions (GTPs, or
    option models)
  • Planning with GTPs (rooms-world example)
  • State as predictions (PSRs)
  • Prospects and conclusion

45
How Could These Ideas Proceed?
  • Build systems! Build Gridworlds!
  • A performance orientation would be problematic
  • The Knowledge Representation guys may not be
    impressed
  • But others I think will be very interested and
    appreciative - throughout modern probabilistic AI

46
The Experience Manifesto
  • Experience is the input and output of AI
  • An AI must have experience; it must have a life!
  • Knowledge is about experience
  • Not about objects, or people, or space, or time, except in
    so far as these things can be restated in terms of experience
  • Knowledge is well expressed as predictions of experience
  • Predictions of experience have a much clearer meaning than
    any previously proposed kind of knowledge
  • Predictions of experience can be autonomously verified
  • Predictive knowledge is completely in the machine, not in a person!
  • Planning is about composing predictions to search through
    the space of attainable experiences
  • World-state representations are also predictions of experience
47
Key Points
  • We should not try to fake intelligence or
    understanding
  • Computational Theory vs. just making it work
  • What to compute and why
  • Experience is central to AI
  • Knowledge should be about experience
  • The minimal ontology
  • Grounding in experience from the bottom up
  • A computational theory of knowledge must support
  • Abstraction
  • Composition
  • Decomposition - Explicitness, verifiability
  • Such Modularity is the whole point of knowledge

48
Summary of the Predictive View of AI
  • Knowledge is Predictions
  • About what-leads-to-what, under what ways of
    behaving
  • Such knowledge is learnable, chainable
  • Mental activity is working with predictions
  • Learning them
  • Combining them to produce new predictions
    (reasoning)
  • Converting them to action (planning,
    reinforcement learning)
  • Figuring out which are most useful
  • Predictions are verifiable
  • A natural way to self-maintain knowledge, which
    is essential for scaling AI beyond programming
  • Most of the machinery is simple but potentially
    powerful
  • Is it powerful enough?