Computational Discovery of Communicable Knowledge


Learn more at: http://www.isle.org

1
Challenges in Learning Plan Knowledge
Pat Langley
School of Computing and Informatics, Arizona State University, Tempe, Arizona USA
Institute for the Study of Learning and Expertise, Palo Alto, California USA
Thanks to D. Choi, T. Konik, U. Kutur, N. Li, D. Nau, N. Nejati, and D. Shapiro for their many
contributions. This talk reports research funded by grants from DARPA IPTO, which is not
responsible for its contents.
2
Outline of the Talk

Brief review of learning plan knowledge
Learning from different sources
Learning for new performance tasks
Learning in different scenarios
Learning with novel representations
Some responses to these challenges
Concluding remarks

3
The Problem: Learning Plan Knowledge

Given: Basic knowledge about some action-oriented domain (e.g., state/goal representation, operators)
Given: A set of training problems (e.g., initial states, goals, and possibly more)
Given: Some performance task that the system must carry out
Given: A performance mechanism that can use knowledge to carry out that task
Learn: Knowledge that will let the system improve its ability to perform new tasks from the same
or similar domain

4
Topics Not Covered
This talk will range widely, but I will not cover
issues related to

Learning with impoverished representations
  Interested in human-like, intelligent behavior
  Most work on reinforcement learning is irrelevant
Acquiring basic knowledge about domain
  Interested in building on such knowledge
  Most work on learning action models is too basic
Nonincremental learning from large data sets
  Interested in human-like incremental learning
  This rules out most data-mining approaches

5
Historical Topics
There has been a long history of work on learning
plan knowledge

Forming macro-operators
  Fikes et al. (1972), Iba (1988), Mooney (1989), Botea et al. (2005)
Inducing forward-chaining control rules
  Anzai & Simon (1978), Mitchell et al. (1981), Langley (1982)
Learning control rules analytically
  Laird et al. (1986), Mitchell et al. (1986), Minton (1988)
Problem solving by analogy
  Veloso (1994), Jones & Langley (1995), VanLehn & Jones (1994)
Inducing control rules for partial-order plans
  Katukam & Kambhampati (1994), Estlin & Mooney (1997)

6
Historical Trends
Work on learning plan knowledge has seen many
shifts in fashion

Early hope for improving problem solvers/planners (1978-1985)
Excitement/confusion introduced by EBL movement (1986-1992)
Some doubts raised by the utility problem (1988-1993)
Mass migration to reinforcement learning paradigm (1993-2003)
Resurgence of interest in learning plan knowledge (2004-present)

Throughout these changes, the problems and
potential of learning plan knowledge have
remained.
7
Traditional Sources of Information
Most research on learning for planning has
assumed the system uses search to generate

Successful paths that achieve the goals (positive
instances)
Failed paths that do not achieve the goals
(negative instances)
Alternative paths of different desirability
(preferred instances)

But humans learn from other sources of
information and our AI systems should as well.
8
Challenge: Learn from Many Sources
There has been relatively little research on plan
learning from

Demonstrations of solved problems (Nejati et al.,
2006)
Explicit instruction from teacher (Blythe et al.,
2007)
Advice or hints from teacher (Mostow, 1983)
Mental simulations or daydreaming (Mueller, 1985)
Undesirable side effects during execution

Humans learn from all of these sources, and our
learning systems should support the same
capabilities. Moreover, we should develop
single systems that integrate plan knowledge
learned from all of them (Oblinger, 2006).
9
Traditional Performance Tasks
Most research on learning for planning has
assumed the system aims to improve

The efficiency of plan generation (nodes
expanded, time)
The quality of generated plans (path length,
utility)
The coverage of plan knowledge (problems solved)

But humans learn and use plan knowledge for
other purposes that are just as valid.
10
Challenge: Learn for Plan Execution
Many important domains require executing plan
knowledge in some environment that includes

operators with likely but nonguaranteed effects
external events not directly under the agent's control
other agents that are pursuing their own goals

Urban driving is one setting that raises all
three of these issues. Complex board games like
chess, although deterministic, still require
interleaving of planning and execution. We need
more research on plan learning in contexts of
this sort (e.g., Benson, 1995; Fern et al., 2004).
11
Challenge: Learn for Plan Understanding
Another understudied problem is learning for plan
understanding.

Given: A partially observed sequence of states influenced by another agent's actions
Given: Learned knowledge about how to achieve goals
Find: The other agent's goals and the plans it is pursuing to achieve them

Plan understanding is important not only in
complex games, but in military planning,
politics, and other settings. This performance
task suggests new learning problems, methods, and
evaluation criteria.
12
Traditional Learning Scenarios
Most research on learning for planning has
assumed the system

Trains on problems from a given distribution /
domain
Tests on problems from the same distribution /
domain

Success depends on the extent to which the
learner generalizes well to new problems from the
same domain. But humans also use their learned
plan knowledge in other, more flexible ways to
improve performance.
13
Challenge: Cumulative Learning
In complex domains, humans learn plan knowledge
gradually

Starting with small, relatively easy problems
Moving to complex problems after mastering
simpler ones

Later acquisitions build naturally on earlier
experience, leading to cumulative learning.
Our education system depends heavily on such
vertical transfer of learned knowledge. We
need more learning systems that demonstrate this
form of cumulative improvement (e.g., Reddy &
Tadepalli, 1997).
14
Challenge: Cross-Domain Transfer
In other cases, humans exhibit a form of transfer
that involves

Learning to solve problems in one domain
Reusing this knowledge to solve problems in
another domain that is superficially quite
different

Such cross-domain transfer is related to
within-domain analogical reasoning, but it is far
more challenging. In its extreme form, the two
domains support similar solutions but have no
shared symbols or predicates. We need more
learning systems that demonstrate this radical
form of knowledge reuse.
15
Traditional Learned Representations
Most research on learning for planning has
focused on learning

Control rules that reduce effective branching
factor
Macro-operators that reduce effective solution
depth

These grew naturally from representations used to
create hand-crafted expert problem solvers. But
now we have other representations of plan
knowledge that suggest new learning tasks and
methods. This does not refer to POMDPs,
workflows, or other highly constrained
formalisms.
16
Challenge: Learn HTNs
Hierarchical task networks (HTNs) offer the most
effective planning available, but they are
expensive to build manually. HTNs provide an
ideal target for learning because they have

the modularity and flexibility of search-control
rules
the large-scale structure of macro-operators

Machine learning has automated the creation of
expert classifiers. We should do the same for
HTNs, which are effectively expert planning
systems.
17
Challenge: Learn HTNs
We can define the task of learning hierarchical
task networks as

Given: Basic knowledge about some action-oriented domain
Given: A set of training problems (initial states and goals)
Given: Some performance task the system must carry out
Given: Some module that uses HTNs to perform this task
Learn: An HTN that lets the system improve its performance on new tasks from the same or
similar domain

We need more research on this important topic
(e.g., Reddy & Tadepalli, 1997; Ilghami et al.,
2005).
18
Some Responses
Our recent research attempts to respond to these
challenges by developing methods that

acquire a constrained but important class of HTNs
that one can use for both planning and reactive control
from both successful problem solving and expert traces
that extend naturally to support cross-domain transfer

Moreover, these ideas are embedded in an
integrated architecture that supports many
capabilities: ICARUS (Langley, 2006).
19
Conceptual Knowledge in ICARUS
Nonprimitive Concept: (patient-form-filled ?patient)
Primitive Concept: (assigned-mission ?patient ?mission)

Conceptual knowledge is cast as Horn clauses that
specify relevant relations in the environment
Memory is organized hierarchically
Divided into primitive and non-primitive predicates
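
To make the Horn-clause reading above concrete, here is a minimal Python sketch of bottom-up concept inference over ground beliefs. It is an illustration only, not the ICARUS matcher: the rule format, the helper names (`unify`, `match_all`, `infer`), and the concept definitions (including `form-signed` and `ready-for-transfer`) are hypothetical. Only the predicate names `patient-form-filled` and `assigned-mission` come from the slide.

```python
from typing import Dict, List, Optional, Set, Tuple

Literal = Tuple[str, ...]      # e.g. ("assigned-mission", "?patient", "?mission")
Bindings = Dict[str, str]
Rule = Tuple[Literal, List[Literal]]   # (head, body): head holds if all body literals hold


def unify(pattern: Literal, fact: Literal, bindings: Bindings) -> Optional[Bindings]:
    """Match one pattern literal against one ground fact, extending the bindings."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    new = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            if new.setdefault(p, f) != f:   # variable already bound to something else
                return None
        elif p != f:
            return None
    return new


def match_all(body: List[Literal], facts: Set[Literal], bindings: Bindings) -> List[Bindings]:
    """Return every set of bindings that satisfies all literals in the body."""
    if not body:
        return [bindings]
    results: List[Bindings] = []
    for fact in facts:
        extended = unify(body[0], fact, bindings)
        if extended is not None:
            results.extend(match_all(body[1:], facts, extended))
    return results


def infer(rules: List[Rule], percepts: Set[Literal]) -> Set[Literal]:
    """Forward-chain over the rules until no new beliefs appear."""
    beliefs = set(percepts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for b in match_all(body, beliefs, {}):
                new_belief = tuple(b.get(term, term) for term in head)
                if new_belief not in beliefs:
                    beliefs.add(new_belief)
                    changed = True
    return beliefs


# Hypothetical two-level hierarchy: a higher concept defined over a lower one.
rules: List[Rule] = [
    (("ready-for-transfer", "?p"), [("patient-form-filled", "?p")]),
    (("patient-form-filled", "?p"),
     [("assigned-mission", "?p", "?m"), ("form-signed", "?p")]),
]
percepts = {
    ("assigned-mission", "P1", "M1"),
    ("form-signed", "P1"),
    ("assigned-mission", "P2", "M1"),
}
beliefs = infer(rules, percepts)
```

Here primitive concepts appear as the input facts, and nonprimitive concepts are the rule heads derived from them, so the hierarchy shows up as rules whose bodies mention other rules' heads.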

20
HTN Methods in ICARUS
[Diagram labels: HTN goal concept, subgoal, HTN method, precondition concept, operator]

Similar to SHOP2 but methods indexed by goals they achieve
Each method decomposes a goal into subgoals
If a method's goal is active and its precondition
is satisfied, then try to achieve its subgoals or
apply its operators
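
The selection rule above (goal active, precondition satisfied, then pursue subgoals or operators) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ICARUS interpreter: goals and preconditions are plain ground strings without variables, beliefs are static (operator effects are not simulated), and the `Method` class, the function names, and the example methods are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Method:
    goal: str                  # the concept this method achieves (its index)
    precondition: str          # concept that must hold for the method to apply
    subgoals: List[str] = field(default_factory=list)
    operators: List[str] = field(default_factory=list)


def select_method(methods: List[Method], goal: str, beliefs: Set[str]) -> Optional[Method]:
    """Pick the first method indexed by `goal` whose precondition is believed."""
    for m in methods:
        if m.goal == goal and m.precondition in beliefs:
            return m
    return None


def decompose(methods: List[Method], goal: str, beliefs: Set[str]) -> Optional[List[str]]:
    """Return the operators reached by decomposing `goal`, or None on an impasse.

    Simplification: beliefs never change, so this shows only the
    goal-indexed decomposition, not execution in an environment.
    """
    if goal in beliefs:
        return []                        # goal already satisfied
    method = select_method(methods, goal, beliefs)
    if method is None:
        return None                      # impasse: no applicable method
    plan: List[str] = []
    for sub in method.subgoals:
        steps = decompose(methods, sub, beliefs)
        if steps is None:
            return None
        plan.extend(steps)
    plan.extend(method.operators)
    return plan


# Hypothetical methods, loosely named after the patient-transfer example.
methods = [
    Method("transfer-hospital", "flight-available", subgoals=["assigned", "location"]),
    Method("assigned", "flight-available", operators=["assign"]),
    Method("location", "flight-available", operators=["query-arrival-time"]),
]
plan = decompose(methods, "transfer-hospital", {"flight-available"})
```

Indexing methods by the goal they achieve is what lets the same structure serve both goal-directed planning and reactive control: the executor only needs the current goal and beliefs to find a candidate method.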

21
Operators in ICARUS
Action: (get-arrival-time ?patient ?from ?to)
Effects Concept: (arrival-time ?patient)
Precondition Concept: (patient ?p) and (travel-from ?p ?from) and (travel-to ?p ?to)

Operators describe low-level actions that agents
can execute directly in the environment
Preconditions: legal conditions for action execution
Effects: expected changes when action is executed
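
As a rough illustration of the operator structure above, the following Python sketch checks an operator's preconditions against a state and adds its expected effects. The `Operator` class and `apply_operator` function are hypothetical, the sketch uses `?patient` throughout where the slide mixes `?patient` and `?p`, and the constants `P2`, `SFO`, and `PHX` in the bindings are made-up example values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Literal = Tuple[str, ...]


@dataclass
class Operator:
    action: Literal
    preconditions: List[Literal]   # must all hold for the action to be legal
    effects: List[Literal]         # expected to hold after execution


def substitute(literal: Literal, bindings: Dict[str, str]) -> Literal:
    """Replace each bound variable in the literal by its value."""
    return tuple(bindings.get(term, term) for term in literal)


def apply_operator(op: Operator, bindings: Dict[str, str],
                   state: Set[Literal]) -> Optional[Set[Literal]]:
    """Return the expected successor state, or None if a precondition fails."""
    required = {substitute(p, bindings) for p in op.preconditions}
    if not required <= state:
        return None
    return state | {substitute(e, bindings) for e in op.effects}


# The operator from the slide, with variable names made uniform (an assumption).
get_arrival_time = Operator(
    action=("get-arrival-time", "?patient", "?from", "?to"),
    preconditions=[("patient", "?patient"),
                   ("travel-from", "?patient", "?from"),
                   ("travel-to", "?patient", "?to")],
    effects=[("arrival-time", "?patient")],
)
bindings = {"?patient": "P2", "?from": "SFO", "?to": "PHX"}   # example values only
state = {("patient", "P2"), ("travel-from", "P2", "SFO"), ("travel-to", "P2", "PHX")}
new_state = apply_operator(get_arrival_time, bindings, state)
```

The sketch only adds effects. Handling effects that remove beliefs, or actions whose effects are likely but not guaranteed, would need more machinery than is shown here.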

22
Training Input: Expert Traces and Goals
Operator instance: (get-arrival-time P2)
Goal concept: (all-patients-arranged)
State
Concept instance: (assigned-flight P1 M1)

Expert demonstration traces
Operators the expert uses and the resulting belief state
State: Set of concept instances
Goal is a concept instance in the final state
ICARUS learns generalized skills that achieve similar goals

23
Learning Plan Knowledge from Demonstration
[Diagram labels: Reactive Executor; Problem (Initial State, goal); Plan Knowledge (HTNs); If Impasse; Learned plan knowledge; Demonstration Traces (States and actions) from Expert; Background knowledge (Operators, Concept definitions)]
24
Learning HTNs by Trace Analysis
[Figure labels: concepts, actions]
25
Learning HTNs by Trace Analysis
Operator Chaining
26
Learning HTNs by Trace Analysis
Concept Chaining
[Figure labels: concepts, actions]
27
Explanation Structure for Trace
(transfer-hospital patient1 hospital2)
(arrange-ground-transportation SFO hospital2 1pm)
Time3
(location patient1 SFO 1pm)
(close-airport hospital2 SFO)
(assigned patient1 NW32)
(arrival-time NW32 1pm)
(dest-airport patient1 SFO)
(query-arrival-time)
(assign patient1 NW32)
Time1
Time2
(scheduled NW32)
(flight-available)
28
Hierarchical Task Network Structure
(transfer-hospital ?patient ?hospital)
(close-airport ?hospital ?loc)
(arrange-ground-transportation ?loc ?hospital ?time)
(location ?patient ?loc ?time)
(assigned ?patient ?flight)
(arrival-time ?flight ?time)
(dest-airport ?patient ?loc)
(scheduled ?flight)
(flight-available)
(query-arrival-time)
(assign ?patient ?flight)
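
Comparing this slide with the ground explanation structure on the previous one suggests that generalization replaces each constant with a variable, consistently across all literals. The Python sketch below illustrates only that substitution step. The function name `variablize` is hypothetical, and the constant-to-variable table is read off by comparing the two slides rather than taken from any stated algorithm.

```python
from typing import Dict, List, Tuple

Literal = Tuple[str, ...]


def variablize(literals: List[Literal], names: Dict[str, str]) -> List[Literal]:
    """Replace constants by variables, using the same variable for the same constant."""
    return [(lit[0],) + tuple(names.get(arg, arg) for arg in lit[1:])
            for lit in literals]


# Ground literals from the explanation structure for the trace.
ground = [
    ("transfer-hospital", "patient1", "hospital2"),
    ("arrange-ground-transportation", "SFO", "hospital2", "1pm"),
    ("location", "patient1", "SFO", "1pm"),
    ("close-airport", "hospital2", "SFO"),
    ("assigned", "patient1", "NW32"),
    ("arrival-time", "NW32", "1pm"),
    ("dest-airport", "patient1", "SFO"),
    ("scheduled", "NW32"),
    ("flight-available",),
]

# Constant-to-variable table inferred by comparing the two slides (an assumption).
names = {
    "patient1": "?patient",
    "hospital2": "?hospital",
    "SFO": "?loc",
    "1pm": "?time",
    "NW32": "?flight",
}

generalized = variablize(ground, names)
```

Because the same constant always maps to the same variable, shared arguments such as the flight in `assigned` and `arrival-time` stay linked in the generalized structure.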
29
Transfer by Representation Mapping
[Diagram labels: Source domain, Target domain, concepts, actions, Predicate mappings]
30
Challenge: Learn with Richer Goals
HTNs are more expressive than classical plans
(Erol et al., 1994). Our approach loses this
advantage because it assumes the head of each
method is a goal it achieves, but we can

Extend goal concepts to describe temporal
behavior
Revise the execution module to handle these
structures
Augment trace analysis to reason about temporal
goals
Learn new methods with temporal goals in their
heads

This scheme should acquire the full class of HTNs
while still retaining the tractability of
goal-directed learning.
31
Challenge: Extend Conceptual Vocabularies
Our approach to learning HTNs relies on the
concept hierarchy used to explain solution
traces. The method would be less dependent on that
initial hierarchy if it could extend the hierarchy itself

Given: A set of concepts used in goals, states, and methods
Given: New methods acquired from sample solution traces
Find: New concepts that produce improved performance as the result of future method
learning

This would support a bootstrapped learner that
invents predicates to describe states, goals, and
methods.
32
Challenge: Extend Conceptual Vocabularies
Our approach to utilizing predicate invention has
three steps

Define a new concept for the precondition of each
method learned by chaining off a concept
definition.
Check traces for states in which this concept
becomes true and learn methods to achieve it.
During performance, treat each method's
precondition as its first subgoal, which it can
achieve if submethods are known.

This technique would make an HTN more complete by
growing it downward, introducing nonterminal
symbols as necessary. We have partially
implemented this scheme and hope to report
results at the next meeting.
33
Concluding Remarks: Research Style
Clearly, there remain many open problems to
address in learning plan knowledge. These
involve new abilities, not improvements on
existing ones, which suggests that we

Look at human behavior for ideas on how to
proceed
Develop integrated systems rather than component
algorithms
Demonstrate their behavior on challenging domains

These strategies will help us extend the reach of
our learning systems, not just strengthen their
grasp.
34
Concluding Remarks: Evaluation
We must evaluate our new plan learners, but this
does not mean

Measuring their speed in generating plans
Showing they run faster than existing systems
Entering them in planning competitions

More appropriate experiments would revolve
around

Demonstrating entirely new functionalities
Running lesion studies to show new features are
required
Using performance measures appropriate to the task

These steps will produce conceptual advances and
scientific understanding far more than will
mindless bake-offs.
35
Concluding Remarks: Summary
Learning plan knowledge is a key area with many
open problems

Learning from traces, advice, and other sources
Transferring knowledge within and across domains
Learning and extending rich structures like HTNs

These challenges will benefit from earlier work
on plan learning, but they also require new
ideas. Together, they should lead us toward
learning systems that rival humans in their
flexibility and power.
36
End of Presentation
37
ICARUS Concepts for In-City Driving
((in-rightmost-lane ?self ?clane)
 percepts  ((self ?self) (segment ?seg) (line ?clane segment ?seg))
 relations ((driving-well-in-segment ?self ?seg ?clane)
            (last-lane ?clane)
            (not (lane-to-right ?clane ?anylane))))

((driving-well-in-segment ?self ?seg ?lane)
 percepts  ((self ?self) (segment ?seg) (line ?lane segment ?seg))
 relations ((in-segment ?self ?seg)
            (in-lane ?self ?lane)
            (aligned-with-lane-in-segment ?self ?seg ?lane)
            (centered-in-lane ?self ?seg ?lane)
            (steering-wheel-straight ?self)))

((in-lane ?self ?lane)
 percepts  ((self ?self segment ?seg) (line ?lane segment ?seg dist ?dist))
 tests     ((gt ?dist -10) (lt ?dist 0)))
38
Representing Short-Term Beliefs/Goals
(current-street me A)
(current-segment me g550)
(lane-to-right g599 g601)
(first-lane g599)
(last-lane g599)
(last-lane g601)
(at-speed-for-u-turn me)
(slow-for-right-turn me)
(steering-wheel-not-straight me)
(centered-in-lane me g550 g599)
(in-lane me g599)
(in-segment me g550)
(on-right-side-in-segment me)
(intersection-behind g550 g522)
(building-on-left g288)
(building-on-left g425)
(building-on-left g427)
(building-on-left g429)
(building-on-left g431)
(building-on-left g433)
(building-on-right g287)
(building-on-right g279)
(increasing-direction me)
(buildings-on-right g287 g279)
39
ICARUS Skills for In-City Driving
((in-rightmost-lane ?self ?line)
 percepts ((self ?self) (line ?line))
 start    ((last-lane ?line))
 subgoals ((driving-well-in-segment ?self ?seg ?line)))

((driving-well-in-segment ?self ?seg ?line)
 percepts ((segment ?seg) (line ?line) (self ?self))
 start    ((steering-wheel-straight ?self))
 subgoals ((in-segment ?self ?seg)
           (centered-in-lane ?self ?seg ?line)
           (aligned-with-lane-in-segment ?self ?seg ?line)
           (steering-wheel-straight ?self)))

((in-segment ?self ?endsg)
 percepts ((self ?self speed ?speed)
           (intersection ?int cross ?cross)
           (segment ?endsg street ?cross angle ?angle))
 start    ((in-intersection-for-right-turn ?self ?int))
 actions  ((?steer 1)))
40
ICARUS Interleaves Execution and Problem Solving
[Diagram labels: Skill Hierarchy, Problem, Reactive Execution, impasse? (yes/no), Primitive Skills, Executed plan, Problem Solving]
This organization reflects the psychological
distinction between automatized and controlled
behavior.
