Title: Self-Improvement through Self-Understanding:
1 Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation
J. William Murdock
Intelligent Decision Aids Group
Navy Center for Applied Research in Artificial Intelligence
Naval Research Laboratory, Code 5515
Washington, DC 20375
bill_at_murdocks.org
http://bill.murdocks.org
Presentation at NIST, March 18, 2002
2 Adaptation
- People adapt very well.
- They figure out how to do new things.
- If something doesn't work, they try something else.
- They understand how and why they are doing things.
- Computer programs do not adapt very well.
- They can only do what they are programmed for.
- They keep making the same mistakes.
- They have no understanding of themselves.
Can we make computer programs adapt?
3 REM (Reflective Evolutionary Mind)
- Operating environment for intelligent agents
- Provides support for adaptation to new functional requirements
- Uses functional models, generative planning, and reinforcement learning
- J. William Murdock and Ashok K. Goel
4 Example: Web Browsing Agent
- A mock-up of web browsing software
- Based on Mosaic for X Windows, version 2.4
- Imitates not only the behavior but also the internal process and information of Mosaic 2.4
[Figure labels: ps, ???, html, pdf, txt]
5 Example: Disassembly and Assembly
- Software agent for disassembly in the domain of cameras
- Information about cameras
- Information about relevant actions
- e.g., pulling, unscrewing, etc.
- Information about disassembly processing
- e.g., decide how to disconnect subsystems from each other and then decide how to disassemble those subsystems separately.
- Agent now needs to assemble a camera
6 TMK (Task-Method-Knowledge)
- TMK models provide the agent with knowledge of its own design.
- TMK encodes:
- Tasks: functional specification / requirements and results
- Methods: behavioral specification / composition and control
- Knowledge: domain concepts and relations
7 REM Reasoning Process
[Diagram: two processes, Execution and Adaptation. Labels: Implemented Task, A Method, Set of Input Values, Trace, Set of Output Values, Unimplemented Task, Set of Input Values, ADAPTED Implemented Task, ADAPTED Method]
8 Adaptation Process
[Diagram: boxes Generative Planning, Situator (for Q-Learning), Proactive Model Transfer, Failure-Driven Model Transfer. Labels: Task, Set of Input Values, Existing Method, Similar Implemented Task, Trace, A Method, ADAPTED Implemented Task, ADAPTED Method]
9 Execution Process
[Diagram: boxes Select Method, Select Next Task Within Method, Execute Primitive Task. Labels: Implemented Task, A Method, Set of Input Values, Trace, Set of Output Values]
10 Selection: Q-Learning
- Popular, simple form of reinforcement learning.
- In each state, each possible decision is assigned an estimate of its potential value (Q).
- For each decision, preference is given to higher Q values.
- Each decision is reinforced, i.e., its Q value is altered based on the results of the actions.
- These results include actual success or failure and the Q values of next available decisions.
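The reinforcement step described in the bullets above can be sketched with the textbook one-step Q-learning update. This is an illustrative sketch only, not REM's code: the slide does not give the update rule, and the learning rate `ALPHA`, the discount `GAMMA`, and all function names are assumptions.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value, not from the slides)
GAMMA = 0.9  # discount factor (assumed value, not from the slides)

# Q[(state, decision)] -> estimated value; unseen pairs start at 0.0.
Q = defaultdict(float)


def best_next_value(q, next_state, next_decisions):
    """Highest Q value among the decisions available in the next state."""
    if not next_decisions:
        return 0.0  # nothing left to decide, e.g. the task has ended
    return max(q[(next_state, d)] for d in next_decisions)


def reinforce(q, state, decision, reward, next_state, next_decisions):
    """Move Q(state, decision) toward reward + GAMMA * best next Q value."""
    target = reward + GAMMA * best_next_value(q, next_state, next_decisions)
    q[(state, decision)] += ALPHA * (target - q[(state, decision)])
    return q[(state, decision)]


def preferred(q, state, decisions):
    """Greedy preference: the decision with the highest Q value."""
    return max(decisions, key=lambda d: q[(state, d)])
```

Here `reward` stands for the actual success or failure, and `best_next_value` supplies the Q values of the next available decisions, so both sources of feedback named on the slide enter the update.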
11 Q-Learning in REM
- Decisions are made for method selection and for selecting new transitions within a method.
- A decision state is a point in the reasoning (i.e., task, method) plus the set of all decisions which have been made in the past.
- Initial Q values are set to 0.
- Decides on the option with the highest Q value, or randomly selects an option with probabilities weighted by Q value (configurable).
- A decision receives positive reinforcement when it leads immediately (without any other decisions) to the success of the overall task.
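A minimal sketch of the two configurable selection modes and of a decision state as described above. The names `choose` and `decision_state` are hypothetical, and the uniform fallback when every Q value is still 0 is an assumption; the slides do not say how REM handles that case.

```python
import random


def decision_state(task, method, past_decisions):
    """A point in the reasoning plus the set of all decisions made so far."""
    return (task, method, frozenset(past_decisions))


def choose(q_values, mode="greedy", rng=random):
    """Pick one decision from a {decision: Q value} mapping.

    mode="greedy":   the option with the highest Q value (first one on ties).
    mode="weighted": random option, probability proportional to its Q value.
    """
    options = list(q_values)
    if mode == "greedy":
        return max(options, key=lambda d: q_values[d])
    # Negative Q values get weight 0 so they are never drawn (assumption).
    weights = [max(q_values[d], 0.0) for d in options]
    if sum(weights) == 0.0:
        # Assumption: with all Q values at their initial 0, choose uniformly.
        return rng.choice(options)
    return rng.choices(options, weights=weights, k=1)[0]
```

Storing past decisions as a `frozenset` makes the state hashable, so it can serve directly as a key in a Q table.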
12 Task-Method-Knowledge Language (TMKL)
- A new, powerful formalism of TMK developed for REM.
- Uses LOOM, a popular off-the-shelf knowledge representation framework: concepts, relations, etc.
REM models not only the tasks of the domain but also itself in TMKL.
13 Tasks in TMKL
- All tasks can have input and output parameter lists and given and makes conditions.
- A non-primitive task must have one or more methods which accomplish it.
- A primitive task must include one or more of the following: source code, a logical assertion, a specified output value.
- Unimplemented tasks have none of these.
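The three kinds of task listed above can be sketched as a small data structure. This is an illustration in Python, not TMKL: the `Task` class and its field names are hypothetical, and reading `given` as a condition before the task and `makes` as a condition after it is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Task:
    name: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    given: Optional[str] = None   # assumed: condition holding before the task
    makes: Optional[str] = None   # assumed: condition holding after the task
    methods: List[str] = field(default_factory=list)  # names of methods
    procedure: Optional[Callable] = None  # source code
    asserts: Optional[str] = None         # logical assertion
    value: Optional[object] = None        # specified output value

    def kind(self) -> str:
        """Classify the task by what it provides for accomplishing itself."""
        if self.methods:
            return "non-primitive"
        implemented = (self.procedure, self.asserts, self.value)
        if any(part is not None for part in implemented):
            return "primitive"
        return "unimplemented"
```

For example, a task whose only implementation detail is `asserts` counts as primitive, while a task with no methods, code, assertion, or value is the kind that REM must adapt before it can run.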
14 TMKL Task
(define-task communicate-with-www-server
  input (input-url)
  output (server-reply)
  makes
    (and
      (document-at-location (value server-reply)
                            (value input-url))
      (document-at-location (value server-reply)
                            local-host))
  by-mmethod (communicate-with-server-method))
15 Methods in TMKL
- Methods have provided and additional result conditions which specify incidental requirements and results.
- In addition, a method specifies a start transition for its processing control.
- Each transition specifies requirements for using it and a new state that it goes to.
- Each state has a task and a set of outgoing transitions.
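The control structure described above (a start transition, states that each hold a task, and outgoing transitions with requirements) can be sketched as a small state machine. The class names, the `run` and `series` helpers, and the use of `None` to mark termination are assumptions made for illustration, not TMKL constructs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Transition:
    provided: Callable[[dict], bool]  # requirement for using this transition
    next_state: Optional[str]         # None marks termination (assumption)


@dataclass
class State:
    task: Callable[[dict], None]      # the task performed in this state
    transitions: List[Transition] = field(default_factory=list)


@dataclass
class Method:
    start: Transition
    states: Dict[str, State]


def run(method, ctx):
    """Follow transitions from the start, running each state's task."""
    trace = []
    transition = method.start
    while transition.next_state is not None:
        name = transition.next_state
        state = method.states[name]
        state.task(ctx)
        trace.append(name)
        usable = [t for t in state.transitions if t.provided(ctx)]
        if not usable:
            break  # no outgoing transition has its requirement met
        transition = usable[0]
    return trace


def series(tasks):
    """Build a strictly linear method from a list of (name, task) pairs."""
    def always(ctx):
        return True
    names = [name for name, _ in tasks]
    states = {}
    for i, (name, task) in enumerate(tasks):
        nxt = names[i + 1] if i + 1 < len(names) else None
        states[name] = State(task, [Transition(always, nxt)])
    return Method(Transition(always, names[0]), states)
```

A linear method such as a three-step display pipeline is then `series([...])` over its three tasks, and `run` returns the order in which the states were visited.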
16 Simple TMKL Method
(define-mmethod external-display
  provided (not (internal-display-tag (value server-tag)))
  series (select-display-command
          compile-display-command
          execute-display-command))
17 Complex TMKL Method
(define-mmethod make-plan-node-children-mmethod
  series (select-child-plan-node
          make-subplan-hierarchy
          add-plan-mappings
          set-plan-node-children))
(tell (transition>links make-plan-node-children-mmethod-t3
                        equivalent-plan-nodes
                        child-equivalent-plan-nodes)
      (transition>next make-plan-node-children-mmethod-t5
                       make-plan-node-children-mmethod-s1)
      (create make-plan-node-children-terminate transition)
      (reasoning-state>transition make-plan-node-children-mmethod-s1
                                  make-plan-node-children-terminate)
      (about make-plan-node-children-terminate
             (transition>provided
              '(terminal-addam-value (value child-plan-node)))))
18 Knowledge in TMKL
- Foundation: LOOM
- Concepts, instances, relations
- Concepts and relations are instances and can have facts about them.
Knowledge representation in TMKL involves LOOM plus some TMKL-specific reflective concepts and relations.
19 Some TMKL Knowledge Modeling
(defconcept location)
(defconcept computer
  is-primitive location)
(defconcept url
  is-primitive location
  roles (text))
(defrelation text
  range string
  characteristics single-valued)
(defrelation document-at-location
  domain reply
  range location)
(tell (external-state-relation
       document-at-location))
20 Sample Meta-Knowledge in TMKL
- relation characteristics
- single-valued/multiple-valued
- symmetric, commutative
- relations over relations
- external/internal
- state/definitional
- generic relations
- same-as
- instance-of
- inverse-of
- concepts involving concepts
- thing
- meta-concept
- concept
21 Web Browsing Agent
Mock-up of a web browser. Steps through the web-browsing process.
- Interactive Domain: Web agent is affected by the user and by the network
- Dynamic Domain: Both users and networks often change
- Knowledge-Intensive Domain: Documents, networks, servers, local software, etc.
22 Tasks and Methods of Web Agent
Process URL
Process URL Method
Communicate with WWW Server
Display File
Communicate with WWW Server Method
Display File Method
Request from Server
Receive from Server
Interpret Reply
Display Interpreted File
External Display
Internal Display
Execute Internal Display
Select Display Command
Compile Display Command
Execute Display Command
23 Example: PDF Viewer
- The web agent is asked to browse the URL for a PDF file. It does not have any information about external viewers for PDF.
- Because the agent already has a task for browsing URLs, it is executed first.
- When the system fails, the user provides feedback indicating the correct viewer.
- Failure-Driven Model Transfer
24 Web Agent Adaptation
[Diagram, two panels. Panel 1: External Display; Select Display Command; Compile Display Command; Execute Display Command. Panel 2: External Display; Compile Display Command; Execute Display Command; Select Display Command; Select Display Command Base Method; Select Display Command Alternate Method; Select Display Command Base Task; Select Display Command Alternate Task]
25 Physical Device Disassembly
- ADDAM: Legacy software agent for case-based, design-level disassembly planning and (simulated) execution
- Interactive: Agent connects to a user specifying goals and to a complex physical environment
- Dynamic: New designs and demands
- Knowledge Intensive: Designs, plans, etc.
26 Disassembly → Assembly
- A user with access to the ADDAM disassembly agent wishes to have this agent instead do assembly.
- ADDAM has no assembly method and thus must adapt first.
- Since assembly is similar to disassembly, REM selects Proactive Model Transfer.
27 Pieces of ADDAM which are key to Disassembly → Assembly
Disassemble
Plan Then Execute Disassembly
Adapt Disassembly Plan
Execute Plan
Hierarchical Plan Execution
Topology Based Plan Adaptation
Make Plan Hierarchy
Map Dependencies
Select Next Action
Execute Action
Select Dependency
Assert Dependency
Make Equivalent Plan Nodes Method
Make Equivalent Plan Node
Add Equivalent Plan Node
28 New Adapted Task in Disassembly → Assembly
Assemble
COPIED Plan Then Execute Disassembly
COPIED Adapt Disassembly Plan
COPIED Execute Plan
COPIED Hierarchical Plan Execution
COPIED Topology Based Plan Adaptation
COPIED Make Plan Hierarchy
COPIED Map Dependencies
Select Next Action
INSERTED Inversion Task 2
Execute Action
COPIED Select Dependency
INVERTED Assert Dependency
COPIED Make Equivalent Plan Nodes Method
COPIED Add Equivalent Plan Node
INSERTED Inversion Task 1
COPIED Make Equivalent Plan Node
29 Task: Assert Dependency
Before:
define-task Assert-Dependency
  input target-before-node, target-after-node
  asserts (node-precedes (value target-before-node)
                         (value target-after-node))
After:
define-task Mapped-Assert-Dependency
  input target-before-node, target-after-node
  asserts (node-follows (value target-before-node)
                        (value target-after-node))
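The before/after pair above amounts to swapping the asserted relation for its inverse while leaving the inputs alone. A minimal sketch, assuming tasks are held as plain dictionaries and that an `inverse-of` fact links `node-precedes` to `node-follows`; the table `INVERSE_OF` and the function `invert_task` are hypothetical names, not part of REM.

```python
# Assumed inverse-of facts; only the pair shown on the slide is listed.
INVERSE_OF = {"node-precedes": "node-follows"}
# Make the table symmetric so inversion works in both directions.
INVERSE_OF.update({v: k for k, v in list(INVERSE_OF.items())})


def invert_task(task):
    """Return a copy of the task that asserts the inverse relation."""
    relation, *args = task["asserts"]
    if relation not in INVERSE_OF:
        raise KeyError(f"no known inverse for relation {relation!r}")
    return {
        **task,
        "name": "Mapped-" + task["name"],
        "asserts": (INVERSE_OF[relation], *args),
    }


before = {
    "name": "Assert-Dependency",
    "input": ["target-before-node", "target-after-node"],
    "asserts": ("node-precedes", "target-before-node", "target-after-node"),
}
after = invert_task(before)
```

The original task is left unchanged, which mirrors the slide: the adapted model gains a new mapped task rather than overwriting the disassembly one.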
30 Task: Make Equivalent Plan Node
define-task make-equivalent-plan-node
  input base-plan-node, parent-plan-node, equivalent-topology-node
  output equivalent-plan-node
  makes (and
          (plan-node-parent (value equivalent-plan-node)
                            (value parent-plan-node))
          (plan-node-object (value equivalent-plan-node)
                            (value equivalent-topology-node))
          (implies (plan-action (value base-plan-node))
                   (type-of-action
                     (value equivalent-plan-node)
                     (type-of-action (value base-plan-node)))))
  by procedure ...
31 Task: Inserted-Reversal-Task
define-task inserted-reversal-task
  input equivalent-plan-node
  asserts (type-of-action
            (value equivalent-plan-node)
            (inverse-of
              (type-of-action
                (value equivalent-plan-node))))
32 ADDAM Example: Layered Roof
33 Roof Assembly
34 Modified Roof Assembly: No Conflicting Goals
35 Applicability of Proactive Model Transfer
- Knowledge about the concepts and relations in the domain
- Knowledge about how the tasks and methods affect these concepts and relations
- Differences between the old task and the new task map onto knowledge of the concepts and relations in the domain.
36 Applicability of Failure-Driven Model Transfer
- May need less knowledge about the domain itself since the adaptation is grounded in a specific incident.
- e.g., feedback about PDF for an example instead of advance knowledge of all document types.
- Still requires knowledge about how the tasks and methods interact with the domain.
37 Additional Mechanisms
- Model-based adaptation may leave some design decisions unsolved.
- These decisions may be solved by traditional decision-making mechanisms, e.g., reinforcement learning.
- Models may be unavailable or irrelevant for some tasks or subtasks.
- Generative planning can combine primitive actions.
38 Level of Decomposition
- Level of decomposition may be dictated by the nature of the agent.
- Some tasks simply cannot be decomposed.
- In other situations, level of decomposition may be guided by the nature of adaptation to be done.
- Can be brittle if unpredicted demands arise.
- REM enables autonomous decomposition of primitives, which addresses this problem.
39 Computational Costs
- Reasoning about models incurs some costs.
- For very easy problems, this overhead may not be justified.
- For other problems, the benefits enormously outweigh these costs.
Models can localize planning and learning.
40 Knowledge Requirements
- Someone has to build an agent.
- Builder should know what that agent does and how it does it → Can make model.
- Analyst may be able to understand builder's notes, etc. → Can make model.
- Some evidence for this in the context of software engineering / architectural extraction.
41 Current Work: AHEAD
- Theme: Analyzing hypotheses regarding asymmetric threats (e.g., criminals, terrorists).
- Input: Hypotheses regarding a potential threat
- Output: Argument for and/or against the hypotheses
- Technique: Analogy over functional models
- An extension to TMKL will encode known behaviors for asymmetric threats and the purposes that the behaviors serve.
- Analogical reasoning will enable retrieval and mapping of new hypotheses to existing models.
- Models will provide arguments about how observed actions do or do not support the purposes of the hypothesized behavior.
- Naval Research Laboratory / DARPA Evidence Extraction and Link Discovery program
- David Aha, J. William Murdock, Len Breslow
42 Summary
- REM (Reflective Evolutionary Mind)
- Operating environment for agents that adapt
- TMKL (Task-Method-Knowledge Language)
- The language for agents in REM
- Functional modeling language for encoding computational processes
- Adaptation
- Some kinds of adaptation can be performed using specialized model-based techniques
- Others require more generic planning and learning mechanisms (localized using models)
43 Optional Slides
44 REM vs. Derivational Analogy
- REM adapts models of tasks and methods.
- Derivational analogy generally assumes some sort of universal process (e.g., generative planning) and only needs to represent and reason about key decision points.
- Advantage of derivational analogy: Models not needed; traces alone enable reuse.
- Advantage of REM: Applicable to problems for which a universal process is not appropriate (e.g., the 6-board roof example takes days using planning and Q-learning).
- REM demands more knowledge but makes effective use of that additional knowledge.
45 REM and Case-Based Reasoning (CBR)
- Given a task, REM retrieves a method, applies it, and then (if necessary) adapts it. This process is a form of CBR.
- Most CBR projects, however, adapt solutions, not processes. Some problems require the latter.
- Adaptation of processes can extend the efficiency benefits of CBR to problems which the case library does not directly address.
46 REM vs. Case-Based Adaptation
- REM reasons about and adapts an entire reasoning process.
- Case-based adaptation restricts adaptation to one portion of a case-based process: adaptation.
- Being more focused is a substantial advantage for case-based adaptation.
- However, for problems which require adaptation of different sorts of reasoning processes, it is useful to have models of these processes, as in REM.
47 Q-Learning in REM
- Decisions are made for method selection and for selecting new transitions within a method.
- A decision state is a point in the reasoning (i.e., task, method) plus the set of all decisions which have been made in the past.
- Initial Q values are set to 0.
- Decides on the option with the highest Q value, or randomly selects an option with probabilities weighted by Q value (configurable).
- A decision receives positive reinforcement when it leads immediately (without any other decisions) to the success of the overall task.
48 Monkey and Bananas / Tower of Hanoi Hybrid