Information Processing Technology Office

About This Presentation
Title:

Information Processing Technology Office

Description:

Title: Slide 1
Author: rlinn
Last modified by: Pat Langley
Created Date: 3/31/2004 4:11:22 PM
Document presentation format: On-screen Show

Slides: 13
Provided by: rli90
Learn more at: http://www.isle.org

Transcript and Presenter's Notes

Title: Information Processing Technology Office


1
Information Processing Technology Office
Learning Workshop, April 12, 2004
Seedling Overview: Learning Hierarchical Reactive Skills from Reasoning and Experience
Institute for the Study of Learning and Expertise
PI: Pat Langley
Presenter: Ray Mooney
2
Learning Objective
  • Develop learning methods that operate over rich knowledge
    structures which
      • support both reactive control and problem solving
      • are embedded in an integrated cognitive architecture
        that operates in complex physical environments
  • Learning mechanisms can acquire and revise such
    knowledge more rapidly and effectively than human
    programmers can create and debug it manually.

3
What is Being Learned?
  • ICARUS is an integrated cognitive architecture that learns
      • the logical structure of relational skills and concepts
      • a hierarchical organization over these elements
      • numeric utility functions attached to skills and concepts
      • that describe effective means for achieving goals
      • that support reactive control of physical agents
      • from background knowledge, experience with executing
        skills in the environment, and problem solving
      • in an incremental, cumulative manner that responds to
        changes in tasks and the environment.
  • This relies on the tight integration of execution, problem
    solving, and learning.

4
What is Being Learned?
  • For example, in a driving domain, ICARUS would learn
      • the structure of driving skills like turning and passing
      • the structure of driving concepts (e.g., passable)
      • hierarchical connections (e.g., pass and change-lanes)
      • how to achieve high-level goals (e.g., package delivery)
      • how to get from one place to another (route knowledge)
      • the expected utility of driving skills and subskills
      • the expected utility of driving concepts
  • This different content is cast within a unified formalism
    that ICARUS provides for encoding knowledge.

5
How is Knowledge Being Learned?
  • The ICARUS architecture learns
      • value functions using a hierarchical variant of
        model-based reinforcement learning
      • new skills and concepts based on the cached
        results of means-ends problem solving.
  • Learning and reasoning are integrated in that
      • conceptual inference and hierarchical skills provide
        high-level descriptions for reinforcement learning
      • problem-solving traces form the basis of new
        skills and concepts.
  • Learning is automatic but could be adapted to benefit
    from advice and traces of expert behavior.
  • Structure learning occurs from single instances; value
    learning should be much faster than in typical methods.
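
As a rough illustration of what learning a linear value function involves, the sketch below applies one delta-rule (least-mean-squares) update to a weight vector. This is a generic stand-in, not the hierarchical model-based method the slide refers to; the function name, the learning rate, and the idea of a scalar training target are all assumptions made for the example.

```python
def update_linear_value(weights, features, target, learning_rate=0.01):
    """One delta-rule step for a linear value function (illustrative only).

    weights and features are equal-length lists of floats; target is the
    observed or estimated value the prediction should move toward.
    """
    prediction = sum(w * x for w, x in zip(weights, features))
    error = target - prediction
    # Move each weight in proportion to its feature and the prediction error.
    return [w + learning_rate * error * x for w, x in zip(weights, features)]


# Hypothetical usage: two features, one training signal.
new_weights = update_linear_value([0.0, 0.0], [1.0, 2.0], target=5.0,
                                  learning_rate=0.1)
```

Each call moves the prediction for the given features part of the way toward the target, so repeated calls on the same example shrink the error.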

6
How is Knowledge Being Learned?
Means-ends analysis produces hierarchical skills
[Figure: hierarchy diagram with nodes labeled A through E and steps numbered 1 through 11]
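
The idea that means-ends analysis produces hierarchical skills can be sketched in a few lines of Python. This is a simplified propositional version written for illustration, not the ICARUS implementation: goals are plain strings, and the operator names, the `achieve-<goal>` naming scheme, and the depth limit are all hypothetical. Each time a goal is solved with more than one step, the ordered steps are cached as a new skill, and a later skill can refer to an earlier one by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset = frozenset()


@dataclass
class Skill:
    objective: str
    subskills: list  # ordered names of cached skills or primitive operators


def achieve(goal, state, operators, learned, depth=0, max_depth=10):
    """Backward-chain on one goal; return (new_state, step_name) or None."""
    if goal in state:
        return state, None  # already satisfied, no step needed
    if depth > max_depth:
        return None
    for op in operators:
        if goal not in op.adds:
            continue
        current, steps, solved = state, [], True
        for pre in sorted(op.preconditions):
            result = achieve(pre, current, operators, learned,
                             depth + 1, max_depth)
            if result is None:
                solved = False
                break
            current, step = result
            if step is not None:
                steps.append(step)
        # Reject the operator if a later subgoal undid an earlier one.
        if not solved or not op.preconditions <= current:
            continue
        current = (current - op.deletes) | op.adds
        steps.append(op.name)
        if len(steps) == 1:
            return current, op.name  # a lone primitive: nothing to cache
        skill_name = f"achieve-{goal}"
        learned[skill_name] = Skill(goal, steps)  # cache the solution trace
        return current, skill_name
    return None


# Hypothetical driving operators, loosely named after the slides' examples.
OPERATORS = [
    Operator("slow-down", frozenset(), frozenset({"at-turning-speed"})),
    Operator("change-lanes-right", frozenset(),
             frozenset({"in-rightmost-lane"})),
    Operator("turn-right",
             frozenset({"at-turning-speed", "in-rightmost-lane"}),
             frozenset({"behind-right-corner"})),
    Operator("drop-package", frozenset({"behind-right-corner"}),
             frozenset({"package-delivered"})),
]

learned = {}
final_state, top_skill = achieve("package-delivered", frozenset(),
                                 OPERATORS, learned)
```

After this run, `learned` holds two skills: one for getting behind the corner, built from three primitive operators, and one for delivering the package that calls the first skill by name. That nesting is the hierarchy.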
7
How is the Knowledge Represented?
  • ICARUS casts both background and learned knowledge as
      • logical relational concepts with linear value functions
      • logical relational skills with linear value functions
      • that are defined in terms of other skills and concepts.
  • Background knowledge constrains the learning of value
    functions and provides components for structure learning.
  • ICARUS provides a formalism for encoding knowledge
      • about physical domains with continuous attributes
      • that includes probability of success, expected
        duration, and resource requirements
      • described at multiple levels of abstraction over
        both state (with concepts) and time (with skills).

8
How is the Knowledge Represented?
(make-right-turn (?self ?corner)
  objective ((behind-right-corner ?corner))
  start     ((in-rightmost-lane ?self)
             (ahead-right-corner ?corner)
             (at-turning-distance ?corner))
  requires  ((near-block-corner ?corner)
             (at-turning-speed ?self))
  ordered   ((begin-right-turn ?self ?corner)
             (end-right-turn ?self ?corner))
  value     (30.0) )

(slow-for-intersection (?self)
  percepts  ((self ?self speed ?speed)
             (corner ?corner street-dist ?dist))
  objective ((slow-enough-intersection ?self))
  requires  ((near-block-corner ?corner))
  actions   ((slow-down))
  value     ((-5.2 ?dist) (20.3 ?speed)) )

Some ICARUS driving skills
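
To make the linear value functions concrete, the sketch below evaluates the value field of slow-for-intersection for one set of percept bindings. It assumes each pair in that field is a coefficient followed by the percept variable it multiplies, and that the terms are summed; the slide text does not spell this out, and the function name and the sample percept values are hypothetical.

```python
def linear_value(terms, bindings):
    """Sum coefficient * bound value over (coefficient, variable) pairs."""
    return sum(coefficient * bindings[variable]
               for coefficient, variable in terms)


# Coefficients copied from the slow-for-intersection skill on the slide.
SLOW_FOR_INTERSECTION_VALUE = [(-5.2, "?dist"), (20.3, "?speed")]

# Hypothetical percepts: 10 units from the corner, moving at speed 5.
value = linear_value(SLOW_FOR_INTERSECTION_VALUE,
                     {"?dist": 10.0, "?speed": 5.0})
```

Under this reading, slowing down becomes more valuable as speed rises and as the distance to the corner shrinks, which matches the skill's purpose.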
9
How is the Knowledge Represented?
(corner-ahead-left (?corner)
  percepts  ((corner ?corner r ?r theta ?theta))
  tests     ((lt ?theta 0) (gt ?theta -1.571))
  value     ((5.6 ?r) (3.1 ?theta)) )

(in-intersection (?self)
  percepts  ((self ?self)
             (corner ?ncorner street-dist ?sdist))
  positives ((near-block-corner ?ncorner)
             (corner-straight-ahead ?scorner))
  negatives ((far-block-corner ?fcorner))
  tests     ((lt ?sdist 0.0))
  value     (-10.0) )

Some ICARUS driving concepts
10
What is the Domain?
  • Our initial studies of ICARUS have focused on a simulated
    in-city driving environment that
      • requires integration of perception, action, and cognition
      • involves both reactive control and goal direction
      • supports many distinct tasks of varying complexity
      • provides clear opportunities for cumulative learning
  • The environment lets us vary domain characteristics
    systematically and record statistics on agent behavior.
  • However, ICARUS aims at broad generality and should
    support reasoning and learning in
      • both first-person and strategy games
      • crisis-response tasks involving physical response
      • intelligent assistants for office activities

11
How is Progress Being Measured?
  • Dependent variables
      • Efficiency of task execution (e.g., driving time)
      • Quality of task execution (e.g., gas used, accidents)
  • Higher-order metrics
      • Rate and asymptote of learning curves
      • Transfer to related tasks and altered environments
  • Independent variables
      • Inclusion or omission of learning methods
      • Amount of background knowledge available
      • Task difficulty and environmental complexity
      • Amount of experience, task/environment similarity

12
What are the Technical Milestones?
  • Year 1
      • Learn estimates of the duration and success of driving skills
      • Learn higher-level driving skills via problem solving
      • Demonstrate improvement on multi-package delivery
  • Year 2
      • Learn value functions for driving concepts
      • Learn trade-offs among many high-level tasks
      • Demonstrate transfer and scaling to complex tasks
  • Year 3
      • Acquire place and route knowledge
      • Support episodic memory and perceptual attention
      • Demonstrate cumulative learning and resilience to change
  • We will also examine other domains to ensure generality.