Title: Mixing Automatic and Deliberative Learning During Problem Solving
1. Mixing Automatic and Deliberative Learning During Problem Solving
- Randolph M. Jones
- Soar Technology
- Colby College
2. Background
- There are alternative ways we might incorporate multi-step learning into a model
- One approach would be to automate explicit instruction of desired task behavior
  - Even this is difficult
- This talk focuses on models that can discover new
problem-solving knowledge and strategies on their
own
3. Knowledge Tuning and Acquisition
- There are two primary ways a model can learn new strategies
  - Acquiring new task knowledge that allows more complete or efficient coverage of a problem space
  - Tuning existing task knowledge so it is retrieved more opportunistically
- Knowledge acquisition in its own right is also important
  - But this work suggests that knowledge acquisition depends on knowledge tuning
4. Knowledge Tuning
- Basic representational structure of a knowledge chunk remains unchanged
- Retrieval/selection patterns associated with the knowledge do change
5. Knowledge Acquisition
- Entirely new structured representations of long-term knowledge are added to the model's knowledge base
- Or existing chunks of knowledge undergo structural changes
6. Task Example: Solving Physics Problems
- Learning to solve physics problems involves learning new equations relevant to the problems, and learning the situations in which those equations should be used
- Students who self-explain study examples show greater improvement in performance than those who do not (Chi et al., 1989)
- Are they tuning knowledge or acquiring knowledge?
- Cascade (VanLehn, Jones, & Chi, 1992) models the self-explanation effect observed in humans learning to solve physics problems
7. Task Example: Simple Addition
- There are a variety of strategies that can be used to perform elementary addition, some more efficient than others
- Children are usually instructed using a basic strategy, but invent a particular set of more efficient strategies on their own (Siegler & Jenkins, 1989)
- Are they tuning knowledge or acquiring knowledge?
- GIPS (Jones & VanLehn, 1994) models the series of strategy shifts exhibited by children
8. Cascade: Typical Problem
What is the tension in the string?
9. Cascade: Typical Problem
(Diagram: three strings A, B, and C meeting at a knot)
What is the magnitude of each force?
10. Cascade: Typical Example
- Let the knot be the body
- F_A, F_B, F_C are all the forces acting on the body
- The body is at rest, so F_A + F_B + F_C = 0
- By projection, F_Ax + F_Bx = 0
- By projection, F_Ay + F_By + F_Cy = 0
- F_Ax = F_A cos 30° = 0.866 F_A
- etc.
(Free-body diagram showing forces F_A, F_B, and F_C; a numerical sketch of the equilibrium solution follows below)
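To make the slide's algebra concrete, here is a minimal numerical sketch of the equilibrium solution. The 30° and 45° string angles and the 10 N weight on string C are illustrative assumptions, not values from the original problem.

```python
import math

# Hypothetical geometry: string A at 30 degrees and string B at 45 degrees
# above the horizontal, string C hanging straight down with a known weight.
angle_a = math.radians(30)
angle_b = math.radians(45)
f_c = 10.0   # assumed weight on string C, in newtons

# Equilibrium of the knot (F_A + F_B + F_C = 0), projected onto x and y:
#   x: -F_A*cos(angle_a) + F_B*cos(angle_b) = 0
#   y:  F_A*sin(angle_a) + F_B*sin(angle_b) - f_c = 0
f_a = f_c / (math.sin(angle_a) + math.cos(angle_a) * math.tan(angle_b))
f_b = f_a * math.cos(angle_a) / math.cos(angle_b)

print(f"F_A = {f_a:.2f} N, F_B = {f_b:.2f} N")
```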
11. Cascade: Modeling Goal
- Explain the learning process and other factors
that cause students who carefully study examples
to learn more effectively than students who do not
12. Cascade: Knowledge Representation
- Long-term task knowledge is a set of physics equations, geometric equations, and rules for representing free-body diagrams
  - Implemented in Prolog
- Default problem-solving strategy is exhaustive depth-first search with backtracking
  - Straightforward application of Prolog
- Problem-solving goals are quantities (variables) for which the problem solver must compute a value
- Selection knowledge allows heuristic search by using past solution paths as analogies to the current problem (a sketch of the default control structure follows below)
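As a rough illustration of that default strategy, here is a minimal Python sketch (not Cascade's Prolog implementation): goals are sought quantities, and an exhaustive depth-first search with backtracking tries each equation that could produce the goal quantity. The data structures are assumptions made for the example.

```python
def solve(quantity, equations, known):
    """Depth-first search for a value of `quantity`.
    `equations` maps an output quantity to a list of
    (input_quantities, compute_fn) alternatives; `known` holds
    quantities whose values are already available."""
    if quantity in known:
        return known[quantity]
    for inputs, compute in equations.get(quantity, []):
        try:
            values = [solve(q, equations, known) for q in inputs]
        except LookupError:
            continue                      # backtrack: try the next equation
        known[quantity] = compute(*values)
        return known[quantity]
    raise LookupError(quantity)           # no equation yields this quantity

# Tiny usage example with a single projection equation (values hypothetical)
equations = {"F_Ax": [(("F_A",), lambda fa: 0.866 * fa)]}
print(solve("F_Ax", equations, {"F_A": 7.32}))
```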
13. Cascade Learning Processes
- Knowledge tuning
  - Analogical Search Control (ASC)
- When Cascade succeeds in computing a value for a sought quantity, it records a triple including the name of the problem, the sought quantity, and the equation that was used to compute the value
  - The caching process occurs automatically and frequently, every time a subgoal is achieved
- On subsequent problems, Cascade
  - Attempts to map the current problem quantities and relations to the analog problem
  - Searches for cached triples that mention problem analogs to the current problem, together with an analogous sought quantity
  - Attempts the retrieved equation before falling back on the default ordering of knowledge (if backtracking occurs; a sketch of this caching and retrieval cycle follows below)
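A minimal sketch of that caching-and-retrieval cycle, under assumed data structures; this is not Cascade's actual code, and how the problem-to-problem analogy mapping is built is left outside the sketch.

```python
asc_cache = []   # (problem_name, sought_quantity, equation) triples

def record_success(problem, quantity, equation):
    """Called automatically every time a subgoal (sought quantity) is achieved."""
    asc_cache.append((problem, quantity, equation))

def ordered_equations(problem, quantity, default_equations, analogy_map):
    """Return equations to try, preferring ones cached for the analogous
    (problem, quantity) pair and falling back on the default ordering."""
    analog = analogy_map.get((problem, quantity))
    preferred = [eq for (p, q, eq) in asc_cache if (p, q) == analog]
    rest = [eq for eq in default_equations if eq not in preferred]
    return preferred + rest
```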
14. Cascade Learning Processes
- Knowledge acquisition
  - Explanation-Based Learning of Correctness (EBLC)
- If Cascade cannot solve a problem (after exhaustive search), it begins the search again, this time attempting a repair at the first point where backtracking is encountered
- Repairs occur by attempting to apply relevant overly general rules to the problem
- On success, Cascade stores a specialization of the overly general rule with the rest of the task knowledge
- The rule learning process occurs deliberatively and infrequently, only after the model has recognized an impasse in problem solving (a sketch of this repair loop follows below)
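A minimal sketch of the flavor of that repair loop, with `solves_with` and `specialize` standing in for Cascade's own machinery (both names are assumptions introduced for this example):

```python
def attempt_repair(goal, overly_general_rules, solves_with, specialize):
    """Try overly general rules at the first backtracking point of a failed
    search; on success, return a specialization to add to task knowledge."""
    for rule in overly_general_rules:
        binding = solves_with(goal, rule)     # does this rule let the search finish?
        if binding is not None:
            return specialize(rule, binding)  # new rule joins long-term knowledge
    return None                               # impasse remains unresolved
```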
15. Cascade Learning Interactions
- Knowledge acquisition only works if the model is repairing the right gap in a potential solution space
- The model can be guided toward the right gap
  - By the directions in a worked example
  - By the quality of knowledge tuning
16. Cascade Learning Interactions
17. Cascade: Experimental Results
- No Analogical Search Control
  - Learns 3 correct rules
  - Solves 9 problems correctly
- No EBLC on examples
  - Learns 13 correct rules
  - Learns 4 incorrect rules
  - Solves 21 problems correctly (many using a backup transformational analogy strategy)
- ASC + EBLC
  - Learns 22 correct rules
  - Solves 23 problems correctly
18. GIPS: Typical Problem
(Illustration of the Sum strategy for simple addition)
19. GIPS: Modeling Goal
- Model how children independently invent the Min strategy with experience
  - Min is a more efficient strategy, suggesting that it may be produced primarily by knowledge tuning
  - However, there appear to be structural changes to the steps the children are taking to solve problems
20. GIPS: Knowledge Representation
- Task knowledge is represented as STRIPS-like operators with preconditions, constraints, add conditions, and delete conditions
- Problem-solving algorithm is flexible means-ends analysis
  - TRANSFORM goal: use features describing the current state and goal to retrieve a candidate operator to APPLY for the next step in the transformation
  - APPLY goal: execute the operator if possible, else set up a new TRANSFORM to the preconditions of the operator
- Retrieval/selection knowledge is encoded as probability estimates (for logical sufficiency and logical necessity) attached to each potential triggering feature for each operator
  - State and goal relations (a sketch of this operator representation follows below)
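A minimal sketch of that operator representation; the field names are assumptions for illustration, not GIPS's actual data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Operator:
    name: str
    preconditions: list   # relations that must hold before the operator can APPLY
    constraints: list     # relations that must not be violated
    add_list: list        # relations made true by execution
    delete_list: list     # relations made false by execution
    # Selection knowledge: per-feature (LS, LN) estimates used to decide
    # whether to retrieve this operator for a TRANSFORM goal.
    selection: dict = field(default_factory=dict)   # feature -> (LS, LN)
```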
21. Example: Bayesian Concept
- Liftable
    Feature             LS    LN
    size is small       3.0   0.0
    weight is light     2.0   0.3
    has handle          2.0   0.3
    attached to floor   0.0   3.0
    color is red        1.0   1.0
- Note: this example has propositional features, but features in GIPS are relational
- GIPS uses a graph-based maximal partial match procedure to map combinations of relations to propositions (a sketch of how LS/LN values combine follows below)
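As a rough illustration of how such LS/LN values could be used, here is a standard odds-likelihood combination over the table above; the prior odds value is an assumption, and the exact combination rule GIPS applies may differ.

```python
liftable = {
    "size is small":     (3.0, 0.0),
    "weight is light":   (2.0, 0.3),
    "has handle":        (2.0, 0.3),
    "attached to floor": (0.0, 3.0),
    "color is red":      (1.0, 1.0),   # LS = LN = 1.0: feature carries no evidence
}

def posterior_odds(present_features, table, prior_odds=1.0):
    """Multiply in LS for each feature observed true and LN for each
    feature observed false (prior_odds is an assumed starting value)."""
    odds = prior_odds
    for feature, (ls, ln) in table.items():
        odds *= ls if feature in present_features else ln
    return odds

# A small, light object with a handle that is not attached to the floor
print(posterior_odds({"size is small", "weight is light", "has handle"}, liftable))
```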
22. GIPS Learning Processes
- Knowledge tuning
  - Every time an APPLY goal leads to success or failure, GIPS updates the appropriate probability estimates for each state and goal feature present when the APPLY goal was created
    - A: Action A is the right thing to do next
    - F: Feature F is true in the problem situation
  - A similar process occurs every time an operator executes (or not); a sketch of this update follows below
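A minimal sketch of the kind of count-based update that could implement this tuning; the actual estimator GIPS uses is not shown on the slide, so the bookkeeping and formulas here are assumptions.

```python
from collections import defaultdict

# For each (operator, feature): how often the feature was present when the
# APPLY goal succeeded (A) or failed (not A), plus totals for each outcome.
counts = defaultdict(lambda: {"f_and_a": 0, "f_and_not_a": 0, "a": 0, "not_a": 0})

def update(operator, features_present, all_features, succeeded):
    """Record one APPLY outcome for every feature of the operator."""
    for f in all_features:
        c = counts[(operator, f)]
        c["a" if succeeded else "not_a"] += 1
        if f in features_present:
            c["f_and_a" if succeeded else "f_and_not_a"] += 1

def ls_ln(operator, feature, eps=1e-6):
    """Estimate LS = P(F|A)/P(F|not A) and LN = P(not F|A)/P(not F|not A)."""
    c = counts[(operator, feature)]
    p_f_given_a = c["f_and_a"] / max(c["a"], 1)
    p_f_given_not_a = c["f_and_not_a"] / max(c["not_a"], 1)
    ls = p_f_given_a / (p_f_given_not_a + eps)
    ln = (1 - p_f_given_a) / (1 - p_f_given_not_a + eps)
    return ls, ln
```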
23. GIPS Learning Processes
- Knowledge acquisition
  - When feature values for an operator's execution concept receive particularly strong logical necessity values, a deliberative process explicitly adds the new feature as a condition of the operator
  - Another process removes features from the operator conditions (a sketch of both revisions follows below)
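A minimal sketch of both revisions, using the Operator sketch given earlier; the threshold values are illustrative assumptions, not the ones GIPS uses.

```python
def revise_conditions(operator, ln_add_threshold=0.2, ln_drop_threshold=0.9):
    """Add features whose LN marks them as strongly necessary; drop features
    whose LN says their absence barely changes the odds."""
    for feature, (ls, ln) in operator.selection.items():
        if ln <= ln_add_threshold and feature not in operator.preconditions:
            operator.preconditions.append(feature)   # feature looks necessary
        elif ln >= ln_drop_threshold and feature in operator.preconditions:
            operator.preconditions.remove(feature)   # feature looks unnecessary
```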
24. The SUM-to-MIN Strategy Shift
25. GIPS Learning Interactions
- Bayesian updates happen continuously and automatically, leading to performance shifts based on retrieval of operators
- Based on accumulating evidence, the model periodically tries more drastic structural changes to operator preconditions, which have larger effects on subsequent retrieval patterns (because operator preconditions are used as subgoal retrieval cues and determine satisfaction of APPLY goals)
26. Lessons
- It is difficult to acquire new knowledge without first tuning old knowledge
- Tuning old knowledge implies that you have some old knowledge to tune
  - For complex learning, we need to focus on learning in the context of significant prior knowledge
- Tuning can help guide the search for building new operators (Cascade) as well as for adjusting the structural representations of existing operators (GIPS)
- You only want to acquire new knowledge after you have accumulated some evidence (from tuning) that the new knowledge is appropriate and useful