Title: Latent Problem Solving Analysis (LPSA): A computational theory of representation in complex, dynamic problem solving tasks
1Latent Problem Solving Analysis (LPSA): A
computational theory of representation in
complex, dynamic problem solving tasks
2Complex problem solving (CPS): definition
- dynamic, because early actions determine the environment in which subsequent decisions must be made, and features of the task environment may change independently of the solver's actions
- time-dependent, because decisions must be made at the correct moment in relation to environmental demands
- complex, in the sense that most variables are not related to each other in a one-to-one manner
3- Despite 10 years of research in the area, there is neither a clearly formulated specific theory nor is there an agreement on how to proceed with respect to the research philosophy. Even worse, no stable phenomena have been observed.
- (Funke, 1992, p. 25)
4"How similar are two participants' solutions?"
- For CPS there is no common, explicit theory to explain why a complex, dynamic situation is similar to any other situation, or how two slices of performance taken from a problem solving task can be compared quantitatively.
- This lack of formalized, analytical models is slowing down the development of theory in the field.
5Example of a complex, dynamic task: Firechief
(Omodei and Wearing, 1995)
7Problems with the classic 'problem space' approach
- Most theories of cognitive skill acquisition and procedural learning are based on two principles:
- The problem space hypothesis
- Representation of procedures as productions
8Problems with the classic 'problem space' approach
- The problem with the generation of the problem space
- The utility of the state space representation for tasks with inner dynamics is reduced, because in most CPS environments it is not possible to undo actions and prepare a different strategy
9Problems with the classic 'problem space' approach
- Classic problem solving theory used mainly verbal protocols as data. However, TALKING ALOUD INTERFERES WITH PERFORMANCE IN COMPLEX DYNAMIC TASKS (Dickson, McLennan, and Omodei, 2000)
- Independence (or very short-term dependence) of actions/states is assumed in some of the methods for representing performance. That is, the features that represent performance are local
10What is LPSA, and how does it relate to these problems
and other theories?
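11Latent Problem Solving Analysis (LPSA)
- $m(\mathrm{trial}) = f\big(m(sa_1), m(sa_2), \ldots, m(sa_n), \mathrm{context}\big)$
- Simplifying assumptions:
$m(\mathrm{trial}_1) = m(sa_{11}) + m(sa_{21}) + \cdots + m(sa_{n1})$
$m(\mathrm{trial}_2) = m(sa_{12}) + m(sa_{22}) + \cdots + m(sa_{n2})$
$\vdots$
$m(\mathrm{trial}_k) = m(sa_{1k}) + m(sa_{2k}) + \cdots + m(sa_{nk})$
- Where $sa$ is a state or action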
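The additive simplifying assumption can be illustrated with a minimal sketch. The action names and the 3-dimensional vectors below are hypothetical, chosen only to make the arithmetic visible; in LPSA the vectors would come from the derived problem space.

```python
import numpy as np

def trial_vector(item_vectors, trial):
    """Represent a trial as the sum of the vectors of its states/actions."""
    return np.sum([item_vectors[item] for item in trial], axis=0)

# Hypothetical vectors for three actions (illustrative values only).
item_vectors = {
    "Move_4_Copter": np.array([1.0, 0.0, 2.0]),
    "Drop_Water_4_Copter": np.array([0.5, 1.0, 0.0]),
    "Control_Fire_2_Truck": np.array([0.0, 3.0, 1.0]),
}

# A trial is an ordered series of actions; under this assumption order is ignored.
trial = ["Move_4_Copter", "Drop_Water_4_Copter", "Move_4_Copter"]
m_trial = trial_vector(item_vectors, trial)
print(m_trial)
```

Because the representation is a plain sum, two trials containing the same actions in a different order receive the same vector under this assumption.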
12Latent Problem Solving Analysis (LPSA)
- Complexity reduction: reducing the number of
dimensions in the space reduces the noise
13LSA
LPSA
The problem space is a metric space, where states
and trials are represented as vectors
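A minimal sketch of how states and trials might end up as vectors in a reduced space, assuming (as in LSA) a truncated singular value decomposition of an items-by-trials count matrix. The function name `reduce_space` and the toy matrix are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def reduce_space(counts, k):
    """Truncated SVD of an items-by-trials count matrix, keeping k dimensions."""
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    item_vectors = u[:, :k] * s[:k]        # one row per state/action
    trial_vectors = vt[:k, :].T * s[:k]    # one row per trial
    return item_vectors, trial_vectors

# Toy matrix: 4 states/actions (rows) by 3 trials (columns); values are made up.
counts = np.array([
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
])
item_vectors, trial_vectors = reduce_space(counts, k=2)
print(item_vectors.shape, trial_vectors.shape)
```

Keeping fewer dimensions than the rank of the matrix discards the smallest singular values, which is the sense in which the reduction is assumed to remove noise.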
14LPSA as a theory of representation in CPS tasks
- Applications: automatic landing technique assessment
- Expertise: effects of amount of practice
- Expertise: effects of amount of environmental structure
- Human similarity judgments
- Strategy changes
15Approaches to complexity: the ant and the beach
parable (Simon, 1967, 1981)
19- Unsupervised learning
- Empirical adjustment of a problem space
- Definition of a productivity mechanism and a similarity measure
- LPSA: addition and cosine
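The cosine similarity measure named above can be sketched as follows. The example vectors are hypothetical; returning 0.0 for a zero vector is a convention chosen here, not something the slides specify.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / norm) if norm > 0 else 0.0

# Hypothetical trial vectors in a 2-dimensional space.
same_direction = cosine(np.array([1.0, 0.0]), np.array([3.0, 0.0]))
orthogonal = cosine(np.array([1.0, 0.0]), np.array([0.0, 2.0]))
print(same_direction, orthogonal)
```

The cosine depends only on direction, so a long trial and a short trial with the same mix of actions are treated as highly similar.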
20LPSA solutions for the problems with the classic 'problem space' approach
- The problem with the generation of the problem space
- The utility of the state space representation for tasks with inner dynamics is reduced, because in most CPS environments it is not possible to undo actions and prepare a different strategy
LPSA proposes a mechanism to generate the problem space automatically
21LPSA solutions for the problems with the classic 'problem space' approach
- Classic problem solving theory used mainly verbal protocols as data. However, TALKING ALOUD INTERFERES WITH PERFORMANCE IN COMPLEX DYNAMIC TASKS (Dickson, McLennan, and Omodei, 2000)
- Independence (or very short-term dependence) of actions/states is assumed in some of the methods for representing performance. That is, the features that represent performance are local
LPSA uses log files and human judgments as data, but not concurrent verbal protocols
LPSA does not assume independence or short dependences between states/actions. Indeed, it uses the dependences of all of them simultaneously to derive the problem space. The features that represent performance are global
22Theoretical surroundings of Latent Problem
Solving Analysis
24Anderson (1978)
- Encoding processes
- Processes of internal transformation
- Decoding processes
25LPSA applied to model human judgments
26Main equivalence
Actions (e.g. Move_4_Copter_11_4_11_9_Forest_) correspond to words
Example trial:
1 Move_4_Copter_11_4_11_9_Forest_
2 Move_2_Truck_4_11_17_7_Clearing_
3 Drop_Water_4_Copter_11_9_Forest___
4 Move_3_Copter_8_6_10_11_Forest_
5 Move_1_Truck_4_14_18_10_Forest_
6 Drop_Water_3_Copter_10_11_Forest___
7 Move_4_Copter_11_9_21_8_Dam_
8 Move_3_Copter_10_11_12_14_Dam_
9 Control_Fire_2_Truck_17_7_Clearing___
10 Control_Fire_1_Truck_18_10_Forest___
11 Move_4_Copter_21_8_12_10_Clearing_
Participants' trials correspond to docs
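The actions-as-words, trials-as-documents equivalence suggests building an action-by-trial count matrix, much like a word-by-document matrix in LSA. The sketch below is one assumed way to do that; `build_matrix` is a hypothetical helper, and the two short trials reuse action tokens from the example only for illustration.

```python
from collections import Counter

import numpy as np

def build_matrix(trials):
    """Count how often each action token occurs in each trial (rows: actions, columns: trials)."""
    vocab = sorted({action for trial in trials for action in trial})
    row = {action: i for i, action in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(trials)))
    for col, trial in enumerate(trials):
        for action, count in Counter(trial).items():
            matrix[row[action], col] = count
    return vocab, matrix

# Two hypothetical trials built from tokens in the example above.
trials = [
    ["Move_4_Copter_11_4_11_9_Forest_", "Drop_Water_4_Copter_11_9_Forest___",
     "Move_4_Copter_11_4_11_9_Forest_"],
    ["Control_Fire_2_Truck_17_7_Clearing___", "Move_4_Copter_11_4_11_9_Forest_"],
]
vocab, matrix = build_matrix(trials)
print(vocab)
print(matrix)
```

Each whole token, coordinates included, is treated as one "word", so two moves of the same appliance to different cells count as different actions.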
27Firechief corpus
- Data from experiments 1 and 2 in Quesada et al. (2000), and from Cañas et al. (2003).
- Total: 3441 trials, 75,575 different actions
- The first 300 dimensions were used
28[Figure: matrix of actions (Action 1, Action 2, ...) by log files containing series of actions (Trial 1, Trial 2, Trial 3, ...): 57000 actions, 3400 log files]
29Three examples of performance
- First 8 actions in a trial
[Figure: examples 1, 2 and 3, with pairs labeled RELATED and NON RELATED]
30[Figure: example 1 on a grid with axes 0-15 and 0-24; labeled actions: CONTROL FIRE]
31[Figure: example 2 on a grid with axes 0-15 and 0-24; labeled actions: DROP WATER, CONTROL FIRE]
32[Figure: example 3 on a grid with axes 0-15 and 0-24; labeled actions: CONTROL FIRE]
34Possible way of comparison: exact matching of actions
- Exact matching: count the number of common actions in two files. The higher this number, the more similar they are
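A minimal sketch of the exact-matching comparison. It assumes that "common actions" means distinct action tokens present in both files; counting repeated occurrences would be an equally plausible reading. The trials are hypothetical.

```python
def exact_match_count(trial_a, trial_b):
    """Number of distinct action tokens that appear in both trials."""
    return len(set(trial_a) & set(trial_b))

# Hypothetical trials using tokens in the Firechief log format.
trial_a = ["Move_4_Copter_11_4_11_9_Forest_", "Drop_Water_4_Copter_11_9_Forest___"]
trial_b = ["Move_4_Copter_11_4_11_9_Forest_", "Move_2_Truck_4_11_17_7_Clearing_"]
trial_c = ["Move_4_Copter_11_4_11_8_Forest_"]  # same move, one cell off

print(exact_match_count(trial_a, trial_b))
print(exact_match_count(trial_a, trial_c))
```

The last comparison shows the weakness of this measure: a move that differs by a single grid cell contributes nothing to the count, even though the two actions are functionally close.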
37Possible way of comparison: transitions between actions
- Count the number of transitions between actions in two files. Create matrices, and correlate them
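The transition-based comparison can be sketched as below, under the assumption that actions are first collapsed into a small set of types and that the two transition matrices are flattened and compared with a Pearson correlation. The type names and trials are hypothetical.

```python
import numpy as np

def transition_matrix(trial, types):
    """Count transitions from each action type (row) to the next action type (column)."""
    index = {t: i for i, t in enumerate(types)}
    matrix = np.zeros((len(types), len(types)))
    for current, following in zip(trial, trial[1:]):
        matrix[index[current], index[following]] += 1
    return matrix

def transition_similarity(trial_a, trial_b, types):
    """Pearson correlation between the flattened transition matrices of two trials."""
    a = transition_matrix(trial_a, types).ravel()
    b = transition_matrix(trial_b, types).ravel()
    return float(np.corrcoef(a, b)[0, 1])

types = ["Move", "Drop_Water", "Control_Fire"]
trial_a = ["Move", "Drop_Water", "Move", "Control_Fire"]
trial_b = ["Control_Fire", "Control_Fire", "Move"]
print(transition_matrix(trial_a, types))
print(transition_similarity(trial_a, trial_b, types))
```

Only first-order transitions are counted, so the measure captures local dependences; the correlation is undefined (NaN) when a trial has no transitions or a uniform matrix.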
45LPSA - Human Judgments correlation
46Human Judgment correlation
- If LSA captures similarity between complex problem solving performances in a meaningful way, any person with experience on the task could be used to validate it
- To test our assertions about LSA, we recruited 15 people and exposed them to the same amount of practice as our experimental participants, so they could learn the constraints of the task.
47Human Judgment correlation
- Replay trials, with different similarities
- People watched a randomly ordered series of trials, in a different order for each participant, which were selected as a function of the LSA cosines (pairs A, B, C, D, E and F, with cosines 0.75, 0.90, 0.53, 0.60, 0.12 and 0.06 respectively; pair G repeated one of the other pairs)
48Human Judgment correlation
- One of the pairs was presented twice to measure test-retest reliability. That is, for example, pair G was exactly the same as pair A for one participant, the same as pair F for another participant, etc.
- Participants filled out a form that presented all the possible pairings of stimuli
49Human Judgment correlation
FULL-SCREEN REPLAY OF THE TRIAL SELECTED, 8 TIMES
FASTER THAN NORMAL SPEED
50Human Judgment correlation Results
51Conclusions
- Applied: LSA is an automatic way of generating a problem space and comparing slices of performance in complex tasks. It scales up very well and does not depend on a priori task analyses
- Theoretical: LSA proposes that problem spaces are metric spaces that are derived from experience. Actions or states that are functionally related are represented in similar regions of the space. In this sense, problem solving is unified with theories of object recognition and semantics.
52LPSA as a theory of expertise in problem solving
53- Ebbinghaus approach: manipulating previous knowledge by eliminating it. Random assignment of participants to groups.
- Chase and Simon approach (expert-novice): manipulating previous knowledge by pre-selecting participants (no random assignment of participants to groups)
- Move complexity to the lab, and manipulate previous knowledge (exactly the same amount of practice and experience for all participants)
55Move complexity to the lab
- To simulate expertise environments in labs, we need tasks more complex than the standard ones:
- More representative
- Long learning curve
- Interesting enough to maintain motivation for a long period of time
56The DURESS Microworld
- Goals
- To keep each of the reservoir temperatures (T1 and T2) at a prescribed temperature (e.g., 40 °C and 20 °C, respectively)
- To satisfy the current mass (water) output demand (5 liters per second and 7 liters per second, respectively)
63DURESS
- Christoffersen, Hunter, and Vicente (1996, 1997, 1998): 6-month longitudinal experiment using DURESS II. 225 trials, with different goal values. Every participant received exactly the same kind of trials.
- However, the analysis was mostly qualitative. Not without good reason
64Example DURESS II protocol
34 variables, governed by mass and energy
conservation laws
65Main equivalence
States (format TR1_TR2_MO1_MO2_ (), e.g. 40_20_15_7_ ()) correspond to words
Example trial:
35_10_15_6_ ()
35_10_15_6_ ()
36_12_15_6_ ()
36_12_15_6_ ()
36_13_15_6_ ()
38_15_15_7_ ()
38_15_15_7_ ()
39_18_15_7_ ()
40_18_15_7_ ()
40_20_15_7_ ()
40_20_15_7_ ()
Participants' trials correspond to docs
66[Figure: matrix of states (State 1, State 2, ...) by log files containing series of states (Trial 1, Trial 2, Trial 3, ...): 57000 states, 1151 log files]
67Current theories of expertise
- Constraint Attunement Hypothesis (CAH)
- Vicente and Wang (1998)
- Long Term Working Memory (LTWM)
- Ericsson and Kintsch (1995)
- EPAM IV
- (e.g., Gobet, Richman, Staszewski and Simon,
1997)
68Current theories of expertise
PRODUCT THEORY
- Constraint Attunement Hypothesis (CAH)
- Vicente and Wang (1998)
PROCESS THEORIES
- Long Term Working Memory (LTWM)
- Ericsson and Kintsch (1995)
- EPAM IV
- (e.g., Gobet, Richman, Staszewski and Simon, 1997)
69- Ebbinghaus's approach: manipulating previous knowledge by constancy (0). Random assignment of participants to groups.
- Chase and Simon approach (expert-novice): manipulating previous knowledge by pre-selecting participants (no random assignment of participants to groups)
- Move complexity to the lab, and manipulate previous knowledge by constancy (exactly the same amount of practice and experience for all participants).
70LTWM (Ericsson and Kintsch, 1995)
- STM accounts for working memory in unfamiliar activities but does not appear to provide sufficient storage capacity for working memory in skilled complex activities (p. 220)
- LTWM is acquired in particular domains to meet specific demands imposed by a given activity on storage and retrieval. LTWM is task specific.
71LTWM (Ericsson and Kintsch, 1995)
- Intense practice in a domain creates retrieval structures: associations between the current context and some parts of LTM that can be retrieved almost immediately without effort (example: SF and digits).
- LTWM permits rapid and reliable reinstatement of a context after interruption without a decrease in performance.
72LTWM (Ericsson and Kintsch, 1995)
- LTWM theory proposes that LTWM is generated dynamically by the cues that are present in short term memory.
- During text comprehension, where the average human adult is an expert, retrieval structures are retrieving propositions from LTM and merging them with the ones derived from text.
73CAH (Vicente and Wang, 1998)
- Contrary to what process theories maintain, the Constraint Attunement Hypothesis (CAH) does not commit to a particular psychological mechanism to explain the phenomenon of expertise.
- How should one represent the constraints that the environment (i.e., the problem domain) places on expertise?
- Under what conditions will there be an expertise advantage?
- What factors determine how large the advantage can be?
74CAH (Vicente and Wang, 1998)
- Describing the constraints in the environment is
the task of an expertise theory.
75CAH (Vicente and Wang, 1998): the Abstraction Hierarchy
PURPOSE: Win the game
STRATEGIES: Score at least 2 runs in this inning
TACTICS: Advance all by one base; alternative tactics to achieve the strategy above
FUNCTIONS: Hit, Run, Run
PLAYERS: Batter, 1st base runner, 2nd base runner
76CAH (Vicente and Wang, 1998): the Abstraction Hierarchy
FUNCTIONAL: Overall system goals (how much water each reservoir is outputting, and at which temperature): 'D1','D2','T1','T2'
ABSTRACT: Conservation of mass and energy for each reservoir (how much mass and energy is entering and leaving the reservoir): 'MI1', 'MO1', 'EI1', 'EO1', 'M1', 'E1',
GENERALIZED: Flows and storage of heat: 'FA','FA1','FA2','HTR1
PHYSICAL: Settings of valves, pumps, and heaters: 'PA','PB','VA','VA1','VA2,
Continuum of abstraction, means-ends relationship between levels
78LTWM vs. CAH
- LTWM claims that the magnitude of expertise effects is related to the level of attained skill and to the amount of relevant prior experience
- CAH argues that this claim is incomplete. Expertise effects in memory recall are also determined by the amount of structure in the domain (and by active attunement to that structure)
- LPSA is sensitive both to relevant previous practice and to the amount of structure in the domain
79Design and predictions
80[Figure: a trial divided into its first 3/4 and its last 1/4, with the last 1/4 marked '?']
83Predictions
- Only huge amounts of experience with the system would enable the actor (human or model) to make accurate predictions of the last quarter of the trial
- Sparse practice should clearly lead to poor prediction
- Only structured environments should show the expertise advantage. Following CAH, the expert (human or model) should not do well in a completely unstructured environment
84Results
85Three years of experience with DURESS
Average cosine between the fourth quarter of a target trial and the fourth quarter of its 10 nearest neighbors, when the first three quarters are used to retrieve the neighbors
86Six months of experience with DURESS
Average cosine between the fourth quarter of a target trial and the fourth quarter of its 10 nearest neighbors, when the first three quarters are used to retrieve the neighbors
87Three years of experience in a DURESS with no constraints (random states)
Average cosine between the fourth quarter of a target trial and the fourth quarter of its 10 nearest neighbors, when the first three quarters are used to retrieve the neighbors
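The measure reported on these result slides can be sketched as follows, assuming each trial has already been split into a vector for its first three quarters and a vector for its fourth quarter. The function names and the tiny 2-dimensional corpus are hypothetical, and all vectors are assumed to be non-zero.

```python
import numpy as np

def cosines(vector, matrix):
    """Cosine between one vector and every row of a matrix (rows must be non-zero)."""
    return (matrix @ vector) / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(vector))

def fourth_quarter_score(first_parts, last_parts, target, n_neighbors=10):
    """Retrieve neighbors from the first three quarters, then compare fourth quarters."""
    others = np.array([i for i in range(len(first_parts)) if i != target])
    sims = cosines(first_parts[target], first_parts[others])
    neighbors = others[np.argsort(sims)[::-1][:n_neighbors]]
    score = cosines(last_parts[target], last_parts[neighbors]).mean()
    return float(score), neighbors

# Hypothetical corpus of three trials (one row per trial).
first_parts = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
last_parts = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
score, neighbors = fourth_quarter_score(first_parts, last_parts, target=0, n_neighbors=1)
print(score, neighbors)
```

A high average cosine means that trials which start alike also end alike, which is what the predictions expect only with extensive practice in a structured environment.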
88Conclusions
89conclusions
- In LTWM's original formulation the retrieval structures were under-specified. In LPSA, the basic mechanisms postulated are defined computationally.
- In CAH's original formulation, the representation of the environmental constraints (its most central assertion) was under-specified too. LPSA proposes an automatic mechanism to represent the statistical regularities of the environment.
90conclusions
- LPSA can explain the main assertions of both LTWM and CAH
- LTWM claims that the magnitude of expertise effects is related to the level of attained skill and to the amount of relevant prior experience
- CAH claims that expertise effects in memory recall are also determined by the amount of structure in the domain (and by active attunement to that structure)
- Better yet, LPSA proposes both processes and representational structures
91conclusions
- What does this mean for theorizing about problem solving?
- As in LTWM for text comprehension, we propose that in expert problem solving the current context automatically and effortlessly retrieves past knowledge, and adapts it to the current situation.
- This retrieval is specific to the domain of expertise, and requires a long period of practice. A short period will not do.
- This retrieval is only possible in domains that show constraints that the expert can use (attune to).
92conclusions
- GENERALITY: the fact that the same mechanism, with the very same underlying assumptions, can be used for language and problem solving is interesting per se
- In LTWM, the retrieval structures for chess are different from the ones proposed for text comprehension
- In CAH, two AHs for two different tasks are different too
- In LPSA, any space for any task is a vector space.
93Automatic Landing Technique Assessment using
Latent Problem Solving Analysis (LPSA)
94The problem
- There is currently no methodology to automatically assess landing technique in a commercial aircraft or a flight simulator. Instructors are a significant cost for training and evaluation of pilots, and the use of instructors also incorporates a subjective component that may vary from pilot to pilot.
- The advantages of automatic landing technique evaluation are many: (1) Reduced cost of the evaluation. (2) Increased objectivity in the evaluation. (3) Decreased influence of the instructor. (4) Perfect test-retest reliability. (5) It is always available and can be triggered by the trainee at will. (6) The model can rate as many landings as time allows, etc.
95A solution: Latent Problem Solving Analysis (LPSA)
- Latent Problem Solving Analysis (LPSA; Quesada, Kintsch and Gomez, 2002) is based on Latent Semantic Analysis (LSA; Landauer and Dumais, 1997). Instead of using word occurrence statistics and huge samples of text, LPSA uses a representative amount of activity in controlling dynamic systems (actions or states).
- Like words, states and actions appear in particular contexts but not in others. Some states and actions are interchangeable, being functional synonyms. Given the right algorithms and a sufficient number of logged trials, a problem space can be derived in the same way that semantic spaces are.
- In this application of LPSA to landing technique evaluation, we assume that an expert uses her past knowledge to produce landing ratings by comparing the current situation to past ones, and generates an expanded representation of the environment by composing the past situations that are most similar to the current one.
98Complex, dynamic tasks are intractable when considered as a whole
- We need to perform complexity reduction, in a mostly automatic way
- The triangulation technique
- Dimensionality reduction (LPSA)
99The triangulation technique
100Complexity reduction (I): variable selection
using differently informed experts
101Criteria used by the experts, with their levels
Flare initiation altitude: Too high | Correct | Too low
Thrust reduction: Too fast | Correct | Too slow
Pitch angle: All the way too high | Partly too high | Correct | Partly too low | All the way too low
Overall landing score: 1 | 2 | 3 | 4 | 5
102Complexity reduction (II): Using SVD, the problem space is a vector space
- A state is a string of text consisting of the values of each variable (those used by the reduced information expert) joined by underscores, to make it a single token, like
- time tag_vertical acceleration_Radio altitude_Thrust
- A landing is a collection of these states. The variables were sampled ten times per second, and the landing time was approximately 15 seconds, so each landing contained about 150 states
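A minimal sketch of how such state tokens and a landing of about 150 states might be produced. The variable order follows the token format on the slide; the function name `state_token` and all numeric values are made up for illustration and do not come from real flight data.

```python
def state_token(time_tag, vertical_acceleration, radio_altitude, thrust):
    """Join the variable values with underscores so that a state is a single token."""
    values = (time_tag, vertical_acceleration, radio_altitude, thrust)
    return "_".join(str(value) for value in values)

SAMPLES_PER_SECOND = 10   # sampling rate stated on the slide
LANDING_SECONDS = 15      # approximate landing duration stated on the slide
n_states = SAMPLES_PER_SECOND * LANDING_SECONDS

# Hypothetical landing: constant acceleration and thrust, slowly decreasing altitude.
landing = [state_token(t, 1.0, 50 - t // 3, 60) for t in range(n_states)]
print(n_states, landing[0], landing[-1])
```

Because every distinct combination of values becomes a different token, finely sampled continuous variables produce a very large vocabulary, which is one reason a dimensionality reduction step is needed afterwards.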
105Results
106Results
107Results
- The agreement between the two landing raters is not very high; however, it is similar to the agreement among other experts, such as Clinical Psychologists (0.40), Stockbrokers (<0.32), Polygraphers (0.32) and Livestock Judges (0.50). Their agreement is lower than that reported for Weather Forecasters (0.95), Pathologists (0.55), Auditors (0.76) and Grain Inspectors (0.60) (Shanteau, 2001).
- The correlation between the model and the reduced information expert is about the same as the correlation between the two humans (0.48 vs. 0.46). Note that the ceiling for the model is the correlation between two humans doing the task: a model that correlates with one human better than the between-human correlation is under suspicion. The correlation for the complete information expert was .39, even though the model was not trained to mimic him.
108Results
109Results
- Note that the only criterion where the model correlates with any of the experts more than they correlate with each other is thrust reduction. Thrust reduction seems to be a very difficult feature to judge, since the agreement between human experts is the lowest (0.27) and it is also the one in which the reduced information expert obtains the lowest test-retest reliability (0.538, see Table 1 4 on page 119).
- All the polychoric correlations between the reduced information expert and the model were significant (p .002), as were the correlations between the complete information expert and the model.
- The equivalent model without dimensionality reduction (400 dimensions, 5 neighbors, no weighting, no timestamp) produced correlation values of 0.37, 0.08, 0.57 and 0.50 for the criteria used above, respectively.
110Results no-constraints corpus
111Conclusions
- Previously, LPSA has proved to be a powerful theory to model behavior in complex, dynamic problem-solving tasks, and has been proposed as a theory of expertise (see Quesada, unpublished). However, this is the first time that LPSA has been used to develop technology that can be used in industrial applications.
- In previous work, we have presented an experience-based approach to problem solving. Problem solving is viewed as the extraction of useful representations from a corpus of situations. The creation of the representations is a primarily bottom-up, unsupervised process. It is proposed that the problem space can be viewed as a vector space. People use their past knowledge to perform complex, dynamic tasks by comparing the current situation to past ones, and generate an expanded representation of the environment by composing the past situations that are most similar to the current one. In complex dynamic situations, this intuitive, pattern-based system can have a very important role.
112Conclusions
- It is possible to construct systems that grade landing technique automatically as well as humans, if we consider that the limit of performance for such a model is the human-human agreement. The human-human correlation was low (0.46) but within the range reported for some other areas (Shanteau, 2001). In a large-scale application of the model (for a training and evaluation department, for example), we can imagine that 500 pilots need to be evaluated. In that situation, only a small proportion of randomly sampled landings (that can be kept from previous sessions) must be evaluated by humans; the rest is performed by the system. Since the model has different landing criteria, it could emit recommendations such as 'In this landing, you initiated the flare too high, and reduced the thrust too late. Keep that in mind for the next one.'
113Conclusions
- A direct consequence of the availability of a system like LPSA for the development of psychological theory is that some experiments that were prohibitive before could now be planned within budget. Since instructors are a scarce resource, an experimenter may decide that she cannot afford to run a particular, very promising experiment, because of the expenses associated with performance assessment. With an automatic and reliable method to perform the evaluation, more complex experiments could become feasible.