Latent Problem Solving Analysis (LPSA): A computational theory of representation in complex, dynamic problem solving tasks

Slides: 114

Provided by: JoseQu4

Transcript and Presenter's Notes

1
Latent Problem Solving Analysis (LPSA) A
computational theory of representation in
complex, dynamic problem solving tasks
2
Complex problem solving (CPS) definition

dynamic, because early actions determine the
environment in which subsequent decisions must be
made, and features of the task environment may
change independently of the solver's actions;
time-dependent, because decisions must be made at
the correct moment in relation to environmental
demands; and
complex, in the sense that most variables are
not related to each other in a one-to-one manner

"Despite 10 years of research in the area, there
is neither a clearly formulated specific theory
nor is there an agreement on how to proceed with
respect to the research philosophy. Even worse,
no stable phenomena have been observed"
(Funke, 1992, p. 25)

4
"How similar are two participants' solutions?"

For CPS there is no common, explicit theory to
explain why a complex, dynamic situation is
similar to any other situation or how two slices
of performance taken from a problem solving task
can possibly be compared quantitatively.
This lack of formalized, analytical models is
slowing down the development of theory in the
field.

5
Example of a complex, dynamic task: Firechief
(Omodei and Wearing, 1995)
7
Problems with the classic 'problem space'
approach

Most of the theories about cognitive skill
acquisition and procedural learning are based on
two principles:
The problem space hypothesis
Representation of procedures as productions

8
Problems with the classic 'problem space'
approach

The problem with the generation of the problem
space
The utility of the state space representation for
tasks with inner dynamics is reduced because in
most CPS environments it is not possible to undo
the actions, and prepare a different strategy

9
Problems with the classic 'problem space'
approach

The classic problem solving theory used mainly
verbal protocols as data. However, TALKING ALOUD
INTERFERES WITH PERFORMANCE IN COMPLEX DYNAMIC TASKS
(Dickson, McLennan and Omodei, 2000)
Independence (or very short-term dependences) of
actions/states is assumed in some of the methods
for representing performance. That is, the
features that represent performance are local

10
What LPSA is and how it relates to these problems
and other theories
11
Latent Problem Solving Analysis (LPSA)

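$m(\mathrm{trial}) = f\{m(sa_1), m(sa_2), \ldots, m(sa_n), \mathrm{context}\}$
Simplifying assumptions:
$m(\mathrm{trial}_1) = m(sa_{11}) + m(sa_{21}) + \cdots + m(sa_{n1})$
$m(\mathrm{trial}_2) = m(sa_{12}) + m(sa_{22}) + \cdots + m(sa_{n2})$
$\vdots$
$m(\mathrm{trial}_k) = m(sa_{1k}) + m(sa_{2k}) + \cdots + m(sa_{nk})$
where $sa$ is a state or action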
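
As a minimal sketch of the simplifying assumption on this slide, the meaning of a trial can be computed as the sum of the vectors of its states or actions, and two trials can then be compared with the cosine. The three-dimensional vectors and the names `space`, `add_vectors` and `cosine` below are hypothetical, chosen only for illustration; they are not taken from the presentation.

```python
import math

def add_vectors(vectors):
    """Element-wise sum of equal-length vectors (the additive composition rule)."""
    return [sum(components) for components in zip(*vectors)]

def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical vectors for three states/actions (sa1, sa2, sa3).
space = {
    "sa1": [1.0, 0.0, 0.0],
    "sa2": [0.0, 1.0, 0.0],
    "sa3": [0.0, 0.0, 1.0],
}

# m(trial) = m(sa_1) + m(sa_2) + ... + m(sa_n)
trial_a = add_vectors([space["sa1"], space["sa2"]])
trial_b = add_vectors([space["sa1"], space["sa2"], space["sa2"]])
trial_c = add_vectors([space["sa3"]])

print(cosine(trial_a, trial_b))  # trials sharing states/actions: high cosine
print(cosine(trial_a, trial_c))  # trials sharing nothing: cosine of 0.0
```

The general form on the slide also lists context as an argument of f; the sketch covers only the simplified additive case.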

12
Latent Problem Solving Analysis (LPSA)

Complexity reduction: reducing the number of
dimensions in the space reduces the noise

13
LSA
LPSA
The problem space is a metric space, where states
and trials are represented as vectors
14
LPSA as a theory of representation in CPS tasks

Applications: automatic landing technique
assessment

Expertise: effects of amount of practice
Expertise: effects of amount of environmental
structure

Human similarity judgments
Strategy changes

15
Approaches to complexity: the ant and the beach
parable (Simon, 1967, 1981)
19

Unsupervised learning
Empirical adjustment of a problem space
Definition of a productivity mechanism and a
similarity measure.
In LPSA: addition and cosine.

20
LPSA solutions for the problems with the classic
'problem space' approach

The problem with the generation of the problem
space
The utility of the state space representation for
tasks with inner dynamics is reduced because in
most CPS environments it is not possible to undo
the actions, and prepare a different strategy

LPSA proposes a mechanism to generate
the problem space automatically
21
LPSA solutions for the problems with the classic
'problem space' approach

The classic problem solving theory used mainly
verbal protocols as data. However, TALKING ALOUD
INTERFERES WITH PERFORMANCE IN COMPLEX DYNAMIC TASKS
(Dickson, McLennan and Omodei, 2000)
Independence (or very short-term dependences) of
actions/states is assumed in some of the methods
for representing performance. That is, the
features that represent performance are local

LPSA uses log files and human judgments as data,
but not concurrent verbal protocols
LPSA does not assume independence or short
dependences between states/actions. Indeed, it
uses the dependences of all of them
simultaneously to derive the problem space. The
features that represent performance are global
22
Theoretical surroundings of Latent Problem
Solving Analysis
24
Anderson (1978)

Encoding processes
Processes of internal transformation
Decoding processes

25
LPSA applied to model human judgments
26
Main equivalence
Actions (e.g., Move_4_Copter_11_4_11_9_Forest_) = Words
Participants' trials = Docs
Example trial (one action per line):
1 Move_4_Copter_11_4_11_9_Forest_
2 Move_2_Truck_4_11_17_7_Clearing_
3 Drop_Water_4_Copter_11_9_Forest___
4 Move_3_Copter_8_6_10_11_Forest_
5 Move_1_Truck_4_14_18_10_Forest_
6 Drop_Water_3_Copter_10_11_Forest___
7 Move_4_Copter_11_9_21_8_Dam_
8 Move_3_Copter_10_11_12_14_Dam_
9 Control_Fire_2_Truck_17_7_Clearing___
10 Control_Fire_1_Truck_18_10_Forest___
11 Move_4_Copter_21_8_12_10_Clearing_
27
Firechief corpus

Data from experiments 1 and 2 in Quesada et al.
(2000), and from Cañas et al. (2003).
Total: 3441 trials, 75,575 different actions
The first 300 dimensions were used
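
The pipeline implied here (actions as words, trials as documents, then keeping only the first dimensions of the decomposition) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the miniature corpus and action names are hypothetical, raw counts are used without any weighting, and only k = 2 dimensions are kept where the presentation reports 300. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical miniature corpus: each trial (document) is a list of action tokens (words).
trials = [
    ["Move_Copter", "Drop_Water", "Move_Truck", "Control_Fire"],
    ["Move_Copter", "Drop_Water", "Move_Copter", "Drop_Water"],
    ["Move_Truck", "Control_Fire", "Move_Truck", "Control_Fire"],
]

vocabulary = sorted({action for trial in trials for action in trial})
row_of = {action: i for i, action in enumerate(vocabulary)}

# Action-by-trial count matrix: one row per action, one column per trial.
counts = np.zeros((len(vocabulary), len(trials)))
for j, trial in enumerate(trials):
    for action in trial:
        counts[row_of[action], j] += 1

# Singular value decomposition; keep only the k largest singular values.
k = 2
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
action_vectors = U[:, :k] * s[:k]    # one k-dimensional row per action
trial_vectors = Vt[:k, :].T * s[:k]  # one k-dimensional row per trial

def cosine(u, v):
    """Cosine between two vectors in the reduced space."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(trial_vectors[0], trial_vectors[1]))
print(cosine(trial_vectors[1], trial_vectors[2]))
```

In this toy corpus the first trial mixes the actions of the other two, so it stays moderately similar to both, while the second and third trials share no actions and end up orthogonal.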

28
[Figure: matrix of actions by log files (trials).
Each log file contains a series of actions;
57000 actions, 3400 log files.]
29
Three examples of performance

First 8 actions in a trial

[Figure: three example trials (1, 2 and 3), with
pairs labeled RELATED and NON RELATED.]
30
[Figure: example 1. Action labels: CONTROL FIRE (twice).]
31
[Figure: example 2. Action labels: DROP WATER, CONTROL FIRE (twice).]
32
[Figure: example 3. Action labels: CONTROL FIRE (twice).]
34
Possible way of comparison: exact matching of
actions

Exact matching: count the number of common
actions in two files. The higher this number, the
more similar they are
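
A minimal sketch of exact matching follows. It assumes that repeated actions are counted once per shared occurrence (a multiset intersection); the slide does not specify this detail. The function name `common_action_count` is hypothetical, and the action strings are borrowed from the example trial shown earlier.

```python
from collections import Counter

def common_action_count(trial_a, trial_b):
    """Number of action tokens the two trials share, counting repeats."""
    overlap = Counter(trial_a) & Counter(trial_b)  # multiset intersection
    return sum(overlap.values())

trial_a = [
    "Move_4_Copter_11_4_11_9_Forest_",
    "Drop_Water_4_Copter_11_9_Forest___",
    "Move_4_Copter_11_9_21_8_Dam_",
]
trial_b = [
    "Move_4_Copter_11_4_11_9_Forest_",
    "Move_2_Truck_4_11_17_7_Clearing_",
    "Drop_Water_4_Copter_11_9_Forest___",
]

print(common_action_count(trial_a, trial_b))  # 2
```

Under this scheme two actions that differ in a single coordinate are treated as entirely different tokens.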

37
Possible way of comparison: transitions between
actions

Count the number of transitions between actions in
two files. Create matrices, and correlate them
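
The transition-based comparison can be sketched as below. The sketch assumes that actions are first reduced to coarse types (the labels `Move`, `Drop_Water` and `Control_Fire` are hypothetical simplifications), that each trial yields a square matrix of transition counts, and that the two matrices are compared with a Pearson correlation over their flattened cells. The slide does not say which correlation was used.

```python
import math

def transition_counts(actions, labels):
    """Square matrix of counts: rows are the current action, columns the next one."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for current, following in zip(actions, actions[1:]):
        matrix[index[current]][index[following]] += 1
    return matrix

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def transition_similarity(trial_a, trial_b):
    """Correlate the flattened transition matrices of two trials."""
    labels = sorted(set(trial_a) | set(trial_b))
    flat_a = [c for row in transition_counts(trial_a, labels) for c in row]
    flat_b = [c for row in transition_counts(trial_b, labels) for c in row]
    return pearson(flat_a, flat_b)

trial_a = ["Move", "Drop_Water", "Move", "Control_Fire"]
trial_b = ["Move", "Drop_Water", "Move", "Drop_Water"]
trial_c = ["Control_Fire", "Control_Fire", "Control_Fire", "Move"]

print(transition_similarity(trial_a, trial_b))  # shares two transition types
print(transition_similarity(trial_a, trial_c))  # shares none
```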

45
LPSA - Human Judgments correlation
46
Human Judgment correlation

If LSA captures similarity between complex
problem solving performances in a meaningful way,
any person with experience on the task could be
used for validation.
To test our assertions about LSA, we recruited 15
people and exposed them to the same amount of
practice as our experimental participants, so
they could learn the constraints of the task.

47
Human Judgment correlation

Replay trials, with different similarities
People watched a randomly ordered series of
trials, in a different order for each
participant. The trials were selected as a
function of the LSA cosines (pairs A, B, C, D, E
and F, with cosines 0.75, 0.90, 0.53, 0.60, 0.12
and 0.06 respectively; pair G repeated one of
these pairs)

48
Human Judgment correlation

One of the pairs was presented twice to measure
test-retest reliability. That is, for example,
pair G was exactly the same as pair A for one
participant, the same as pair F for another
participant, etc. Participants filled out a form
that presented all the possible pairings of the
stimulus pairs

49
Human Judgment correlation
FULL-SCREEN REPLAY OF THE TRIAL SELECTED, 8 TIMES
FASTER THAN NORMAL SPEED
50
Human Judgment correlation Results
51
Conclusions

Applied: LSA is an automatic way of generating a
problem space and comparing slices of performance
in complex tasks. It scales up very well and does
not depend on a priori task analyses.
Theoretical: LSA proposes that problem spaces are
metric spaces that are derived from experience.
Actions or states that are functionally related
are represented in similar regions of the space.
In this sense, problem solving is unified with
theories of object recognition and semantics.

52
LPSA as a theory of expertise in problem solving
53

Ebbinghaus approach: manipulating previous
knowledge by eliminating it. Random assignment of
participants to groups.
Chase and Simon approach (expert-novice):
manipulating previous knowledge by
pre-selecting participants (no random assignment
of participants to groups)
Move complexity to the lab, and manipulate
previous knowledge (exact amount of practice
and experience for all participants)

55
Move complexity to the lab

To simulate expertise environments in labs, we
need tasks more complex than the standard ones
More representative
Long learning curve
Interesting enough to keep the motivation for a
long period of time

56
The DURESS Microworld

Goals
To keep each of the reservoir temperatures (T1
and T2) at a prescribed temperature (e.g., 40 °C
and 20 °C, respectively)
To satisfy the current mass (water) output demand
(5 liters per second and 7 liters per second,
respectively)

63
DURESS

Christoffersen, Hunter and Vicente (1996, 1997,
1998): 6-month longitudinal experiment using
DURESS II. 225 trials, with different goal
values. Every participant received exactly the
same kind of trials.
However, the analysis was mostly qualitative, and
not without good reason

64
Example DURESS II protocol
34 variables, governed by mass and energy
conservation laws
65
Main equivalence
States (TR1_TR2_MO1_MO2_(), e.g., 40_20_15_7_()) = Words
Participants' trials = Docs
Example trial (one state per line):
35_10_15_6_()
35_10_15_6_()
36_12_15_6_()
36_12_15_6_()
36_13_15_6_()
38_15_15_7_()
38_15_15_7_()
39_18_15_7_()
40_18_15_7_()
40_20_15_7_()
40_20_15_7_()
66
[Figure: matrix of states by log files (trials).
Each log file contains a series of states;
57000 states, 1151 log files.]
67
Current theories of expertise

Constraint Attunement Hypothesis (CAH)
Vicente and Wang (1998)

Long Term Working Memory (LTWM)
Ericsson and Kintsch (1995)

EPAM IV
(e.g., Gobet, Richman, Staszewski and Simon,
1997)

68
Current theories of expertise

Constraint Attunement Hypothesis (CAH)
Vicente and Wang (1998)

Long Term Working Memory (LTWM)
Ericsson and Kintsch (1995)

EPAM IV
(e.g., Gobet, Richman, Staszewski and Simon,
1997)

PRODUCT THEORY: CAH
PROCESS THEORIES: LTWM, EPAM IV
69

Ebbinghaus's approach: manipulating previous
knowledge by constancy (0). Random assignment of
participants to groups.
Chase and Simon approach (expert-novice):
manipulating previous knowledge by
pre-selecting participants (no random assignment
of participants to groups)
Move complexity to the lab, and manipulate
previous knowledge by constancy (exact amount
of practice and experience for all participants).

70
LTWM (Ericsson and Kintsch, 1995)

STM accounts for working memory in unfamiliar
activities but does not appear to provide
sufficient storage capacity for working memory in
skilled complex activities (p.220)
LTWM is acquired in particular domains to meet
specific demands imposed by a given activity on
storage and retrieval. LTWM is task specific.

71
LTWM (Ericsson and Kintsch, 1995)

Intense practice in a domain creates retrieval
structures: associations between the current
context and some parts of LTM that can be
retrieved almost immediately without effort
(example: SF and digits).
LTWM permits rapid and reliable reinstantiation
of a context after interruption without a
decrease in performance.

72
LTWM (Ericsson and Kintsch, 1995)

LTWM theory proposes that LTWM is generated
dynamically by the cues that are present in short
term memory.
During text comprehension, where the average
human adult is an expert, retrieval structures
are retrieving propositions from LTM and merging
them with the ones derived from text.

73
CAH (Vicente and Wang, 1998)

Contrary to what process theories maintain, the
Constraint Attunement Hypothesis (CAH) does not
commit to a particular psychological mechanism to
explain the phenomenon of expertise.
How should one represent the constraints that the
environment (i.e., the problem domain) places on
expertise?
Under what conditions will there be an expertise
advantage?
What factors determine how large the advantage
can be?

74
CAH (Vicente and Wang, 1998)

Describing the constraints in the environment is
the task of an expertise theory.

75
CAH (Vicente and Wang, 1998): the Abstraction
Hierarchy
PURPOSE: Win the game
STRATEGIES: Score at least 2 runs in this inning
TACTICS: Advance all by one base (alternative
tactics to achieve the strategy above)
FUNCTIONS: Hit, Run, Run
PLAYERS: Batter, 1st base runner, 2nd base runner
76
CAH (Vicente and Wang, 1998): the Abstraction
Hierarchy
FUNCTIONAL: overall system goals (how much water
each reservoir is outputting, and at which
temperature): 'D1', 'D2', 'T1', 'T2'
ABSTRACT: conservation of mass and energy for each
reservoir (how much mass and energy is entering
and leaving the reservoir):
'MI1', 'MO1', 'EI1', 'EO1', 'M1', 'E1', ...
GENERALIZED: flows and storage of heat:
'FA', 'FA1', 'FA2', 'HTR1', ...
PHYSICAL: settings of valves, pumps, and heaters:
'PA', 'PB', 'VA', 'VA1', 'VA2', ...
Continuum of abstraction; means-ends
relationship between levels
78
LTWM vs. CAH

LTWM claims that the magnitude of expertise
effects is related to the level of attained
skill and to the amount of relevant prior
experience
CAH argues that this claim is incomplete.
Expertise effects in memory recall are also
determined by the amount of structure in the
domain (and by active attunement to that
structure)
LPSA is sensitive both to relevant previous
practice and to the amount of structure in the
domain

79
Design and predictions
80
[Figure: a trial split into its first 3/4 and its
last 1/4; the last quarter (?) is to be predicted.]
83
Predictions

Only huge amounts of experience with the system
would enable the actor (human or model) to make
accurate predictions of the last quarter of the
trial
Sparse practice should clearly lead to poor
prediction
Only structured environments should show the
expertise advantage. Following CAH, the expert
(human or model) should not do well in a
completely unstructured environment

84
Results
85
Three years of experience with DURESS
Average cosine between the fourth quarter of a
target trial and the fourth quarter of its 10
nearest neighbors, when the first three quarters
are used to retrieve the neighbors
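
The measure in this caption can be sketched as follows, assuming each trial has already been turned into two vectors: one for its first three quarters and one for its fourth quarter. The function name `predictability`, the tuple layout and the tiny two-dimensional vectors are hypothetical, and k = 2 is used in the example where the slide reports 10 neighbors.

```python
import math

def cosine(u, v):
    """Cosine between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predictability(target, corpus, k=10):
    """Average cosine between the target's fourth quarter and the fourth
    quarters of its k nearest neighbors, where neighbors are retrieved
    using only the first three quarters.

    Each trial is a pair: (first_three_quarters_vector, fourth_quarter_vector).
    """
    if not corpus:
        raise ValueError("corpus must contain at least one trial")
    ranked = sorted(corpus, key=lambda trial: cosine(target[0], trial[0]), reverse=True)
    neighbors = ranked[:k]
    return sum(cosine(target[1], trial[1]) for trial in neighbors) / len(neighbors)

# Hypothetical past trials and target trial.
corpus = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.9, 0.1], [1.0, 0.1]),
    ([0.0, 1.0], [0.0, 1.0]),
]
target = ([1.0, 0.05], [1.0, 0.0])

score = predictability(target, corpus, k=2)
print(score)
```

A high score means that trials which started like the target also ended like it, which is the sense of prediction used in the predictions slide.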
86
Six months of experience with DURESS
Average cosine between the fourth quarter of a
target trial and the fourth quarter of its 10
nearest neighbors, when the first three quarters
are used to retrieve the neighbors
87
Three years of experience in a DURESS with no
constraints (random states)
Average cosine between the fourth quarter of a
target trial and the fourth quarter of its 10
nearest neighbors, when the first three quarters
are used to retrieve the neighbors
88
Conclusions
89
conclusions

In LTWM's original formulation, the retrieval
structures were under-specified. In LPSA, the
basic mechanisms postulated are defined
computationally.
In CAH's original formulation, the representation
of the environmental constraints (its most
central assertion) was under-specified too.
LPSA proposes an automatic mechanism to represent
the statistical regularities of the environment.

90
conclusions

LPSA can explain the main assertions of both
LTWM and CAH
LTWM claims that the magnitude of expertise
effects is related to the level of attained skill
and to the amount of relevant prior experience
CAH claims that expertise effects in memory
recall are also determined by the amount of
structure in the domain (and by active attunement
to that structure)
Better yet, LPSA proposes both processes and
representational structures

91
conclusions

What does this mean for theorizing about problem
solving?
As in LTWM for text comprehension, we propose
that in expert problem solving the current
context automatically and effortlessly retrieves
past knowledge, and adapts it to the current
situation.
This retrieval is specific to the domain of
expertise, and requires a long period of
practice. A short period will not do.
This retrieval is only possible in domains that
show constraints that the expert can use (attune to).

92
conclusions

GENERALITY: the fact that the same mechanism,
with the very same underlying assumptions, can be
used for language and problem solving is
interesting per se.
In LTWM, the retrieval structures for chess are
different from the ones proposed for text
comprehension.
In CAH, two AHs for two different tasks are
different too.
In LPSA, any space for any task is a vector
space.

93
Automatic Landing Technique Assessment using
Latent Problem Solving Analysis (LPSA)
94
The problem

There is currently no methodology to
automatically assess landing technique in a
commercial aircraft or a flight simulator.
Instructors are a significant cost for training
and evaluation of pilots, and the use of
instructors also incorporates a subjective
component that may vary from pilot to pilot.
The advantages of automatic landing technique
evaluation are many: (1) reduced cost of the
evaluation; (2) increased objectivity in the
evaluation; (3) decreased influence of the
instructor; (4) perfect test-retest reliability;
(5) it is always available and can be triggered
by the trainee at will; (6) the model can rate as
many landings as time allows, etc.

95
A solution: Latent Problem Solving Analysis (LPSA)

Latent Problem Solving Analysis (LPSA; Quesada,
Kintsch and Gomez, 2002) is based on Latent
Semantic Analysis (LSA; Landauer and Dumais,
1997). Instead of using word occurrence
statistics and huge samples of text, LPSA uses a
representative amount of activity in controlling
dynamic systems (actions or states).
Like words, states and actions appear in
particular contexts but not in others. Some
states and actions are interchangeable, being
functional synonyms. Given the right algorithms
and sufficient amounts of logged trials, a
problem space can be derived in a similar way as
semantic spaces are.
In this application of LPSA to landing technique
evaluation, we assume that an expert uses her
past knowledge to emit landing ratings by
comparing the current situation to the past ones,
and generates an expanded representation of the
environment by composing the past situations that
are most similar to the current one.
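
The idea of rating a landing by composing the most similar past landings can be sketched as a nearest-neighbor average. This is an illustration only: it assumes each landing is already a vector in the problem space and that the predicted rating is the unweighted mean of the ratings of the k most similar rated landings. The function name `predict_rating`, the vectors and the ratings below are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict_rating(landing_vector, rated_landings, k=5):
    """Unweighted mean rating of the k past landings most similar to the new one.

    rated_landings is a list of (vector, expert_rating) pairs.
    """
    if not rated_landings:
        raise ValueError("at least one rated landing is required")
    ranked = sorted(
        rated_landings,
        key=lambda item: cosine(landing_vector, item[0]),
        reverse=True,
    )
    neighbors = ranked[:k]
    return sum(rating for _, rating in neighbors) / len(neighbors)

# Hypothetical past landings with expert ratings on a 1 to 5 scale.
rated_landings = [
    ([1.0, 0.0], 4),
    ([0.9, 0.2], 5),
    ([0.0, 1.0], 1),
    ([0.1, 1.0], 2),
]

print(predict_rating([1.0, 0.1], rated_landings, k=2))  # 4.5
```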

97
Complex, dynamic tasks are intractable when
considered as a whole
98
Complex, dynamic tasks are intractable when
considered as a whole

We need to perform complexity reduction, in a
mostly automatic way
The triangulation technique
Dimensionality reduction (LPSA)

99
The triangulation technique
100
Complexity reduction (I): variable selection
using differently informed experts
101
Criteria used by the experts    Levels
Flare initiation altitude       Too high / Correct / Too low
Thrust reduction                Too fast / Correct / Too slow
Pitch angle                     All the way too high / Partly too high / Correct / Partly too low / All the way too low
Overall landing score           1 / 2 / 3 / 4 / 5
102
Complexity reduction (II): using SVD, the
problem space is a vector space

A state is a string of text consisting of the
values of each variable (the reduced information
expert's set) joined by underscores, to make it a
single token, like
time tag_vertical acceleration_Radio
altitude_Thrust
A landing is a collection of these states. The
variables were sampled ten times per second, and
the landing time was approximately 15 seconds, so
each landing contained about 150 states
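
The tokenization and the state count on this slide can be sketched as below. The function name `state_token` and the sample values are hypothetical; only the variable order (time tag, vertical acceleration, radio altitude, thrust), the sampling rate and the landing duration come from the slide.

```python
def state_token(time_tag, vertical_acceleration, radio_altitude, thrust):
    """Join the variable values with underscores so the state is a single token."""
    values = (time_tag, vertical_acceleration, radio_altitude, thrust)
    return "_".join(str(value) for value in values)

SAMPLES_PER_SECOND = 10  # variables sampled ten times per second
LANDING_SECONDS = 15     # approximate duration of a landing

states_per_landing = SAMPLES_PER_SECOND * LANDING_SECONDS

print(state_token(3, 1.2, 50, 60))  # 3_1.2_50_60
print(states_per_landing)           # 150
```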

105
Results
106
Results
107
Results

The two landing raters' agreement is not very
high; however, it is similar to the agreement of
other experts, such as Clinical Psychologists
(0.40), Stockbrokers (<0.32), Polygraphers (0.32)
and Livestock Judges (0.50). Their agreement is
lower than that reported for Weather
Forecasters (0.95), Pathologists (0.55), Auditors
(0.76) and Grain Inspectors (0.60) (Shanteau,
2001).
The correlation between the model and the
reduced information expert is about the same as
the correlation between the two humans (0.48 vs.
0.46). Note that the ceiling for the model is the
correlation between two humans doing the task: a
model that correlates with one human better than
the between-human correlation is under suspicion.
The correlation for the complete information
expert was .39, even though the model was not
trained to mimic him.

108
Results
109
Results

Note that the only criterion where the model
correlates with any of the experts more than they
correlate to each other is thrust reduction.
Thrust reduction seems to be a very difficult
feature to judge, since the agreement between
human experts is the lowest (0.27) and also it is
the one in which the reduced information expert
obtains the lowest test-retest reliability
(0.538, see Table 1 4 in page 119).
All the polychoric correlations between the
reduced information expert and the model were
significant (p < .002), as were the correlations
between the complete information expert and the
model.
The equivalent model without dimensionality
reduction (400 dimensions, 5 neighbors, no
weighting, no timestamp) produced correlation
values of 0.37, 0.08, 0.57 and 0.50 for the above
used criteria respectively.

110
Results no-constraints corpus
111
Conclusions

Previously, LPSA has proved to be a powerful
theory for modeling behavior in complex, dynamic
problem-solving tasks, and has been proposed as a
theory of expertise; see Quesada (unpublished).
However, this is the first time that LPSA is used
to develop technology that can be used in
industrial applications.
In previous work, we have presented an
experience-based approach to problem solving.
Problem solving is viewed as the extraction of
useful representations from a corpus of
situations. The creation of the representations
is a primarily bottom-up, unsupervised process.
It is proposed that the problem space can be
viewed as a vector space. People use their past
knowledge to perform complex, dynamic tasks by
comparing the current situation to past ones, and
generate an expanded representation of the
environment by composing the past situations that
are most similar to the current one. In complex
dynamic situations, this intuitive, pattern-based
system can have a very important role.

112
Conclusions

It is possible to construct systems that grade
landing technique automatically as well as
humans do, if we consider that the limit of
performance for such a model is the human-human
agreement. The human-human correlation was low
(0.46) but within the range of some other areas
reported (Shanteau, 2001). In a large-scale
application of the model (for a training and
evaluation department, for example), we can
imagine that 500 pilots need to be evaluated. In
that situation, only a small proportion of
randomly sampled landings (that can be kept from
previous sessions) must be evaluated by humans;
the rest is performed by the system. Since the
model has different landing criteria, it could
emit recommendations such as "In this landing,
you initiated the flare too high, and reduced the
thrust too late. Keep that in mind for the
next one."

113
Conclusions

A direct consequence of the availability of a
system like LPSA for the development of
psychological theory is that some experiments
that were prohibitive before could now be planned
within the budget. Since instructors are a scarce
resource, an experimenter may decide that she
cannot afford to run a particular, very promising
experiment, because of the expenses associated
with performance assessment. With an automatic
and reliable method to perform the evaluation,
more complex experiments could be feasible.