Title: Engineering Psychology PSY 378F
1Engineering Psychology PSY 378F
- University of Toronto
- Fall 2002
- L12 Memory and Training
2Outline Lecture 1
- Working Memory
- Short-Term vs. Long-Term Memory
- evidence
- Baddeley's Working Memory Model
- Evidence
- WM Codes and Modalities
3Outline Lecture 2
- Properties of Working Memory
- Duration, Capacity
- Items and Chunks
- Expertise
- RI and PI
- Running Memory Task
- Knowledge in the World
- <BREAK>
4Outline Lecture 3
- Long-Term Memory and Training
- Levels of Processing
- Skill Acquisition
- Training Methods
- Transfer of Training
- Negative Transfer
- <GO HOME!>
5STM versus LTM
- We all know that we can know something, and then later forget it
- Often this is labeled short-term vs. long-term memory (STM vs. LTM)
- What is the experimental evidence for the STM/LTM distinction?
- Results from serial position experiments
- We give people a list of 20 words, one at a time at a certain rate, say 1 every 2 s (Dog, Shovel, Run, Ripple, ...)
- We do this again and again, maybe 10 times
6Serial Position Curve
- Plot P(Recall) vs. serial position--position in the list
[Figure: serial position curve. P(Recall) plotted against serial position (1 to 20), with the primacy effect and recency effect labeled.]
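A minimal Python sketch of how such a curve could be computed, assuming each trial records the presented list and the set of words the participant recalled. The function name and the toy data are illustrative assumptions, not data from the lecture.

```python
from typing import List, Set


def serial_position_curve(lists: List[List[str]], recalled: List[Set[str]]) -> List[float]:
    """Return P(Recall) at each serial position, averaged over trials."""
    n_positions = len(lists[0])
    hits = [0] * n_positions
    for words, recall_set in zip(lists, recalled):
        for pos, word in enumerate(words):
            if word in recall_set:
                hits[pos] += 1
    return [h / len(lists) for h in hits]


# Hypothetical toy data: 2 trials of 4-word lists
# (the procedure on the slide uses 20-word lists and about 10 trials).
trials = [["dog", "shovel", "run", "ripple"], ["cat", "lamp", "walk", "stone"]]
recalls = [{"dog", "ripple"}, {"cat", "walk", "stone"}]
curve = serial_position_curve(trials, recalls)
print(curve)
```

Plotting `curve` against position 1..N would give the serial position curve; with real data the first and last positions would be expected to show the primacy and recency effects.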
7Effect of Interference
- Atkinson & Shiffrin varied this procedure in a number of different ways
- Participants perform an arithmetic task before recalling the list
- Recency effect reduced or eliminated (if time spent on arithmetic task increased)
- Early part of curve (including primacy effect) unaffected
[Figure: P(Recall) vs. serial position, with the recency effect labeled.]
8Effect of Presentation Rate
- At two different presentation rates (1 s and 2 s per item)--affects primacy only
- Different manipulations affect different parts of the curve
- Suggests different mechanisms produce different parts of the curve
- Justifies distinction between STS and LTS
[Figure: P(Recall) vs. serial position at 1 s and 2 s presentation rates, with the primacy effect labeled; a second panel labels parts of the curve STS and LTS.]
9The Modal Model
- Two stores: short-term (STS, STM, working memory)
- and long-term (LTS or LTM)
10Working Memory
- Baddeley
- Working memory is a version of activity in STS
- Differs from STS in two ways
- 1) has a more functional emphasis--what is STM for?
- 2) different components in STM
11Working Memory
- Three components to working memory
- visuo-spatial sketchpad--maintenance and activity in visual-spatial domain (e.g., imagery such as mental rotation)
- central executive--contains resources that can be siphoned off to WM subsystems
- phonological store--verbal rehearsal--language based short-term storage and rehearsal
[Figure: working memory model with Visuospatial Sketchpad, Central Executive, Articulatory Loop, and Phonological Store.]
12Working Memory
- What is evidence for 3 components?
- Interference occurs when two tasks (or task components) draw upon same working memory (WM) subsystem
- Performance degrades relative to situation where different WM subsystems involved
13Working Memory
[Figure: tasks mapped onto the Visuospatial Sketchpad, Central Executive, and Phonological Store. Tasks drawing on different subsystems show good time sharing; tasks drawing on the same subsystem show interference.]
14Working Memory
- An example is the experiment by Brooks (1968)
[Figure: 2 x 2 design crossing Task (Verbal, Spatial) with Response Method (Verbal, Spatial).]
15Brooks Experiment
16Brooks Experiment Results
- Spatial task (Big F task) better performed with verbal responses
- Verbal task (quick brown fox: nouns and verbs) better performed with spatial responses
- Why? Task and response method draw upon different WM components in these cases
17Implications
- Implications
- The two subsystems of working memory are functionally independent--susceptible to interference from different types of activities
- Tasks should be designed such that disruption does not occur
18Implications (contd)
- Tasks that impose high loads on the visuo-spatial system--the sketch pad (e.g., air traffic control)--should not be performed concurrently with other tasks that will also use this system
- Use the auditory-phonetic system (the phonological store/articulatory loop) instead
- On the other hand,
- Tasks involving heavy demands on the auditory-phonetic system (e.g., editing text, computing numbers) will be more disrupted by concurrent voice input/output than by visual-manual interaction (control with a mouse)
19Kinesthetic WM
- Also evidence for kinesthetic working memory
- Separate from visuospatial (Woodin & Heil, 1996)
- Used experienced rowers as participants
- Tapping own body interfered with memory for rowing positions, but not positions in 4 x 4 matrix
- Implications for sports training and performance
[Figure: working memory model with Visuospatial Sketchpad, Kinesthetic Output Component, Central Executive, and Phonological Store.]
20Codes and Modalities
- Is there an optimum matching between stimulus modality and working memory codes? Yes
- Although it is possible to employ auditory-spatial displays for spatial tasks, they are usually less effective than visual displays.
- Auditory modality less effective at processing spatial information
- Tasks that demand verbal working memory better served by speech displays than by print (if not much verbal information to communicate).
- Auditory modality more effective at processing language information
21Codes and Modalities
22Longer Communication
- With longer messages, both auditory and visual channels are likely to show failures of memory
- But with print, can physically prolong the message--makes it more effective for long messages
- Might want to code redundantly (use both auditory and visual displays) if it does not cause too much interference with other tasks
23Quiz
24Break
25Duration of Working Memory
- Brown-Peterson paradigm
- Participant presented with auditory sequence of letters
- Try to remember them while performing interfering task (counting backwards by 3s)
26Duration of Working Memory
27Duration of Spatial WM
- Loftus et al. (1979)
- Subjects tried to remember navigational info. similar to that delivered by air-traffic controllers.
- Moray (1986)
- Subjects were radar controllers trying to recall info. that had been displayed on a radar scope.
- Both researchers found same types of forgetting functions
- Essentially can clear out spatial WM in 18 s or so
- So, transience also applicable to spatial WM
- Transience occurs both in visuospatial sketch pad AND in phonological store
28Duration Affected by Number of Items
- Curves a and c represent 1 and 5 item (letter) sequences
- Faster decay observed with more items
- limiting case--curve d, memory span--7 items
29WM Explains Word Length Effect
- Component of phonological store is articulatory loop
- With more items to be rehearsed, there will be a longer delay between successive rehearsals of each item
- In fact, the length of items--how long the items take to say--decreases the capacity of working memory--so speed of rehearsal makes a difference
30But What is an Item?
- We talked about an item being a letter
- In absolute judgment task, items were things like different line lengths
- Couldn't a word be an item?
- Let's try Brown-Peterson task again--with words this time
- DOG CAT BOY
- Pack in more information--three three-letter words contain nine letters
- Now we have 3X as much information being held
31Chunking Revisited
- Miller addressed this question by proposing the concept of the chunk
- chunking is grouping together items based on their meaning
- and so a chunk is that group
- e.g., b-i-f could be reorganized to FBI--now we have a chunk
- working memory capacity is 7 plus or minus 2 chunks of information
- chunk can be letter, word, sentence
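A minimal Python sketch of the idea, assuming long-term memory can be stood in for by a set of familiar strings and that chunking works by greedy longest match. Both the `chunk` function and the vocabulary are illustrative assumptions, not a model from the lecture.

```python
from typing import List, Set


def chunk(letters: str, known_chunks: Set[str]) -> List[str]:
    """Group letters into the longest familiar chunks; unfamiliar letters stay single items."""
    max_len = max((len(c) for c in known_chunks), default=1)
    chunks: List[str] = []
    i = 0
    while i < len(letters):
        # Try the longest candidate first, falling back to a single letter.
        for size in range(min(max_len, len(letters) - i), 0, -1):
            piece = letters[i:i + size]
            if size == 1 or piece in known_chunks:
                chunks.append(piece)
                i += size
                break
    return chunks


ltm = {"FBI", "DOG"}  # hypothetical familiar patterns held in LTM
items = chunk("FBIDOGXQ", ltm)
print(items, len(items))
```

Eight letters become four items to hold in working memory, which is the sense in which chunking stretches the 7 plus or minus 2 limit.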
32Chunking Revisited
- Components of a chunk need to be semantically tied together, typically through association in LTM
- chunking can occur at higher levels as well--e.g., sentences
- London is the largest city in England (7 words)--but maybe could associate the words together into a meaningful whole--into a superchunk
- New York is the largest city in the United States
- Toronto is the largest city in Canada
33Chunking Revisited
- Should avoid having people perform tasks requiring working at 7 plus or minus 2 limit
- One of the best ways to avoid capacity and decay limitations of working memory is to facilitate chunking whenever possible
- People with large working memory capacities typically have system for chunking numbers or letters so that they are meaningful (e.g., dates or ages), or by combining them hierarchically to form superchunks
- Ss with normal memory spans can get up to 80 digits or so, using various chunking techniques
- Expertise plays a role here--long-term WM (Ericsson & Kintsch)--info in LTWM is stable, but accessed through temporarily active retrieval cues in WM
34Chess
- Analogous to memory for chess position by masters and novices (Chase and Simon)
- If board position was taken from the progression of a reasonable game, experts recalled better than novices
- If board position random, no difference between the two groups
35Pilot Communication
- Barnett (1989) found similar results with novice and expert pilots for communication exchange
- When exchanges flowed in the normal sequence, experts performed better, but no difference if exchanges in random sequence.
- Chunking--resulting in improved memory capacity--is a byproduct of training
36English Experts
- We're all fluent in English--all highly trained experts
- Designers should be able to capitalize on language familiarity
- Coding--codes can be developed so as to facilitate chunking
- license plate codes--vanity plates are more memorable, e.g., FUN2GO
- commercial phone numbers (967-1111)
- radio station codes (CHUM-FM, The Edge)
37RI and PI
- Information can be lost from working memory through active interference from other information
- Retroactive Interference--material learned after material to be recalled (MTBR) affects recall of MTBR
- Proactive Interference--material learned before MTBR affects recall of MTBR
38RI and PI
- Not just a laboratory phenomenon
- Loftus study with air-traffic controllers
- At least 10 s delay necessary before material
remembered in previous exchange did not disrupt
memory for a subsequent exchange
39Running Memory Task
- In the running memory task, a sequence of items
(e.g., letters, numbers) is presented to the
operator, and the operator has to identify the
item K items ago
40Running Memory Task
- Operator does not know how long the string is.
- Operator is not expected to remember the entire string.
- As each item comes in, the operator is expected to do something with it (categorize it, check its value, etc.)
- So, if operator asked to recall last few items, typically can't remember much more than n-2, where n is the most recent item (performance falls off rapidly if K > 2)
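A minimal Python sketch of the task, assuming the operator's memory can be approximated as a fixed-size buffer holding items n, n-1, and n-2. The `span` parameter and the hard cutoff are simplifying assumptions for illustration; real performance falls off gradually rather than all at once.

```python
from collections import deque
from typing import Iterable, Optional


def item_k_back(stream: Iterable[str], k: int, span: int = 3) -> Optional[str]:
    """Return the item presented k items before the most recent one.

    Only the last `span` items are retained, so anything older is lost (None).
    """
    buffer = deque(maxlen=span)  # older items drop out as new ones arrive
    for item in stream:
        buffer.append(item)
    if 0 <= k < len(buffer):
        return buffer[-1 - k]
    return None


letters = "ABCDEFG"  # hypothetical stream; the operator does not know its length
print(item_k_back(letters, 0), item_k_back(letters, 2), item_k_back(letters, 3))
```

With the assumed span of 3, K = 0 through K = 2 are retrievable and K = 3 is not, mirroring the n-2 limit described above.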
41Yntema (1963) results
- Results
- 1) Ss performed better with a few objects, many attributes than with many objects, few attributes--integration/chunking effect
- 2) Perf. better if each attribute has its own scale
42Yntema (1963) Recommendations
- From Result 1: Assign each operator to monitor all attributes of a few objects
- From Result 2: Don't code spatial variables with same units--e.g., distance from flight tower (feet) and altitude (feet).
- In addition to air-traffic control, results may be applicable to other domains where information isn't continuously shown (e.g., taxicab dispatcher)
43Putting Memory in the World
- Knowledge is not all in the head--it is partially in the world, and in the constraints of the world (Norman, Design of Everyday Things)
- Result: precise behavior can result from imprecise knowledge for four reasons
- 1) Information is in the world
- 2) Great precision (of knowledge) not required
- 3) Natural constraints are present
- 4) Cultural constraints are present
441) Information is in the World
45Information is in the World
- The info you code in memory need only be precise enough to sustain the quality of behavior you desire
- Whenever information needed to do a task is readily available, the need for us to learn it diminishes
- Examples
- penny
- hunt-and-peck typists
- "I can take you there, but I can't tell you how to get there"
462) Great Precision (of Knowledge) not Required
- Don't need all information in head
- Can distinguish quarter from nickel, although may not be able to tell you who is on each coin, or the words on the coins
- But if you make more precise memory necessary you will have a problem
47Great Precision (of Knowledge) not Required
- US: Susan B. Anthony one-dollar coin--confusable with quarter
- Britain: one-pound coin--confusable with five pence piece
- France: 10-franc coin--confusable with half-franc coin
- Descriptions formed to distinguish among the old coins were not precise enough to distinguish between the new one and one of the old ones
48My Red Notebook
- I buy a small red notebook
- Call it "my notebook"
- Then get another notebook--a blue one
- Call first notebook "my red notebook"
- Then get a large red notebook
- Call first notebook "my small red notebook"
- Point is that my mental representation need only discriminate among the choices in front of me
- But add another choice and I have to change my representation
493) Natural Constraints Are Present
- Often an object's physical features limit how it can be used
- Can't use a shovel to brush teeth
504) Cultural Constraints Are Present
- Society has evolved many conventions that govern acceptable social behavior
- This lets us know what to do in unfamiliar circumstances
- What is appropriate behavior at a party, or in a restaurant
- What is the sequence of events in a restaurant?
- If we have to wait for something to happen (like the waitress to come and take our order), some of us get fidgety
51Tradeoff between Knowledge in the World and in the Head
- We need both knowledge in the world and in the head
- But in certain situations we choose to rely more on one than the other
- Gaining the advantages of knowledge in the world means losing the advantages of knowledge in the head.
52Tradeoff Examples
- Providing a visual echo of a message a pilot receives from air-traffic control
- or a continuous record of location in a hierarchical computer database
- or providing CDTIs (cockpit displays of traffic info.)
- All these examples are putting information in the world
- But this causes visual clutter, might disrupt performance of pilot or user
- With CDTIs, may increase the visual workload--is the increase worth the benefits?
- Memory aids (information in the world) are a mixed blessing
53Tradeoff between Knowledge in the World and in the Head
From Norman (1992), Design of Everyday Things
54Break
55Quiz 2
56Part 3
- Score each recalled word as Case, Rhyme, Semantic for Y and N answers separately
- Count up the number in each category
[Table: tally columns for Case, Rhyme, and Semantic.]
57Levels of Processing
- More deeply you process something, the better the chance that you will remember it
- That is, that you will transfer the info to LTS from STS
- Deeper approx. equal to more meaningful
- Process view of memory
[Figure: P(Recall) plotted against level of processing (Case, Rhyme, Semantic).]
58Levels of Processing Another Take
- Norman's taxonomy of memory
- 1) Memory for arbitrary things
- 2) Memory for meaningful relationships
- 3) Memory thru explanation
[Figure: P(Recall) plotted against level of processing (Arbitrary, Relationship, Explanation).]
591) Memory for Arbitrary Things
- Items to be remembered are arbitrary
- No particular relationship to each other or to anything else
- Storage of arbitrary codes
- Requires rote learning--like learning the alphabet
- Rote learning creates problems
- It is difficult, can take considerable time and effort
- When problems arise, memorized sequence gives no hint as to what has gone wrong
- No suggestion of what you might do to fix the problem
602) Memory for Meaningful Relationships
- Can relate what we learn to knowledge that we already have
- New material can be understood, interpreted, integrated, with previously acquired material
- e.g., Mr Tanaka's L/R turn signals on handlebars
- Now much easier to interpret and remember
- Although doesn't really explain anything
- Can't be used for future prediction
613) Memory Thru Explanation
- Material can be derived from some explanatory mechanism
- Mental models can play a role here
- Details can be derived when needed, such as in unexpected situations
- People often make up mental models for many things that they do
- This is why designers should provide users with appropriate models
- When not supplied, people will make them up (e.g., impetus model)
- Power of a mental model is that it allows you to predict
62Long-Term Memory and Training
- The HF practitioner is often faced with the problem of developing the most efficient training program--the greatest level of proficiency per dollar invested.
- Different forms of training are necessary for mastery of declarative knowledge vs. procedural knowledge
63Declarative vs. Procedural Knowledge
- Declarative Knowledge--facts about a domain, we can verbalize these, or write them down (e.g., knowledge in typical university course)
- better off with study and rehearsal
- Levels of Processing (both kinds) important here
- Procedural Knowledge--how to do something, often not easily verbalized (e.g., riding a bike, driving a car, using a lathe)
- Tell someone everything you know about riding a bike, but it won't help that much
- better off with practice and performance
- We're going to focus on training the second kind of knowledge (procedural)
64Skill Acquisition
- "Practice makes perfect"
- Most skills continue to improve for weeks, months, even years!
- Can obtain errorless performance in many tasks quite quickly
- But two other performance measures continue to improve: speed (RT), attention or resource demand (as measured by performing a concurrent task)
65Still Improving After Millionth Cigar
663 Stages of Skill Acquisition
- Anderson (1982)
- i) cognitive stage
- learner often works from instructions, or an example
- learner rehearses instructions, e.g., driving std: press clutch down first
- ii) associative stage
- go from declarative rep. to procedural rep.
- Performance becomes more fluid and error free
- Verbalization goes
- iii) autonomous stage
- skill becomes more automated and rapid--less conscious
- person loses ability to verbally describe the skill
- performance overlearned
67Production Rules
- Anderson talks about production rules (IF-THEN)
- e.g., IF high RPM and in first gear, THEN switch to second gear
- Says they are the key structure unifying course of skill acquisition
- Development of skill in associative stage can be decomposed into many component production rules
- Motor program is THEN part of production rule; its learning in the autonomous stage is the fine-tuning of the production rule.
- To get automaticity, stimuli or rules must be consistently mapped to a response
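A minimal Python sketch of an IF-THEN production rule, assuming a rule can be represented as a condition on the current state plus an action label. The `ProductionRule` class, the `fire` helper, and the 3000 RPM threshold are illustrative assumptions, not Anderson's formalism.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ProductionRule:
    name: str
    condition: Callable[[Dict[str, int]], bool]  # the IF part
    action: str                                  # the THEN part


def fire(rules: List[ProductionRule], state: Dict[str, int]) -> Optional[str]:
    """Return the action of the first rule whose condition matches, else None."""
    for rule in rules:
        if rule.condition(state):
            return rule.action
    return None


HIGH_RPM = 3000  # hypothetical threshold for "high RPM"
rules = [
    ProductionRule(
        name="upshift-from-first",
        condition=lambda s: s["rpm"] > HIGH_RPM and s["gear"] == 1,
        action="switch to second gear",
    )
]
print(fire(rules, {"rpm": 3500, "gear": 1}))
```

Because the same state always yields the same action, the rule is consistently mapped, which is the condition given above for developing automaticity.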
68Guided Training
- "Practice makes permanent"
- Training that allows errors to be made trial after trial will become detrimental, b/c errors become learned
- Guided training ensures that learner's performance never strays far from what task requires
69Training Wheels
- Error prevention often accomplished by guided training such as the training wheels idea developed by Carroll.
- With training wheels, users are prevented from straying off the beaten path--making typical mistakes that can result in wasted time
- Instead of allowing the error to affect the system, training wheels simply informs the user/learner of the nature of error, and allows the user to continue on
- Good evidence to support this approach
70Augmented Feedback
- Error prevention can also be accomplished by using augmented feedback techniques
- For learning to fly, such feedback might paint an ideal flight path through the sky to the runway
- Learner tracks path to achieve proper landing approach--ingrains the correct sequence of responding
- Does help to produce rapid learning of the skill
71Problem for Guided Training
- What's the problem with training wheels/augmented feedback?
- Problem--it often leads to poor transfer in a more realistic environment.
- Sometimes making errors leads to learning
- Need happy medium--eliminate sources of error that change the task or waste training time
- But keep those sources of error intrinsic to task
72Adaptive Training
- Imagine learning to play piano
- Some component of the task made simpler to reduce the initial level of difficulty.
- Then, as training proceeds, this component gradually increases in difficulty until level of target task is reached.
73Evaluation of Adaptive Training
- Reviews mixed on this technique
- Simplification does make it easier to perform the consistent elements of the task
- However, the easy versions of the task may induce a response strategy incompatible with one necessary to perform the final task
- Time stress is effective in adaptive training, however
- increase the time stress (speed at which events occur) as approach the final task
74Part-Task Training
- Elements of complex task learned separately
- Wightman & Lintern (1985)--distinguished between two different forms
- segmentation and fractionization
75Segmentation
- Segmentation defines situation where different sequential phases of the skill are practiced before being integrated
- e.g., train up on difficult passage alone, then play easy passage once, then play them together
- research shows this is useful--not wasting time on easy stuff--efficient
76Fractionization
- Practice on components of a task separately (e.g., LH, RH on piano) that you eventually perform concurrently
- Merits less clear cut--prevents development of time-sharing skills--may be necessary to link and co-ordinate the two activities
- If you are very careful in selecting components of tasks that can be easily broken off, and what must be practiced together, fractionization training can be effective
77Varied Priority Training
- (Gopher, Weil, Siegel)--effective
- Perform everything together, but attend to one component and de-emphasize any others
- Integrality of task not destroyed, and yet since only small attention is paid to the lower priority component, it does not distract from the main component
78Transfer of Training
- Can learning a new skill, or a skill in a new environment, capitalize on what has been learned before?
- e.g.,
- Learning Excel then Access
- Training in a flight simulator before training in a plane
- Training course before on-the-job training
79Transfer of Training
- How do we measure it?
- Control group took 10 hr. to reach criterion
- Transfer group took 8 hr. to reach criterion
- Savings = ctrl time - transfer time
- = 10 - 8 = 2 hours
- % Transfer = savings / control time
- = 2/10 = 20%
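The arithmetic on this slide can be sketched in Python as follows. The function name `percent_transfer` is an illustrative assumption; the numbers are the ones given above.

```python
def percent_transfer(control_hours: float, transfer_hours: float) -> float:
    """Savings as a percentage of the control group's time to criterion."""
    savings = control_hours - transfer_hours
    return 100.0 * savings / control_hours


# Control group: 10 hr to criterion; transfer group: 8 hr to criterion.
pct = percent_transfer(10, 8)
print(pct)
```

A value of 0 would mean the prior training saved no time on the target task, and a negative value would mean it slowed learning down.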
80Transfer Effectiveness Ratio (TER)
- But wait a minute
- Control group spent 10 hours training
- Transfer group spent 12 hours training: 4 hours in simulator, 8 hours in real task
- The Transfer Effectiveness Ratio (TER) expresses this relative efficiency
- TER = savings / training period
- = 2/4 = 0.50
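A short Python sketch of the TER calculation, using the numbers from this slide. The function and parameter names are illustrative assumptions.

```python
def transfer_effectiveness_ratio(
    control_hours: float, transfer_task_hours: float, training_hours: float
) -> float:
    """TER = (time saved on the real task) / (time spent in the training program)."""
    savings = control_hours - transfer_task_hours
    return savings / training_hours


# Control: 10 hr on the real task. Transfer group: 4 hr simulator + 8 hr real task.
ter = transfer_effectiveness_ratio(control_hours=10, transfer_task_hours=8, training_hours=4)
print(ter)
```

Here each hour in the simulator saved half an hour on the real task.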
82Transfer Effectiveness Ratio (TER)
- If TER > 1, training for transfer group more efficient than for ctrl grp
- If TER < 1, opp. is true
- If TER < 0, your training program is worthless
- If 0 < TER < 1, this does not mean training is worthless, for two reasons
- 1) training program may be safer
- 2) may be less expensive
83Training Cost Ratio (TCR)
- Training Cost Ratio (TCR) reflects the cost component
- TCR = training cost in task environment per unit time / training cost in training program per unit time
- Cheaper the training device, the lower your allowable TER can be (everything else held constant)
- If TER × TCR > 1, program is cost effective, otherwise not
- Even if program not cost effective, important to consider safety issues
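A minimal Python sketch of the cost-effectiveness check. The hourly costs are hypothetical values chosen for illustration, and the function names are assumptions; only the TER × TCR > 1 criterion comes from the slide.

```python
def training_cost_ratio(task_cost_per_hour: float, program_cost_per_hour: float) -> float:
    """TCR = cost per unit time in the task environment / cost per unit time in the program."""
    return task_cost_per_hour / program_cost_per_hour


def is_cost_effective(ter: float, tcr: float) -> bool:
    """Program is cost effective when TER x TCR exceeds 1."""
    return ter * tcr > 1


# Hypothetical costs: $500/hr in the aircraft, $100/hr in the simulator.
tcr = training_cost_ratio(500.0, 100.0)
print(tcr, is_cost_effective(0.5, tcr))
```

With these assumed costs, a TER of only 0.5 is still cost effective, which illustrates why a cheaper training device can tolerate a lower TER.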
84Diminishing Returns
- There is a diminishing effectiveness of most training devices (as measured by the TER) with increased training time
- i.e., TERs decrease with time in training
- Amount of training at which TER × TCR = 1 is point beyond which the training program is no longer cost effective
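A minimal Python sketch of finding that break-even point, assuming the TER has been estimated separately for each successive hour of training. The per-hour TER values and the TCR of 3 are hypothetical, chosen only to show the declining pattern.

```python
from typing import List


def cost_effective_hours(ter_by_hour: List[float], tcr: float) -> int:
    """Count leading training hours for which TER x TCR stays at or above 1."""
    hours = 0
    for ter in ter_by_hour:
        if ter * tcr < 1:
            break  # further training in the device no longer pays for itself
        hours += 1
    return hours


ters = [1.2, 0.8, 0.5, 0.3, 0.1]  # hypothetical TER for hours 1..5
print(cost_effective_hours(ters, tcr=3.0))
```

Under these assumed values the fourth hour falls below break-even, so training in the device would stop after three hours on cost grounds alone.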
85Picking the App to Train
- Large TCR indicates potential for simulation training
- Examples--importance of relative cost of training program vs. training in environment
- Ship navigation & handling
86Types of Transfer
- +ve transfer--training program and target task are highly similar
- 0 transfer--extreme differences between program and task
- -ve transfer--similar in some respects, different in others, leading to improper expectations
87Training System Fidelity
- Should training simulators resemble the real world as much as possible? NO.
- Why?
- 1) realistic simulators are expensive--added realism may add little to TER, but affects TCR
- e.g., plants in office situation
- 2) if similarity does not achieve complete identity, may lead to negative transfer
- e.g., motion does not help in flight simulators--isn't realistic
- 3) if high realism leads to high task complexity, may divert attention from critical skill to be learned
- hard to learn to drive a manual transmission in big city traffic
88Capture Important Task Components
- Instead of total fidelity, need to understand which components of the target task should be preserved in the simulator, in the training situation.
- e.g., sequence of steps that user has to perform
89Gibson's Invariants in Simulators
- Evidence for usefulness of including perceptual invariants
- e.g., global optical flow in flight simulator
- Optical flow in driving simulator--heading of vehicle relative to vanishing point
- Something that those designing virtual reality systems should remember
- Sense of immersion does not require extremely high fidelity--the task related invariants are what is necessary
90Negative Transfer
- When two situations have similar stimulus elements but different response or strategic components, transfer will be negative
- Reverse position of gears for stickshift
- Displays, sound of engine--these characteristics will remain the same
- This is especially true if the new and old response are opposites
91Negative Transfer
[Table: 2 x 2 matrix crossing Stimulus Elements (Same, Different) with Response Elements (Same, Different); one cell is marked 0 transfer.]
92Negative Transfer
- Negative transfer can be a concern for an operator who has to switch between two systems
- e.g.,
- Truck driver with two different gear arrangements
- switching from Microsoft Word to a DOS application (TurboC)--grabbing for the mouse
- Macintosh, Windows consistent interface standard
- number of aircraft a pilot can fly without going through special training
- Two systems can be different in their display characteristics but can involve positive transfer
- If there is identity in response elements--e.g., same ctrl movements for driving, different dashboard displays--get high transfer
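The transfer predictions stated on the last several slides can be summarized in a small Python sketch. Treating stimulus and response similarity as simple same/different flags is a simplifying assumption, and the function name is illustrative.

```python
def predicted_transfer(same_stimulus: bool, same_response: bool) -> str:
    """Predict the direction of transfer from stimulus/response similarity."""
    if same_response:
        # Same responses transfer positively, even with different displays.
        return "positive"
    if same_stimulus:
        # Familiar stimuli that now call for a different response.
        return "negative"
    # Extreme differences between the two situations.
    return "zero"


# Same control movements, different dashboard displays.
print(predicted_transfer(same_stimulus=False, same_response=True))
# Same displays and engine sound, reversed gear positions.
print(predicted_transfer(same_stimulus=True, same_response=False))
```

The sketch covers only the four corner cases; real tasks vary in degree of similarity, and negative transfer is described above as strongest when the new and old responses are opposites.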
93End