Dynamics of Reward Bias Effects in Perceptual Decision Making

About This Presentation
Title:

Dynamics of Reward Bias Effects in Perceptual Decision Making

Description:

Dynamics of Reward Bias Effects in Perceptual Decision Making. Jay McClelland & Juan Gao. Building on: Newsome and Rorie; Holmes and Feng; Usher and McClelland.

Slides: 23
Provided by: psychSta

Transcript and Presenter's Notes



1
Dynamics of Reward Bias Effects in Perceptual
Decision Making
  • Jay McClelland & Juan Gao
  • Building on:
  • Newsome and Rorie
  • Holmes and Feng
  • Usher and McClelland

2
Our Questions
  • Can we trace the effect of reward bias on
    decision making over time?
  • Can we determine what would be the optimal
    policy, and what constraints there are on this
    policy?
  • Can we determine how well participants do at
    achieving optimality?
  • Can we uncover the processing mechanisms that
    lead to the observed patterns of behavior?

3
Overview
  • Experiment
  • Results
  • Optimality analysis
  • Abstract dynamical model
  • Mechanistic dynamical model

4
Human Experiment Examining Reward Bias Effect at
Different Time Points after Target Onset
  • Stimuli are rectangles shifted 1, 3, or 5 pixels left
    or right of fixation.
  • Reward cue occurs 750 msec before stimulus.
  • Small arrow head visible for 250 msec.
  • Only biased reward conditions (2 vs 1 and 1 vs 2)
    are considered.
  • Response signal occurs at one of these times (msec)
    after stimulus onset:
  • 0, 75, 150, 225, 300, 450, 600, 900, 1200, 2000
  • Participant receives reward (one or two points)
    if response occurs within 250 msec of response
    signal and is correct.
  • Participants were run for 15-25 sessions to
    provide stable data.
  • Data shown are from later sets of sessions in
    which the biasing effect of reward appeared to be
    fairly stable.

5
A participant with very little reward bias
  • Top panel shows the probability of the response giving
    the larger reward as a function of actual response
    time, for combinations of:
  • Stimulus shift (1, 3, or 5 pixels)
  • Reward-stimulus compatibility
  • Lower panel shows the data transformed to z scores,
    corresponding to the theoretical construct
    [mean(x1(t) - x2(t)) + bias(t)] / sd(x1(t) - x2(t))
  • where x1 is the state of the accumulator
    associated with the greater reward, x2 the same for
    the lesser reward, and the subject (S) is thought to
    choose the larger reward if x1(t) - x2(t) + bias(t) > 0.
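
The z-score transform in the lower panel can be sketched in a few lines. This is a minimal illustration, not the presenters' analysis code: the function names and the optional clipping of extreme proportions (controlled by the hypothetical `n_trials` argument) are assumptions added here so that proportions of exactly 0 or 1 stay finite.

```python
from statistics import NormalDist


def choice_prob_to_z(p, n_trials=None):
    """Map a choice probability to a z score via the inverse standard normal CDF.

    n_trials is a hypothetical argument (not from the slides): when given,
    p is clipped to [0.5/n, 1 - 0.5/n] so that 0 and 1 remain finite.
    """
    if n_trials is not None:
        floor = 0.5 / n_trials
        p = min(max(p, floor), 1.0 - floor)
    return NormalDist().inv_cdf(p)


def z_to_choice_prob(z):
    """Inverse mapping: z score back to a choice probability."""
    return NormalDist().cdf(z)


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.75, 0.9):
        print(f"P(larger reward) = {p:.2f}  ->  z = {choice_prob_to_z(p):+.3f}")
```

Under the construct on this slide, a z score of 0 means the combined quantity (mean difference plus bias) is zero, and positive values favor the larger-reward response.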

6
Participants Showing Reward Bias
7
(No Transcript)
8
Abstract optimality analysis
9
Assumptions
  • At a given time: two distributions with means mu and
    -mu and the same STD sigma. Choice: is x > or < X_c?
  • For three difficulty levels: same STD sigma,
    means mu_i (i = 1, 2, 3), same X_c.
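
Under these assumptions, the reward-maximizing criterion for a single difficulty level has a closed form. The sketch below assumes equal prior probability for the two stimulus directions and a rule that picks the "positive" alternative when x > X_c; both are assumptions made here, as are all function names. For several difficulty levels sharing one criterion it falls back on a simple grid search.

```python
import math
from statistics import NormalDist


def optimal_criterion(mu, sigma, r_pos, r_neg):
    """Closed-form X_c for one difficulty level (equal priors assumed).

    Solves r_pos * f(x; +mu) = r_neg * f(x; -mu) for x.
    """
    return -(sigma ** 2) / (2.0 * mu) * math.log(r_pos / r_neg)


def expected_reward(x_c, mu, sigma, r_pos, r_neg):
    """Expected reward per trial when choosing 'positive' whenever x > x_c."""
    p_correct_pos = 1.0 - NormalDist(mu, sigma).cdf(x_c)
    p_correct_neg = NormalDist(-mu, sigma).cdf(x_c)
    return 0.5 * (r_pos * p_correct_pos + r_neg * p_correct_neg)


def best_shared_criterion(mus, sigma, r_pos, r_neg, step=0.001, span=5.0):
    """Grid search for one criterion shared across intermixed difficulty levels."""
    n = int(round(2.0 * span / step))
    grid = [-span + i * step for i in range(n + 1)]

    def total(x_c):
        return sum(expected_reward(x_c, m, sigma, r_pos, r_neg) for m in mus)

    return max(grid, key=total)


if __name__ == "__main__":
    # Illustrative values only: reward 2 vs 1, unit noise.
    print(optimal_criterion(1.0, 1.0, 2.0, 1.0))
    print(best_shared_criterion([0.5, 1.0, 2.0], 1.0, 2.0, 1.0))
```

With the larger reward on the positive side, the criterion shifts toward negative x, so more of the ambiguous evidence is resolved in favor of the larger reward; the shift shrinks as mu grows.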

10
Only one difficulty level
Three difficulty levels
Subject's sensitivity, as defined in the theory of
signal detectability
When response signal delay varies
For each subject, fit with function
11
Subject Sensitivity
12
(No Transcript)
13
Real bias
Optimal bias
14
(No Transcript)
15
Dynamical analysis
  • Based on one dimensional leaky integrator model.
  • Initial condition: x = 0.
  • Choose left if x > 0 when the response signal is
    detected; otherwise choose right.
  • Accuracy approximates exponential approach to
    asymptote because of leakage.
  • How is the reward implemented?
  • A time-varying offset that optimizes reward?
  • Offset in initial conditions?
  • An additional term in the input to the decision
    variable?
  • A fixed offset in the value of the decision
    variable?
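
The claim that accuracy approaches an asymptote because of leakage can be checked against the standard closed-form mean and variance of a linear leaky integrator with Gaussian noise. The sketch below assumes the form dx = (lambda*x + drift) dt + s dW with lambda < 0; that specific form, the parameter values, and the function names are assumptions for illustration, not the presenters' code.

```python
import math
from statistics import NormalDist


def leaky_mean_sd(t, lam, drift, s, x0=0.0):
    """Mean and SD of x(t) for dx = (lam*x + drift) dt + s dW, with lam != 0, t > 0."""
    decay = math.exp(lam * t)
    mean = x0 * decay + drift * (decay - 1.0) / lam
    var = s ** 2 * (decay ** 2 - 1.0) / (2.0 * lam)
    return mean, math.sqrt(var)


def accuracy(t, lam, drift, s):
    """P(x(t) > 0) for an unbiased start at x = 0 and positive drift."""
    mean, sd = leaky_mean_sd(t, lam, drift, s)
    return NormalDist().cdf(mean / sd)


if __name__ == "__main__":
    # Illustrative parameters only: leak lam = -2 per second, drift 1, noise 1.
    for t in (0.075, 0.15, 0.3, 0.6, 1.2, 2.0):
        print(f"t = {t:5.3f} s  accuracy = {accuracy(t, -2.0, 1.0, 1.0):.3f}")
```

With leakage, both the mean and the variance saturate, so accuracy levels off instead of continuing to grow with integration time.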

16
1. Time-varying term that optimizes rewards (No
free parameter for reward bias)
  • Notes
  • Equivalent to a time-varying criterion -b(t).
  • There is a dip at
  • Prediction and test: higher C level → earlier
    dip.
  • For multiple C levels, no analytical expressions.

17
2. Offset in initial conditions
  • Notes
  • Effect of the bias decays away for lambda < 0.
  • Single C level: a dip at
  • Prediction and test: higher C level → earlier dip
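
The decay of an initial-condition offset can be illustrated with the same linear leaky integrator. The sketch assumes a fixed starting offset x0 toward the larger-reward side and a negative leak parameter lambda; the parameter values and function names are hypothetical, and the code does not attempt to reproduce the dip expression, which is missing from this transcript.

```python
import math


def offset_contribution(t, lam, x0):
    """Contribution of the starting offset x0 to the mean of x(t)."""
    return x0 * math.exp(lam * t)


def normalized_offset(t, lam, x0, s):
    """Offset contribution divided by the SD of x(t); requires t > 0, lam != 0."""
    decay = math.exp(lam * t)
    sd = s * math.sqrt((decay ** 2 - 1.0) / (2.0 * lam))
    return x0 * decay / sd


if __name__ == "__main__":
    # Illustrative parameters only.
    for t in (0.1, 0.3, 1.0, 2.0):
        print(f"t = {t:3.1f} s  offset = {offset_contribution(t, -2.0, 0.5):.4f}"
              f"  normalized = {normalized_offset(t, -2.0, 0.5, 1.0):.4f}")
```

Because the offset shrinks exponentially while the accumulated noise grows toward its asymptote, the normalized bias falls toward zero at long response-signal delays.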

18
3. Reward as a term in the input
  • Reward signal comes -t seconds relative to
    stimulus.
  • For t < 0: input b, noise sd s.
  • For t > 0: input b + aC; noise continues as before.
  • Notes
  • Effect of the bias persists.
  • But bias is sub-optimal initially, and there is
    no dip.
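
A sketch of why an input-term bias persists without a dip, again assuming a linear leaky integrator with negative lambda. It further assumes that integration starts from x = 0 when the reward cue appears, `lead` seconds before stimulus onset; `lead`, the parameter values, and the function name are hypothetical labels introduced here.

```python
import math


def input_bias(t, lam, b, s, lead):
    """Bias contribution at time t after stimulus onset when input b starts `lead` s earlier.

    Returns (mean contribution of b, that contribution divided by the SD of x).
    Requires t + lead > 0 and lam != 0.
    """
    elapsed = t + lead
    decay = math.exp(lam * elapsed)
    mean_bias = b * (decay - 1.0) / lam
    sd = s * math.sqrt((decay ** 2 - 1.0) / (2.0 * lam))
    return mean_bias, mean_bias / sd


if __name__ == "__main__":
    # Illustrative parameters only; lead = 0.75 s matches the cue timing in the experiment.
    for t in (0.0, 0.15, 0.6, 2.0):
        mean_bias, norm = input_bias(t, -2.0, 1.0, 1.0, 0.75)
        print(f"t = {t:4.2f} s  mean bias = {mean_bias:.3f}  normalized = {norm:.3f}")
```

In this sketch the normalized bias rises monotonically toward a nonzero asymptote, which matches the note that the effect persists and shows no dip.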

19
4. Reward as a constant offset in the decision
variable
  • Note
  • Equivalent to setting criterion at m0
  • Effect persists for lambda < 0.
  • Single C level: a dip at
  • Prediction and test: higher C level → earlier dip

20
5. Reward as a term in the input, creating
variability at stimulus onset
  • Reward signal comes -t seconds relative to
    stimulus.
  • For t < 0: input b, noise sd sb.
  • For t > 0: input b + aC, noise sd sbs.
  • Notes
  • Effect of the bias persists.
  • If sb = 0, no dip.
  • Prediction and test: given small sb, a longer
    reward period → later and shallower dip.

21
Leaky Competing Integrator Model
Inputs for reward, stimulus, and response signal
High threshold for
22
(No Transcript)