Title: Normative models of human inductive inference
1. Normative models of human inductive inference
- Tom Griffiths
- Department of Psychology
- Cognitive Science Program
- University of California, Berkeley
2. Perception is optimal
Körding & Wolpert (2004)
3. Cognition is not
4. Optimality and cognition
- Can optimal solutions to computational problems shed light on human cognition?
5. Optimality and cognition
- Can optimal solutions to computational problems shed light on human cognition?
- Can we explain aspects of cognition as the result of sensitivity to natural statistics?
- What kind of representations are extracted from those statistics?
6. Optimality and cognition
- Can optimal solutions to computational problems shed light on human cognition?
- Can we explain aspects of cognition as the result of sensitivity to natural statistics?
- What kind of representations are extracted from those statistics?
Joint work with Josh Tenenbaum
7. Natural statistics
[Figure: sparse coding of images of natural scenes yields a neural representation (Olshausen & Field, 1996)]
8. Predicting the future
- How often is Google News updated?
- t = time since the last update
- t_total = total time between updates
- What should we guess for t_total given t?
9. Reverend Thomas Bayes
10. Bayes' theorem
p(h | d) = p(d | h) p(h) / Σ_h' p(d | h') p(h')
h = hypothesis, d = data
12. Bayesian inference
- p(t_total | t) ∝ p(t | t_total) p(t_total)
  (posterior probability ∝ likelihood × prior)
13. Bayesian inference
- p(t_total | t) ∝ p(t | t_total) p(t_total)
  (posterior probability ∝ likelihood × prior)
- Assume t is a random sample from the interval 0 < t < t_total, so the likelihood is p(t | t_total) = 1/t_total and
  p(t_total | t) ∝ (1/t_total) p(t_total)
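Below is a minimal numerical sketch (not code from the talk) of how this posterior yields a point prediction: discretize t_total, weight a candidate prior by the 1/t_total likelihood, and read off the posterior median. The grid, the power-law exponent, and the function name are illustrative assumptions.

```python
import numpy as np

def predict_t_total(t, prior, grid):
    """Posterior median prediction of t_total given an observed duration t.

    Numerical version of slides 12-13: the likelihood of observing t at a
    random point within the interval is p(t | t_total) = 1/t_total for
    0 < t < t_total, so p(t_total | t) is proportional to prior(t_total)/t_total.
    """
    likelihood = np.where(grid >= t, 1.0 / grid, 0.0)  # p(t | t_total)
    posterior = likelihood * prior                     # unnormalized p(t_total | t)
    posterior /= posterior.sum()
    cdf = np.cumsum(posterior)
    return grid[np.searchsorted(cdf, 0.5)]             # posterior median

# Example with a power-law prior p(t_total) ∝ t_total^(-gamma); gamma = 1 is an
# illustrative choice, under which the prediction is roughly 2t (Gott's rule).
grid = np.linspace(1.0, 1000.0, 10000)
prior = grid ** -1.0
print(predict_t_total(t=30.0, prior=prior, grid=grid))
```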
14. The effects of priors
15. Evaluating human predictions
- Different domains with different priors:
  - a movie has made $60 million (power-law)
  - your friend quotes from line 17 of a poem (power-law)
  - you meet a 78-year-old man (Gaussian)
  - a movie has been running for 55 minutes (Gaussian)
  - a U.S. congressman has served for 11 years (Erlang)
- Prior distributions derived from actual data
- Use 5 values of t for each domain
- People predict t_total
16. [Figure: people's judgments compared with predictions from the empirical prior, a parametric prior, and Gott's rule]
17. Predicting the future
- People produce accurate predictions for the duration and extent of everyday events
- People are sensitive to the statistics of their environment in making these predictions:
  - the form of the prior (power-law or exponential)
  - the distribution given that form (parameters)
18. Optimality and cognition
- Can optimal solutions to computational problems shed light on human cognition?
- Can we explain aspects of cognition as the result of sensitivity to natural statistics?
- What kind of representations are extracted from those statistics?
Joint work with Adam Sanborn
19. Categories are central to cognition
20. Sampling from categories
Frog distribution P(x | c)
21. Markov chain Monte Carlo
- Sample from a target distribution P(x) by constructing a Markov chain for which P(x) is the stationary distribution
- The Markov chain converges to its stationary distribution, providing outcomes that can be used similarly to samples
22. Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970)
- Step 1: propose a new state (we assume a symmetric proposal): Q(x(t+1) | x(t)) = Q(x(t) | x(t+1))
- Step 2: decide whether to accept, with probability A(x(t), x(t+1)):
  - Metropolis acceptance function: A = min(1, p(x(t+1)) / p(x(t)))
  - Barker acceptance function: A = p(x(t+1)) / (p(x(t+1)) + p(x(t)))
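A short random-walk sketch of the algorithm just described (illustrative code, not from the talk); the Gaussian proposal, the example target density, and the function names are assumptions.

```python
import numpy as np

def metropolis_hastings(p, x0, n_steps, proposal_sd=1.0, rule="barker", seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    p is an unnormalized target density; rule selects the acceptance function:
      "metropolis": A = min(1, p(x') / p(x))
      "barker":     A = p(x') / (p(x') + p(x))
    """
    rng = np.random.default_rng(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        x_new = x + rng.normal(0.0, proposal_sd)   # Step 1: symmetric proposal
        if rule == "metropolis":
            accept = min(1.0, p(x_new) / p(x))
        else:
            accept = p(x_new) / (p(x_new) + p(x))
        if rng.random() < accept:                  # Step 2: accept or reject
            x = x_new
        samples.append(x)
    return np.array(samples)

# Example: sample from a standard Gaussian given only its unnormalized density
chain = metropolis_hastings(lambda x: np.exp(-0.5 * x**2), x0=5.0, n_steps=5000)
print(chain[1000:].mean(), chain[1000:].std())     # near 0 and 1 after burn-in
```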
23-28. Metropolis-Hastings algorithm (illustration)
[Animated sequence: a random walk over a density p(x), with example acceptance probabilities A(x(t), x(t+1)) = 0.5 and A(x(t), x(t+1)) = 1 for different proposed moves]
29. A task
- Ask subjects which of two alternatives comes from a target category
"Which animal is a frog?"
30. A Bayesian analysis of the task
- Assume one of the two alternatives was drawn from the target category distribution p(x | c), and compute the posterior probability that each alternative is the category member
31. Response probabilities
- If people probability match to the posterior, the response probability is equivalent to the Barker acceptance function for target distribution p(x | c)
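A sketch of why this yields MCMC with people: an idealized subject who probability matches picks the proposed alternative with exactly the Barker probability, so the chosen stimuli form a Markov chain whose stationary distribution is the category distribution p(x | c). The Gaussian category, proposal width, and function names below are illustrative assumptions, not the experiment's parameters.

```python
import numpy as np

def mcmc_with_people(p_category, x0, n_trials, proposal_sd=1.0, seed=0):
    """Simulated idealized subject in the two-alternative choice task."""
    rng = np.random.default_rng(seed)
    x, chosen = x0, []
    for _ in range(n_trials):
        x_star = x + rng.normal(0.0, proposal_sd)   # proposed alternative stimulus
        # probability matching to the posterior = Barker acceptance function
        p_pick_star = p_category(x_star) / (p_category(x_star) + p_category(x))
        if rng.random() < p_pick_star:
            x = x_star
        chosen.append(x)
    return np.array(chosen)

# Example: a Gaussian "fish farm" category with assumed mean 5 and sd 1
chain = mcmc_with_people(lambda x: np.exp(-0.5 * (x - 5.0) ** 2), x0=0.0, n_trials=2000)
print(chain[500:].mean(), chain[500:].std())        # approaches the category mean and sd
```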
32. Collecting the samples
[Figure: a sequence of two-alternative trials ("Which is the frog?"): Trial 1, Trial 2, Trial 3]
33. Verifying the method
34. Training
- Subjects were shown schematic fish of different sizes and trained on whether they came from the ocean (uniform) or a fish farm (Gaussian)
35. Between-subject conditions
36. Choice task
- Subjects judged which of the two fish came from the fish farm (Gaussian) distribution
37. Examples of subject MCMC chains
38. Estimates from all subjects
- Estimated means and standard deviations are significantly different across groups
- Estimated means are accurate, but standard deviation estimates are high
  - this result could be due to perceptual noise or response gain
39. Sampling from natural categories
- Examined distributions for four natural categories: giraffes, horses, cats, and dogs
- Presented stimuli as nine-parameter stick figures (Olman & Kersten, 2004)
40. Choice task
41. Samples from Subject 3 (projected onto a plane derived from LDA)
42. Mean animals by subject
[Figure: mean stick figures for subjects S1 through S8 for each category: giraffe, horse, cat, dog]
43. Marginal densities (aggregated across subjects)
- Giraffes are distinguished by neck length, body height, and body tilt
- Horses are like giraffes, but with shorter bodies and nearly uniform necks
- Cats have longer tails than dogs
44. Markov chain Monte Carlo with people
- Normative models can guide the design of experiments to measure psychological variables
- Markov chain Monte Carlo (and other methods) can be used to sample from subjective probability distributions:
  - category distributions
  - prior distributions
45. Conclusion
- Optimal solutions to computational problems can shed light on human cognition
- We can explain aspects of cognition as the result of sensitivity to natural statistics
- We can use optimality to explore representations extracted from those statistics
47. Relative volume of categories
- Convex hull content divided by the content of the minimum enclosing hypercube:
  Giraffe: 0.00004 | Horse: 0.00006 | Cat: 0.00003 | Dog: 0.00002
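A sketch of how such a ratio can be computed from a subject's samples, assuming scipy is available; the random points below stand in for the actual nine-parameter stick-figure samples and are purely illustrative (a low-dimensional example is used so the hull is cheap to compute).

```python
import numpy as np
from scipy.spatial import ConvexHull

def relative_volume(points):
    """Convex hull content divided by the content of the minimum enclosing
    (axis-aligned) hypercube of the sampled points."""
    hull = ConvexHull(points)                          # hull content in d dimensions
    side = (points.max(axis=0) - points.min(axis=0)).max()
    return hull.volume / side ** points.shape[1]

# Illustrative stand-in for one subject's samples (3 dimensions here; the same
# computation applies to the nine stick-figure parameters)
rng = np.random.default_rng(0)
print(relative_volume(rng.normal(size=(200, 3))))
```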
48. Discrimination method (Olman & Kersten, 2004)
49. Parameter space for discrimination
- Restricted so that most random draws were animal-like
50. MCMC and discrimination means
52. Iterated learning (Kirby, 2001)
- Each learner sees data, forms a hypothesis, and produces the data given to the next learner
- With Bayesian learners, the distribution over hypotheses converges to the prior (Griffiths & Kalish, 2005)
53. Explaining convergence to the prior
[Diagram: each learner samples a hypothesis from P_L(h | d) and produces data from P_P(d | h), alternating along the chain]
- Intuitively: the data act once, the prior acts many times
- Formally: iterated learning with Bayesian agents is a Gibbs sampler on P(d, h) (Griffiths & Kalish, in press)
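A toy simulation of this argument (illustrative, not the Kirby or Griffiths & Kalish code): each Bayesian learner samples a hypothesis from its posterior given the previous learner's data and then generates data for the next learner, and the hypotheses visited by the chain end up distributed according to the prior. The three-hypothesis space and likelihood values are assumptions.

```python
import numpy as np

def iterated_learning(prior, likelihood, n_generations, seed=0):
    """Iterated learning with Bayesian agents over a discrete space.

    prior[h] is P(h) and likelihood[h, d] is P(d | h).  Each step produces
    data from the current hypothesis and samples the next hypothesis from the
    posterior, i.e. a Gibbs sampler on P(d, h).
    """
    rng = np.random.default_rng(seed)
    n_h, n_d = likelihood.shape
    h, history = rng.integers(n_h), []
    for _ in range(n_generations):
        d = rng.choice(n_d, p=likelihood[h])        # produce data: d ~ P(d | h)
        post = prior * likelihood[:, d]             # next learner: P(h | d) ∝ P(d | h) P(h)
        h = rng.choice(n_h, p=post / post.sum())
        history.append(h)
    return np.array(history)

prior = np.array([0.6, 0.3, 0.1])
likelihood = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
chain = iterated_learning(prior, likelihood, n_generations=20000)
print(np.bincount(chain[1000:], minlength=3) / len(chain[1000:]))  # close to the prior
```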
54. Iterated function learning (Kalish, Griffiths, & Lewandowsky, in press)
- Each learner sees a set of (x, y) pairs
- Makes predictions of y for new x values
- Predictions are the data for the next learner
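A toy version of this design with simulated learners, assuming each learner is a conjugate Bayesian linear regressor that samples a slope from its posterior; all parameters and names here are illustrative, not those of the experiments.

```python
import numpy as np

def iterated_function_learning(n_generations, prior_mean=0.0, prior_var=1.0,
                               noise_var=1.0, seed=0):
    """Iterated learning of a linear function y = w * x with Bayesian learners."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, 5)                     # probe locations for each learner
    y = rng.normal(0.0, 1.0, size=x.shape)           # arbitrary initial data
    slopes = []
    for _ in range(n_generations):
        # Conjugate posterior over the slope w given the (x, y) pairs
        post_var = 1.0 / (1.0 / prior_var + (x @ x) / noise_var)
        post_mean = post_var * (prior_mean / prior_var + (x @ y) / noise_var)
        w = rng.normal(post_mean, np.sqrt(post_var))   # learner's sampled hypothesis
        y = w * x + rng.normal(0.0, np.sqrt(noise_var), size=x.shape)  # data for next learner
        slopes.append(w)
    return np.array(slopes)

# Across generations the sampled slopes are distributed according to the prior
slopes = iterated_function_learning(5000)
print(slopes[500:].mean(), slopes[500:].var())       # close to prior_mean and prior_var
```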
55. Function learning experiments
- Examine iterated learning with different initial data