Title: Analyzing iterated learning
1. Analyzing iterated learning
- Tom Griffiths, Brown University
- Mike Kalish, University of Louisiana
2. Cultural transmission
- Most knowledge is based on secondhand data
- Some things can only be learned from others
- Cultural objects are transmitted across generations
- Studying the cognitive aspects of cultural transmission provides unique insights
3. Iterated learning (Kirby, 2001)
- Each learner sees data, forms a hypothesis, and produces the data given to the next learner
- cf. the playground game "Telephone"
4. Objects of iterated learning
- It's not just about languages
- In the wild
- religious concepts
- social norms
- myths and legends
- causal theories
- In the lab
- functions and categories
5. Outline
- Analyzing iterated learning
- Iterated Bayesian learning
- Examples
- Iterated learning with humans
- Conclusions and open questions
6. Outline
- Analyzing iterated learning
- Iterated Bayesian learning
- Examples
- Iterated learning with humans
- Conclusions and open questions
7. Discrete generations of single learners
[Diagram: data → hypothesis → data → hypothesis → ..., alternating P_L(h|d) and P_P(d|h)]
- P_L(h|d): probability of inferring hypothesis h from data d
- P_P(d|h): probability of generating data d from hypothesis h
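In symbols, one generation of this process can be sketched as two sampling steps (notation follows the definitions above; t indexes generations):

```latex
% Learner t infers a hypothesis from the previous learner's data,
% then produces the data seen by learner t+1.
h_t \sim P_L(h \mid d_{t-1}), \qquad d_t \sim P_P(d \mid h_t)
```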
8. Markov chains
[Diagram: chain of variables x(1) → x(2) → ... → x(t)]
Transition matrix: T = P(x(t+1) | x(t))
- Variable x(t+1) is independent of history given x(t)
- Converges to a stationary distribution under easily checked conditions (ergodicity)
9. Stationary distributions
- Stationary distribution: π(x) = Σ_x' P(x | x') π(x')
- In matrix form: π = T π
- π is the first eigenvector of the matrix T (eigenvalue 1)
- Second eigenvalue sets the rate of convergence
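As a numerical illustration, a minimal sketch (assuming NumPy; the two-state transition matrix is made up for the example) that recovers π as the eigenvector of T with eigenvalue 1 and reads the convergence rate off the second eigenvalue:

```python
import numpy as np

# Column-stochastic transition matrix: T[i, j] = P(x(t+1) = i | x(t) = j).
# The values are arbitrary, chosen only to illustrate the computation.
T = np.array([[0.9, 0.3],
              [0.1, 0.7]])

eigvals, eigvecs = np.linalg.eig(T)
order = np.argsort(-eigvals.real)          # sort eigenvalues, largest first
pi = eigvecs[:, order[0]].real
pi = pi / pi.sum()                         # normalize the leading eigenvector

print("stationary distribution:", pi)                # eigenvector with eigenvalue 1
print("second eigenvalue:", eigvals[order[1]].real)  # sets the convergence rate
```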
10. Analyzing iterated learning
11. A Markov chain on hypotheses
- Transition probabilities sum out data: Q(h(t+1) | h(t)) = Σ_d P_L(h(t+1) | d) P_P(d | h(t))
- Stationary distribution and convergence rate follow from the eigenvectors and eigenvalues of Q
- Can be computed numerically for matrices of reasonable size, and analytically in some cases
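A sketch of the "sum out data" step under the same assumptions (NumPy; the learner and production matrices here are arbitrary placeholders, not the models analyzed in the talk):

```python
import numpy as np

# PL[h, d]: probability of inferring hypothesis h from data d.
# PP[d, h]: probability of generating data d from hypothesis h.
# Arbitrary placeholder matrices; each column is a probability distribution.
PL = np.array([[0.8, 0.2],
               [0.2, 0.8]])
PP = np.array([[0.6, 0.1],
               [0.4, 0.9]])

# Summing out data gives the hypothesis-to-hypothesis transition matrix:
# Q[h_next, h] = sum_d PL[h_next, d] * PP[d, h], i.e. a matrix product.
Q = PL @ PP
assert np.allclose(Q.sum(axis=0), 1.0)  # columns of Q are distributions

# The eigen-analysis from the previous sketch applies directly to Q.
```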
12. Infinite populations in continuous time
- Language dynamical equation
- Neutral model (f_j(x) constant)
- Stable equilibrium at first eigenvector of Q
(Nowak, Komarova, & Niyogi, 2001; Komarova & Nowak, 2003)
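For reference, a sketch of the language dynamical equation as given in the cited papers (x_j is the fraction of the population using language j, f_j(x) its fitness, Q_{ij} the probability that a learner exposed to language i acquires language j, and φ(x) the average fitness):

```latex
% Language dynamical equation (Nowak, Komarova, & Niyogi, 2001):
\dot{x}_j = \sum_i x_i \, f_i(x) \, Q_{ij} - \phi(x) \, x_j,
\qquad \phi(x) = \sum_i f_i(x) \, x_i
% Neutral model: f_j(x) constant, so the dynamics reduce to
% \dot{x}_j = \sum_i x_i Q_{ij} - x_j, whose stable equilibrium is the
% first eigenvector of Q.
```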
13. Outline
- Analyzing iterated learning
- Iterated Bayesian learning
- Examples
- Iterated learning with humans
- Conclusions and open questions
14. Bayesian inference
- Rational procedure for updating beliefs
- Foundation of many learning algorithms
- (e.g., MacKay, 2003)
- Widely used for language learning
- (e.g., Charniak, 1993)
Reverend Thomas Bayes
15. Bayes' theorem
P(h | d) = P(d | h) P(h) / Σ_h' P(d | h') P(h')
- h: hypothesis
- d: data
16. Iterated Bayesian learning
17. Markov chains on h and d
- Markov chain on h has stationary distribution p(h), the prior
- Markov chain on d has stationary distribution p(d) = Σ_h p(d | h) p(h), the prior predictive distribution
18. Markov chain Monte Carlo
- A strategy for sampling from complex probability distributions
- Key idea: construct a Markov chain which converges to a particular distribution
- e.g., the Metropolis algorithm
- e.g., Gibbs sampling
19. Gibbs sampling
- For variables x = (x_1, x_2, ..., x_n)
- Draw x_i(t+1) from P(x_i | x_-i)
- x_-i = (x_1(t+1), x_2(t+1), ..., x_{i-1}(t+1), x_{i+1}(t), ..., x_n(t))
- Converges to P(x_1, x_2, ..., x_n)
(Geman & Geman, 1984)
(a.k.a. the heat bath algorithm in statistical physics)
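A minimal Gibbs-sampling sketch (assuming NumPy), targeting a bivariate Gaussian with correlation ρ; the target is a toy choice for illustration, not an example from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8          # correlation of the target bivariate Gaussian (toy choice)
n_iter = 5000
x1, x2 = 0.0, 0.0  # arbitrary starting point
samples = []

for _ in range(n_iter):
    # Each full conditional of a standard bivariate Gaussian is itself Gaussian:
    # x1 | x2 ~ N(rho * x2, 1 - rho^2), and symmetrically for x2 | x1.
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))
    samples.append((x1, x2))

samples = np.array(samples)
print("sample correlation:", np.corrcoef(samples.T)[0, 1])  # should be close to rho
```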
20. Gibbs sampling
(MacKay, 2003)
21. Iterated learning is a Gibbs sampler
- Iterated Bayesian learning is a Gibbs sampler for the joint distribution p(d, h) = p(d | h) p(h)
- Implies
- (h, d) converges to this distribution
- convergence rates are known
- (Liu, Wong, & Kong, 1995)
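A small simulation sketch of this claim (assuming NumPy) for a discrete hypothesis and data space: each learner samples a hypothesis from the posterior given the previous learner's data, then generates new data, and the chain's distribution over hypotheses approaches the prior. The prior and likelihood values below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
prior = np.array([0.7, 0.2, 0.1])          # p(h), arbitrary illustrative values
like = np.array([[0.6, 0.3, 0.1],          # like[h, d] = p(d | h), rows sum to 1
                 [0.2, 0.5, 0.3],
                 [0.1, 0.2, 0.7]])

h = 2                                       # arbitrary initial hypothesis
counts = np.zeros(3)
for _ in range(50000):
    d = rng.choice(3, p=like[h])            # P_P: generate data from h
    post = prior * like[:, d]               # P_L: Bayesian posterior over h given d
    h = rng.choice(3, p=post / post.sum())
    counts[h] += 1

print("empirical distribution over h:", counts / counts.sum())
print("prior:                        ", prior)   # the two should match closely
```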
22. Outline
- Analyzing iterated learning
- Iterated Bayesian learning
- Examples
- Iterated learning with humans
- Conclusions and open questions
23. An example: Gaussians
- If we assume
- data, d, is a single real number, x
- hypotheses, h, are means of a Gaussian, μ
- prior, p(μ), is Gaussian(μ0, σ0²)
- then p(x_{n+1} | x_n) is Gaussian(μn, σx² + σn²)
24. An example: Gaussians
- If we assume
- data, d, is a single real number, x
- hypotheses, h, are means of a Gaussian, μ
- prior, p(μ), is Gaussian(μ0, σ0²)
- then p(x_{n+1} | x_n) is Gaussian(μn, σx² + σn²)
- p(x_n | x_0) is Gaussian(c^n x_0 + (1 - c^n) μ0, (σx² + σ0²)(1 - c^2n)), with c = σ0² / (σ0² + σx²)
- i.e. geometric convergence to the prior
25. An example: Gaussians
- p(x_n | x_0) is Gaussian(c^n x_0 + (1 - c^n) μ0, (σx² + σ0²)(1 - c^2n))
26. [Plot: μ0 = 0, σ0² = 1, x0 = 20; iterated learning results in rapid convergence to the prior]
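A simulation sketch of this Gaussian example (assuming NumPy), using the parameter values above; the likelihood variance σx² = 1 is an assumption, since the slide does not state it:

```python
import numpy as np

rng = np.random.default_rng(2)
mu0, var0 = 0.0, 1.0     # prior on the mean: N(mu0, var0)
var_x = 1.0              # likelihood variance (assumed; not given on the slide)
x = 20.0                 # initial data x0

for t in range(1, 21):
    # P_L: posterior over the mean mu given the single observation x
    post_var = 1.0 / (1.0 / var0 + 1.0 / var_x)
    post_mean = post_var * (mu0 / var0 + x / var_x)
    mu = rng.normal(post_mean, np.sqrt(post_var))
    # P_P: the next learner's datum is generated from the inferred mean
    x = rng.normal(mu, np.sqrt(var_x))
    print(f"generation {t:2d}: x = {x:6.2f}")   # drifts quickly toward mu0 = 0
```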
27. An example: linear regression
- Assume
- data, d, are pairs of real numbers (x, y)
- hypotheses, h, are functions
- An example: linear regression
- hypotheses have slope θ and pass through the origin
- p(θ) is Gaussian(θ0, σ0²)
[Plot: line y = θx through the origin, shown at x = 1]
28. [Plot: iterated learning of the slope θ, with θ0 = 1, σ0² = 0.1, y0 = -1, data at x = 1]
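A parallel sketch for the regression example (assuming NumPy); the observation noise variance is an assumption, since only the prior parameters and the initial datum appear on the slide. Each learner sees a single (x, y) pair at x = 1, infers the slope, and generates the next learner's y:

```python
import numpy as np

rng = np.random.default_rng(3)
theta0, var0 = 1.0, 0.1   # prior over the slope: N(theta0, var0)
var_y = 0.1               # observation noise variance (assumed)
x_obs = 1.0               # each learner sees a single pair at x = 1
y = -1.0                  # initial datum y0

for t in range(1, 16):
    # P_L: posterior over the slope theta given the pair (x_obs, y)
    post_var = 1.0 / (1.0 / var0 + x_obs**2 / var_y)
    post_mean = post_var * (theta0 / var0 + x_obs * y / var_y)
    theta = rng.normal(post_mean, np.sqrt(post_var))
    # P_P: generate the next learner's y at the same x
    y = rng.normal(theta * x_obs, np.sqrt(var_y))
    print(f"generation {t:2d}: y at x=1 is {y:6.2f}")  # moves toward theta0 = 1
```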
29. An example: compositionality
30. An example: compositionality
- Data: m event-utterance pairs
- Hypotheses: languages, with error ε
[Diagram: example languages mapping events (0/1 features) to utterances, compositional vs. holistic]
31. Analysis technique
- Compute transition matrix on languages
- Sample Markov chains
- Compare language frequencies with prior
- (can also compute eigenvalues etc.)
32. Convergence to priors
[Plots: language frequencies in the chain vs. the prior, over iterations, for α = 0.50, ε = 0.05, m = 3 and α = 0.01, ε = 0.05, m = 3]
33. The information bottleneck
[Plots: chain vs. prior language frequencies over iterations for α = 0.50, ε = 0.05 with m = 1 and m = 10, and for α = 0.01, ε = 0.05, m = 3]
34. The information bottleneck
Bottleneck affects relative stability of languages favored by prior
35. Outline
- Analyzing iterated learning
- Iterated Bayesian learning
- Examples
- Iterated learning with humans
- Conclusions and open questions
36. A method for discovering priors
- Iterated learning converges to the prior
- Evaluate the prior by reproducing iterated learning in the lab
37. Iterated function learning
- Each learner sees a set of (x,y) pairs
- Makes predictions of y for new x values
- Predictions are data for the next learner
38. Function learning in the lab
Examine iterated learning with different initial data
39. [Figure: human iterated function learning, initial data followed by iterations 1-9]
(Kalish, 2004)
40. Outline
- Analyzing iterated learning
- Iterated Bayesian learning
- Examples
- Iterated learning with humans
- Conclusions and open questions
41. Conclusions and open questions
- Iterated Bayesian learning converges to the prior
- properties of languages are properties of learners
- the information bottleneck doesn't affect the equilibrium
- What about other learning algorithms?
- What determines rates of convergence?
- amount and structure of input data
- What happens with people?