A Bayesian Model for Discovering Typological Implications

1
A Bayesian Model for Discovering Typological Implications
  • Hal Daumé III
  • School of Computing
  • University of Utah
  • me_at_hal3.name

  • Lyle Campbell
  • Department of Linguistics
  • University of Utah
  • lcampbel_at_hum.utah.edu
2
A Typological What?!
VO → PreP    PostP → OV
English: I eat dinner in restaurants.
French: je mange le diner dans les restaurants
        (I eat the dinner in the restaurants)
Japanese: boku -wa bangohan -o resutoran -ni taberu
          (I -topic dinner -obj restaurants -in eat)
Hindi: main raat ka khaana restra mein khaata hoon
       (I night-of-meal restaurants in eat am)
3
The Typologist's Life
         VO   OV
PreP     16    0
PostP     3   11
(Greenberg, 1963) Based on 30 diversely sampled languages
Now, repeat for lots of feature pairs
4
Difficulties with Typical Approach
A → B (99%) uninteresting when Ø → B (99%)?
Search process tedious
Sampling problem when many languages considered
Process is inherently noisy
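
The first difficulty can be made concrete with a small sketch: an implication A → B only says something if B is noticeably more likely given A than it is with no condition at all (the "Ø → B" baseline). The example below reuses the four Greenberg counts from the previous slide. The assignment of those counts to cells (rows PreP/PostP, columns VO/OV) is an assumption, and the function names are illustrative only.

```python
from fractions import Fraction

# ASSUMPTION: the four counts (16, 0, 3, 11) are laid out with
# adposition type as rows and verb-object order as columns.
counts = {
    ("PreP", "VO"): 16, ("PreP", "OV"): 0,
    ("PostP", "VO"): 3, ("PostP", "OV"): 11,
}

def conditional(counts, adposition, order):
    """Estimate P(adposition | order) from raw co-occurrence counts."""
    given = sum(n for (_, o), n in counts.items() if o == order)
    return Fraction(counts[(adposition, order)], given)

def marginal(counts, adposition):
    """Estimate P(adposition) with no condition: the 'Ø → B' baseline."""
    total = sum(counts.values())
    matching = sum(n for (a, _), n in counts.items() if a == adposition)
    return Fraction(matching, total)

p_given_vo = conditional(counts, "PreP", "VO")  # support for VO → PreP
p_baseline = marginal(counts, "PreP")           # support for Ø → PreP

print(f"P(PreP | VO) = {p_given_vo}, P(PreP) = {p_baseline}")
```

Under this layout the conditional estimate is clearly above the baseline, which is what makes the implication worth reporting. Repeating such a comparison by hand for every feature pair is the tedious search the slide refers to.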
5
A Typological Database
  • 2150 Languages
  • 35 language families
  • 275 language genera
  • 139 Features
  • 11 feature categories
  • Sparsely sampled
  • 85% missing data

6
Typological Map: VO
7
Typological Map: PreP
8
Typological Map: VO and PreP
9
An Initial Model
  • Consider two features → 2×N matrix
  • First, generate first column with prior
    probability p1
  • Next, decide if the implication holds
  • Finally, generate the second column
      • With probability p2 if feature 1 is not true or if
        the implication doesn't hold
      • Forced to be true otherwise

[Figure: example 2×N binary feature matrix]
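
The generative steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the slide does not say how the "does the implication hold" decision is drawn, so the parameter `p_implication` is hypothetical, as are the function and variable names.

```python
import random

def sample_initial_model(n_languages, p1, p2, p_implication, rng=None):
    """Draw one 2xN binary matrix from the two-feature model.

    p_implication is a HYPOTHETICAL parameter: the probability that
    the implication f1 -> f2 holds for this feature pair.
    """
    rng = rng or random.Random()
    # One draw decides whether the implication holds at all.
    implication_holds = rng.random() < p_implication
    # Feature 1 is generated per language with prior probability p1.
    f1 = [rng.random() < p1 for _ in range(n_languages)]
    f2 = []
    for value in f1:
        if implication_holds and value:
            f2.append(True)               # forced by the implication
        else:
            f2.append(rng.random() < p2)  # free draw with probability p2
    return implication_holds, f1, f2

holds, f1, f2 = sample_initial_model(7, p1=0.5, p2=0.3, p_implication=0.5)
```

Because the second feature is forced whenever the implication holds and feature 1 is true, a single counterexample in the data rules the implication out entirely under this version of the model.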
10
An Initial Model
Problems:
  • Cannot handle noisy data
  • Doesn't address sampling problem
12
Fixing the Noise Problem
  • Assume language-specific noise
  • Model remains unchanged, except a new variable
    causes f to be flipped
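
One way to picture the noise fix is as a per-language flip applied to a generated feature row. The sketch below assumes each language has its own flip probability; the slide does not specify which feature is flipped or how the noise variable is parameterized, so `flip_probs` and the function name are hypothetical.

```python
import random

def flip_with_noise(values, flip_probs, rng=None):
    """Flip each language's feature value with that language's own probability.

    values:     list of bool, one feature value per language
    flip_probs: list of float, HYPOTHETICAL per-language noise rates
    Returns the noisy values and the flip indicators that produced them.
    """
    rng = rng or random.Random()
    flips = [rng.random() < q for q in flip_probs]
    # XOR: a value is inverted exactly when its flip indicator is set.
    noisy = [v != e for v, e in zip(values, flips)]
    return noisy, flips

clean = [True, True, False, True]
noisy, flips = flip_with_noise(clean, [0.05, 0.05, 0.05, 0.05])
```

With such a variable in the model, a language that violates an otherwise strong implication can be explained as noise instead of forcing the implication to be rejected.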

13
Fixing the Sampling Problem
  • Hierarchical Bayes prior...

14
Inference
  • Binomials get Beta priors
  • m: Uniform
  • Beta with 5% mean, 0-10% with 50% probability
  • Everything else gets uniform priors
  • Inference by Gibbs sampling
  • Plus a rejection sampler subroutine
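
Giving the binomials Beta priors keeps the conditional draws simple, because the Beta is conjugate to the binomial. The sketch below shows one such conditional update of the kind a Gibbs sampler alternates between. It is not the authors' sampler, the rejection-sampling subroutine is not shown, and the function name and default prior are illustrative assumptions.

```python
import random

def resample_bernoulli_rate(observations, alpha=1.0, beta=1.0, rng=None):
    """Draw a success probability from its conditional posterior.

    With a Beta(alpha, beta) prior and binary observations, the
    posterior is Beta(alpha + successes, beta + failures).
    The default alpha = beta = 1 is a uniform prior (an assumption here).
    """
    rng = rng or random.Random()
    successes = sum(1 for x in observations if x)
    failures = len(observations) - successes
    return rng.betavariate(alpha + successes, beta + failures)

# One Gibbs-style update for a feature's rate, given its current values.
rate = resample_bernoulli_rate([True, True, False, True])
```

A full sampler would cycle through updates like this for every rate parameter, interleaved with draws for the discrete variables such as m.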

15
Three Models
Flat: All languages independent
LingHier: Typological Hierarchy
DistHier: Obtained by clustering positionally
16
Automatically Extracting Implications
  • Search only over pairs with
      • At least 250 languages for which both features are known
      • At least 15 languages for which both hold simultaneously
      • When f1 is true, f2 is true with > 50%
        probability
  • Reduces space from 19,000 to 3442
  • Sort by probability that m is true
  • Evaluate
      • Compare restoration accuracy of the models against each other
      • Compare against well-known implications
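
The pair-filtering step can be sketched as a pass over all ordered feature pairs. The data layout below (a dict from language to its known binary features) is an assumption for illustration, as are the function and parameter names. The thresholds are read as "at least 250", "at least 15", and "more than 50%".

```python
from itertools import permutations

def candidate_pairs(data, min_known=250, min_both=15, min_conditional=0.5):
    """Return ordered feature pairs (f1, f2) worth testing for f1 -> f2.

    data: dict language -> dict feature -> bool; unknown features are
          simply absent from the inner dict (ASSUMED layout).
    """
    features = sorted({f for values in data.values() for f in values})
    kept = []
    for f1, f2 in permutations(features, 2):
        # Languages where both features are known.
        known = [(v[f1], v[f2]) for v in data.values() if f1 in v and f2 in v]
        both = sum(1 for a, b in known if a and b)
        f1_true = sum(1 for a, _ in known if a)
        if len(known) < min_known or both < min_both or f1_true == 0:
            continue
        # Keep the pair only if f2 is true in most languages where f1 is.
        if both / f1_true > min_conditional:
            kept.append((f1, f2))
    return kept

toy = {
    "lang1": {"VO": True, "PreP": True},
    "lang2": {"VO": True, "PreP": True},
    "lang3": {"VO": True, "PreP": True},
    "lang4": {"VO": False, "PreP": True},
    "lang5": {"VO": True},  # PreP unknown, so this language is skipped
}
pairs = candidate_pairs(toy, min_known=3, min_both=2)
```

The toy call lowers the thresholds so that five languages are enough to pass the filter. With the slide's thresholds the same data would yield no candidates.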

17
Restoration Accuracy by Model
18
Top Implications: LingHier
19
Discussion
Model for automatically discovering implications
Accounts for noise and sampling problem
Different hierarchical models quantitatively different
Discovered implications correlated with known ones
Many worthy of further exploration: http://hal3.name/WALS