Latent Semantic Grammar Induction - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Latent Semantic Grammar Induction
  • Context, Projectivity, and Prior Distributions

2
Supervised Grammar Induction
  • Works great
  • But:
    • Expensive
    • Time-consuming
    • Domain specific

3
Unsupervised Grammar Induction
  • A more general solution
  • Can explore the knowledge required:
    • POS/hidden syntax
    • Projectivity
    • Prior distributions
    • Semantics

4
Basic Model
  • Unigram/bigram LSA-like spaces
    • No POS
    • Has semantics
  • Minimum spanning tree parsing
    • Nonprojective
    • No hidden syntax

5
Distributional Analysis
  • An element's distribution is the sum of its environments
  • An environment is an array of co-occurrents and their
    positions (see the sketch below)
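A minimal sketch of this idea in Python, assuming tokenized sentences; the function name, window size, and (position, co-occurrent) encoding are illustrative rather than the deck's exact scheme:

```python
from collections import Counter

def environments(tokens, target, window=2):
    """Sum the environments of `target`: each co-occurrent is
    recorded together with its position relative to the target."""
    dist = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                dist[(offset, tokens[j])] += 1
    return dist

# The two sentences from the context slides
dist = Counter()
for sent in [["now", "the", "boy", "walks", "home"],
             ["a", "boy", "likes", "a", "girl"]]:
    dist += environments(sent, "boy")
# dist maps (position, co-occurrent) pairs to counts,
# e.g. (-1, 'the'): 1 and (-1, 'a'): 1
```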

6
Substitutability
  • If a unigram substitutes for a bigram, then the bigram
    must be a constituent

7
Context
  • Global
    • D1: now the boy walks home
    • D2: a boy likes a girl

8
Context
  • Local: _ target _ (contrasted with global in the sketch below)
    • now the boy walks home
    • a boy likes a girl
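A sketch contrasting the two context definitions; the matrix layouts (terms by documents for global, terms by immediate left/right neighbors for local) are an assumption about the deck's setup:

```python
import numpy as np

docs = [["now", "the", "boy", "walks", "home"],  # D1
        ["a", "boy", "likes", "a", "girl"]]      # D2
vocab = sorted({w for d in docs for w in d})
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Global context: term-by-document counts
A_global = np.zeros((V, len(docs)))
for j, d in enumerate(docs):
    for w in d:
        A_global[idx[w], j] += 1

# Local context (_ target _): counts of left and right neighbors
A_local = np.zeros((V, 2 * V))
for d in docs:
    for i, w in enumerate(d):
        if i > 0:
            A_local[idx[w], idx[d[i - 1]]] += 1      # left neighbor
        if i + 1 < len(d):
            A_local[idx[w], V + idx[d[i + 1]]] += 1  # right neighbor
```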

9
Singular value decomposition
  • Approximates a co-occurrence analysis via
    compression
  • Terms in the compressed space can become similar even
    if they never co-occurred
  • A_{t×d} ≈ T_{t×n} S_{n×n} D_{n×d} (sketch below)
  • Cf. LSA and Schütze's distributional POS induction
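A minimal sketch of the compression step with NumPy; the toy count matrix and the rank n are placeholders:

```python
import numpy as np

# Toy term-by-context count matrix (t x d); real input would be the
# unigram/bigram context counts from the previous slides.
rng = np.random.default_rng(0)
A = rng.poisson(1.0, size=(8, 20)).astype(float)

# Truncated SVD: A_{t x d} ~= T_{t x n} S_{n x n} D_{n x d}
T, s, D = np.linalg.svd(A, full_matrices=False)
n = 3                           # rank of the compressed space
terms = T[:, :n] * s[:n]        # compressed term vectors (T times S)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Terms can now be similar even if they never co-occurred directly.
print(cosine(terms[0], terms[1]))
```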

10
Search
  • Replace bigrams with unigrams
  • Cosine as the substitutability measure
  • Many possible replacements
  • Use Chu-Liu-Edmonds
    • Finds the minimum directed spanning tree
    • O(n²)
    • Requires a graph to operate on (see the sketch below)
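A sketch of this step, using networkx's Edmonds implementation as a stand-in; `sim` would be the LSA cosine between term vectors:

```python
import networkx as nx

def parse(words, sim):
    """Dependency parse: build a dense directed graph over the words
    plus ROOT, score each head -> dependent edge with sim(h, d), and
    extract the best spanning arborescence (Chu-Liu-Edmonds; the
    dense graph has O(n^2) edges)."""
    G = nx.DiGraph()
    for h in ["ROOT"] + list(words):
        for d in words:
            if h != d:
                G.add_edge(h, d, weight=sim(h, d))
    # The deck phrases this as a minimum directed spanning tree;
    # maximizing cosine weights gives the same tree as minimizing
    # the negated weights.
    return nx.maximum_spanning_arborescence(G, attr="weight")

# Note: assumes distinct tokens; repeated words would need
# positional node ids in practice.
```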

11
Populating the Graph
[Figure: initial graph with nodes ROOT, the, boy, sleeps]
12
Populating the Graph
the boy sleeps
13
Populating the Graph
the boy sleeps
[Figure: probe word "john" substituted for the bigram "the boy"]
14
Populating the Graph
the boy sleeps
cos(boy, john) = .34 > cos(the, john) = .02
boy is head
15
Populating the Graph
the boy sleeps
16
Populating the Graph
the boy sleeps
cos(the, sleeps) = .1 < cos(sleeps, sleeps) = 1.0
sleeps is head
17
Populating the Graph
the boy sleeps
[Figure: probe word "sleeps" substituted into the sentence]
18
Populating the Graph
the boy sleeps
cos(boy, sleeps) = .3 < cos(sleeps, sleeps) = 1.0
sleeps is head
19
Populating the Graph
ROOT edge weight: (.02 + .1 + .34 + .3 + 1.0 + 1.0) / 9 ≈ .306
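Reading the slide's arithmetic as the six observed cosines averaged over all nine ordered word pairs (an assumption; unobserved pairs count as zero):

```python
cosines = [.02, .1, .34, .3, 1.0, 1.0]  # the six observed weights
print(sum(cosines) / 9)                 # 0.3066... ~= .306
```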
20
Solving the Graph
[Figure: full graph; ROOT → the, boy, sleeps at .306 each; word-word
edges weighted .02, .1, .3, .34, 1.0, 1.0]
21
Solving the Graph
[Figure: same graph with one edge rescored to .6]
22
Solving the Graph
[Figure: reduced graph; each node keeps its best incoming edge
(.306, .6, 1.0)]
23
Solving the Graph
[Figure: reduced graph with edges .40, .6, 1.0]
24
Solving the Graph
[Figure: cycle contracted into node <the sleeps>; edges .906, .306,
1.0, .9]
25
Solving the Graph
[Figure: contracted graph solved; ROOT → <the sleeps> at .906, boy
attached at 1.0]
26
Solving the Graph
[Figure: final tree after expanding the contraction; ROOT → sleeps at
.306, with the (.6) and boy (1.0) attached]
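A worked version of slides 20-26 under assumed edge directions and weights (head → dependent, probe cosines as scores); the original figures are not preserved, so this graph is a reconstruction:

```python
import networkx as nx

G = nx.DiGraph()
G.add_weighted_edges_from([
    ("ROOT", "the", .306), ("ROOT", "boy", .306), ("ROOT", "sleeps", .306),
    ("boy", "the", .34), ("the", "boy", .02),       # from the john probe
    ("sleeps", "the", 1.0), ("the", "sleeps", .1),  # sleeps-as-probe scores
    ("sleeps", "boy", 1.0), ("boy", "sleeps", .3),
])
tree = nx.maximum_spanning_arborescence(G, attr="weight")
print(sorted(tree.edges(data="weight")))
# With these assumed weights: ROOT -> sleeps, sleeps -> the, sleeps -> boy
```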
27
Building Hierarchical Structure
[Figure: dependency tree converted to hierarchical structure; nodes
numbered 1, 2, 3, 5, 4]
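One common way to read hierarchical structure off the resulting dependency tree is to recursively wrap each head with its dependents in sentence order; a sketch with illustrative head indices, since the slide's exact procedure is not recoverable from the figure:

```python
def constituent(heads, words, h):
    """Bracket head h (1-based word index) with its dependents.
    heads[i] is the head index of word i, with 0 standing for ROOT."""
    deps = [i for i in range(1, len(words) + 1) if heads[i] == h]
    parts = [words[h - 1] if j == h else constituent(heads, words, j)
             for j in sorted(deps + [h])]
    return parts[0] if len(parts) == 1 else parts

words = ["the", "boy", "sleeps"]
heads = {1: 2, 2: 3, 3: 0}     # illustrative: the <- boy <- sleeps <- ROOT
root = [i for i in heads if heads[i] == 0][0]
print(constituent(heads, words, root))   # [['the', 'boy'], 'sleeps']
```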
28
Materials
  • Testing: Penn Treebank WSJ10
    • 7,422 parsed sentences
    • Collins head rules
  • Context: WSJ and NANews 1994
    • 10M words, 460K sentences

29
Baselines
  • Random baseline
  • Right-branching baseline
  • Left-branching baseline
  • Sign test for comparisons

[Figure: right- vs. left-branching trees for "I like cheese"; see the
sketch below]
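The two deterministic branching baselines on the slide's example, as a minimal sketch:

```python
def right_branching(words):
    """(I (like cheese)): each word attaches everything to its right."""
    return words[0] if len(words) == 1 else [words[0], right_branching(words[1:])]

def left_branching(words):
    """((I like) cheese): the mirror image."""
    return words[-1] if len(words) == 1 else [left_branching(words[:-1]), words[-1]]

print(right_branching(["I", "like", "cheese"]))  # ['I', ['like', 'cheese']]
print(left_branching(["I", "like", "cheese"]))   # [['I', 'like'], 'cheese']
```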
30
Prior Knowledge
  • Varied:
    • Context
    • Projectivity
    • Prior distributions
  • Constant:
    • No grammatical categories
    • LSA-like semantics

31
(No Transcript)
32
Conclusions
  • Created an unsupervised grammar induction (UGI) model using a
    vector space and minimum spanning tree parsing
  • Local > global context
  • Projectivity or prior distributions help
    • Small additive effect
    • Not enough to beat the left-branching baseline

33
Questions?
  • Andrew Olney
  • Institute for Intelligent Systems
  • University of Memphis
  • aolney@memphis.edu

34
Experiment 2: Results
35
Experiment 2: Method
36
Experiment 2: Method