1
CPSC 503 Computational Linguistics
  • Lecture 8
  • Giuseppe Carenini

2
Knowledge-Formalisms Map
(figure: map of formalisms to linguistic levels)
  • Morphology, Syntax: State Machines (and prob. versions)
    (Finite State Automata, Finite State Transducers, Markov Models);
    Rule systems (and prob. versions) (e.g., (Prob.) Context-Free Grammars)
  • Semantics: Logical formalisms (First-Order Logics)
  • Pragmatics, Discourse and Dialogue: AI planners
3
Today 4/10
  • The Earley Algorithm
  • Partial Parsing / Chunking

4
Parsing with CFGs
  • Valid parse trees
  • Sequence of words

I prefer a morning flight
Parser
CFG
  • Assign valid trees: each covers all and only the
    elements of the input and has an S at the top

5
Parsing as Search
  • CFG defines a search space of possible parse trees
  • S -> NP VP
  • S -> Aux NP VP
  • NP -> Det Noun
  • VP -> Verb
  • Det -> a
  • Noun -> flight
  • Verb -> left
  • Aux -> do, does
  • Parsing: find all trees that cover all and only
    the words in the input

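A minimal sketch (not from the slides) of how this toy grammar could be encoded in Python; the later Earley sketches assume a grammar in this dict form, with part-of-speech categories listed separately.

GRAMMAR = {
    "S":    [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP":   [["Det", "Noun"]],
    "VP":   [["Verb"]],
    "Det":  [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"]],
    "Aux":  [["do"], ["does"]],
}

# Part-of-speech (pre-terminal) categories rewrite directly to words.
POS_CATEGORIES = {"Det", "Noun", "Verb", "Aux"}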
6
Constraints on Search
  • Sequence of words
  • Valid parse trees

I prefer a morning flight
Parser
CFG (search space)
  • Search Strategies
  • Top-down or goal-directed
  • Bottom-up or data-directed

7
Problems with TD-BU-filtering
  • A typical TD, depth-first, left to right,
    backtracking strategy (with BU filtering) cannot
    deal effectively with
  • Left-Recursion
  • Ambiguity
  • Repeated Parsing
  • SOLUTION: the Earley Algorithm
  • (once again, dynamic programming!)

8
(1) Left-Recursion
  • These rules appear in most English grammars
  • S -> S and S
  • VP -> VP PP
  • NP -> NP PP

9
(2) Structural Ambiguity
Three basic kinds: Attachment / Coordination / NP-bracketing
  • I shot an elephant in my pajamas

10
(3) Repeated Work
  • Parsing is hard, and slow. It's wasteful to redo
    the same work over and over.
  • Consider an attempt to top-down parse the
    following as an NP
  • A flight from Indianapolis to Houston on TWA

11
  • starts from...
  • NP -> Det Nom
  • NP -> NP PP
  • Nom -> Noun
  • fails and backtracks...

(figure: partial parse tree over "flight")
12
  • restarts from...
  • NP -> Det Nom
  • NP -> NP PP
  • Nom -> Noun
  • fails and backtracks...

(figure: partial parse tree over "flight")
13
  • restarts from...
  • fails and backtracks...

(figure: partial parse tree over "flight")
14
  • restarts from...
  • Success!

15
  • But...

(figure: the successful parse, annotated with how many times each sub-tree
was re-built during backtracking: 4, 3, 2, 1)

16
Dynamic Programming
  • Fills tables with solutions to subproblems

Parsing: sub-trees consistent with the input,
once discovered, are stored and can be reused
  • Does not fall prey to left-recursion
  • Stores ambiguous parses compactly
  • Does not do (avoidable) repeated work

17
Earley Parsing: O(N³)
  • Fills a table in a single sweep over the input
    words
  • Table is of length N+1, where N is the number of words
  • Table entries represent
  • Predicted constituents
  • In-progress constituents
  • Completed constituents and their locations

18
States
  • The table entries are called states and express:
  • what is predicted from that point
  • what has been recognized up to that point
  • Representation: dotted rule + location
    (a code sketch of this representation follows below)
  • S -> • VP [0,0]  A VP is predicted at the
    start of the sentence
  • NP -> Det • Nominal [1,2]  An NP is in progress;
    the Det goes from 1 to 2
  • VP -> V NP • [0,3]  A VP has been found
    starting at 0 and ending at 3

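One possible way to represent these dotted-rule states and the chart in code (a sketch under the assumption of a Python implementation; this is not the course's code):

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    """A dotted rule plus its span: lhs -> rhs[:dot] • rhs[dot:]  [start, end]."""
    lhs: str     # e.g. "VP"
    rhs: tuple   # e.g. ("V", "NP")
    dot: int     # position of the dot within rhs
    start: int   # where the constituent starts in the input
    end: int     # how far the recognized part extends

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.is_complete() else self.rhs[self.dot]

def make_chart(n_words):
    """The chart: N+1 state sets, one per position between (and around) the words."""
    return [[] for _ in range(n_words + 1)]

# The three example states from this slide:
s1 = State("S",  ("VP",),            dot=0, start=0, end=0)  # S  -> • VP          [0,0]
s2 = State("NP", ("Det", "Nominal"), dot=1, start=1, end=2)  # NP -> Det • Nominal [1,2]
s3 = State("VP", ("V", "NP"),        dot=2, start=0, end=3)  # VP -> V NP •        [0,3]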
19
Graphically
S -> • VP [0,0]    NP -> Det • Nominal [1,2]    VP -> V NP • [0,3]
20
Earley answer
  • Answer found by looking in the table in the right
    place.
  • The following state should be in the final column

S -> α • [0,n]
  • i.e., an S state that spans from 0 to n and is
    complete.

21
Earley Parsing Procedure
  • So sweep through the table from 0 to n in order,
    applying one of three operators to each state:
  • predictor: add top-down predictions to the chart
  • scanner: read input and add corresponding states
    to the chart
  • completer: move the dot to the right when a new
    constituent is found
  • Results (new states) are added to the current or next set
    of states in the chart
  • No backtracking and no states are removed
    (a sketch of this sweep is given below)

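A high-level sketch of that sweep (hedged: it reuses the State and make_chart sketch above, the predictor / scanner / completer sketches given after the Completer slide, and a GRAMMAR-style dict plus POS category set as in the earlier grammar sketch; names are illustrative):

def enqueue(state, chart, position):
    """Add a state to a chart entry unless an identical state is already there."""
    if state not in chart[position]:
        chart[position].append(state)

def earley_recognize(words, grammar, pos_categories):
    """One left-to-right sweep over the chart; no backtracking, no states removed."""
    chart = make_chart(len(words))
    enqueue(State("GAMMA", ("S",), 0, 0, 0), chart, 0)   # dummy start state

    for i in range(len(words) + 1):
        for state in chart[i]:                           # chart[i] may grow as we iterate
            if state.is_complete():
                completer(state, chart)                  # a constituent was found
            elif state.next_symbol() in pos_categories:
                scanner(state, words, chart, grammar)    # read the next input word
            else:
                predictor(state, chart, grammar)         # add top-down predictions

    # Success: a complete S spanning the whole input sits in the last column.
    return any(s.lhs == "S" and s.is_complete()
               and s.start == 0 and s.end == len(words)
               for s in chart[-1])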
22
Predictor
  • Intuition: new states represent top-down
    expectations
  • Applied when a non-part-of-speech non-terminal is
    to the right of the dot
  • S -> • VP [0,0]
  • Adds new states to the end of the current chart entry
  • One new state for each expansion of the
    non-terminal in the grammar
  • VP -> • V [0,0]
  • VP -> • V NP [0,0]

23
Scanner (part of speech)
  • New states for predicted parts of speech
  • Applicable when a part of speech is to the right of
    the dot
  • VP -> • Verb NP [0,0]  (0 "Book..." 1)
  • Looks at the current word in the input
  • If it matches, adds state(s) to the next chart entry
  • Verb -> book • [0,1]

24
Completer
  • Intuition: we've found a constituent, so tell
    everyone waiting for this
  • Applied when the dot has reached the right end of the rule
  • NP -> Det Nom • [1,3]
  • Find all states with the dot at 1 that are expecting an NP
  • VP -> V • NP [0,1]
  • Adds new (completed) state(s) to the current chart entry
  • VP -> V NP • [0,3]
    (sketches of all three operators are given below)

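Sketches of the three operators described on the last three slides (assumptions: the State, make_chart, and enqueue helpers from the earlier sketches, and a grammar dict mapping each category to its expansions; a full implementation would need more care with part-of-speech handling):

def predictor(state, chart, grammar):
    """Expand the predicted non-terminal: one new state per grammar rule for it,
    added to the end of the current chart entry."""
    nonterminal, position = state.next_symbol(), state.end
    for rhs in grammar.get(nonterminal, []):
        enqueue(State(nonterminal, tuple(rhs), 0, position, position), chart, position)

def scanner(state, words, chart, grammar):
    """If the predicted part of speech matches the current word, add a completed
    POS state (e.g. Verb -> book •) to the NEXT chart entry."""
    pos, position = state.next_symbol(), state.end
    if position < len(words) and [words[position]] in grammar.get(pos, []):
        enqueue(State(pos, (words[position],), 1, position, position + 1),
                chart, position + 1)

def completer(state, chart):
    """A constituent state.lhs spanning [start, end] is complete: advance every
    state that was waiting for it at position state.start."""
    for waiting in chart[state.start]:
        if not waiting.is_complete() and waiting.next_symbol() == state.lhs:
            enqueue(State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                          waiting.start, state.end),
                    chart, state.end)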
25
Example
Book that flight
  • We should find an S from 0 to 3 that is a
    completed state

26
Example
Book that flight
27
So far only a recognizer
  • To generate all parses:
  • When old states waiting for the just-completed
    constituent are updated => add a pointer from
    each updated state to the completed one

Chart [0]
  ...
  S5   S  -> • VP        [0,0]
  S6   VP -> • Verb      [0,0]
  S7   VP -> • Verb NP   [0,0]
  ...
Chart [1]
  S8   Verb -> book •    [0,1]
  S9   VP -> Verb •      [0,1]   [S8]
  S10  S  -> VP •        [0,1]   [S9]
  S11  VP -> Verb • NP   [0,1]   ??
  ...
  • Then simply read off all the backpointers from
    every complete S in the last column of the table

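One way to realize the back-pointer idea in the sketches above (an illustration under stated assumptions, not the course code): give each State an extra pointers field (a tuple, default empty, ideally kept out of state identity), have the completer record the completing state, and read trees off recursively.

def completer_with_pointers(state, chart):
    """Like completer(), but each advanced state remembers which completed
    constituent advanced it (assumes State has a `pointers` tuple field)."""
    for waiting in chart[state.start]:
        if not waiting.is_complete() and waiting.next_symbol() == state.lhs:
            advanced = State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                             waiting.start, state.end,
                             pointers=waiting.pointers + (state,))
            enqueue(advanced, chart, state.end)

def build_tree(state):
    """Read a parse tree off the back-pointers of a completed state."""
    children = [build_tree(child) for child in state.pointers]
    return (state.lhs, children if children else list(state.rhs))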
28
Error Handling
  • What happens when we look at the contents of the
    last table column and don't find an S -> α • [0,n]
    state?
  • Is it a total loss? No...
  • Chart contains every constituent and combination
    of constituents possible for the input given the
    grammar
  • Also useful for partial parsing or shallow
    parsing used in information extraction

29
Earley and Left Recursion
  • So Earley solves the left-recursion problem
    without having to alter the grammar or
    artificially limiting the search.
  • Never place a state into the chart that's already
    there
  • Copy states before advancing them

30
Earley and Left Recursion 1
  • S -> NP VP
  • NP -> NP PP
  • The first rule predicts
  • S -> • NP VP [0,0], which adds
  • NP -> • NP PP [0,0]
  • and stops there, since adding any subsequent
    prediction would be fruitless

31
Earley and Left Recursion 2
  • When a state gets advanced, make a copy and leave
    the original alone
  • Say we have NP -> • NP PP [0,0]
  • We find an NP from 0 to 2, so we create
  • NP -> NP • PP [0,2]
  • But we leave the original state as is

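In the earlier sketches those two points correspond to enqueue() refusing duplicates and to the operators always constructing a new State rather than mutating the old one; a small illustration (it assumes the State / make_chart / enqueue sketches above):

left_rec = State("NP", ("NP", "PP"), dot=0, start=0, end=0)   # NP -> • NP PP [0,0]
chart = make_chart(5)
enqueue(left_rec, chart, 0)
enqueue(left_rec, chart, 0)       # re-predicting it is a no-op: already in chart[0]
assert len(chart[0]) == 1

# Advancing copies rather than mutates: the original predicted state stays put.
advanced = State("NP", ("NP", "PP"), dot=1, start=0, end=2)   # NP -> NP • PP [0,2]
enqueue(advanced, chart, 2)
assert left_rec in chart[0]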
32
Dynamic Programming Approaches
  • Earley
  • Top-down, no filtering, no restriction on grammar
    form
  • CKY
  • Bottom-up, no filtering, grammars restricted to
    Chomsky Normal Form (CNF) (i.e., ε-free and each
    production either A -> B C or A -> a)

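A small illustration of the CNF restriction (the example rules are assumptions, not from the slides): binary rules over non-terminals and unary rules to a terminal are allowed; anything longer has to be binarized with fresh non-terminals.

CNF_OK = {
    "S":    [["NP", "VP"]],     # A -> B C
    "Noun": [["flight"]],       # A -> a
}
# S -> Aux NP VP is not in CNF; one standard fix introduces a new non-terminal:
BINARIZED = {
    "S":     [["Aux", "NP_VP"]],
    "NP_VP": [["NP", "VP"]],
}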
33
Today 4/10
  • The Earley Algorithm
  • Partial Parsing / Chunking

34
Chunking
  • Classify only basic non-recursive phrases (NP,
    VP, AP, PP)
  • Find non-overlapping chunks
  • Assign labels to chunks
  • A chunk typically includes the headword and pre-head
    material
  • [NP The HD box] that [NP you] [VP ordered] [PP
    from] [NP Shaw] [VP never arrived]

35
Approaches to Chunking (1) Finite-State
Rule-Based
  • Set of hand-crafted rules (no recursion!), e.g.,
    NP -> (Det) Noun Noun
  • Implemented as FSTs (unioned / determinized / minimized)
  • F-measure: 85-92
  • To build tree-like structures, several FSTs can be
    combined [Abney 96]

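A toy illustration of the finite-state idea (a sketch with assumed tag names, not Abney's system): a single hand-written NP pattern, in the spirit of the rule above but allowing any number of nouns, matched over a POS-tag sequence with a regular expression.

import re

# Roughly NP -> (Det) Noun* Noun over POS tags (illustrative tag names).
NP_PATTERN = re.compile(r"\b(DET )?(NOUN )*NOUN\b")

def np_chunks(tags):
    """Return (start, end) tag-index spans of non-overlapping NP chunks."""
    tag_string = " ".join(tags)
    spans = []
    for match in NP_PATTERN.finditer(tag_string):
        start = tag_string[:match.start()].count(" ")
        end = start + match.group(0).count(" ") + 1
        spans.append((start, end))
    return spans

print(np_chunks(["DET", "NOUN", "NOUN", "VERB"]))   # [(0, 3)], i.e. "the HD box"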
36
Approaches to Chunking (1) Finite-State
Rule-Based
  • several FSTs can be combined

37
Approaches to Chunking (2) Machine Learning
  • A case of sequential classification
  • IOB tagging: (I) internal, (O) outside, (B)
    beginning
  • Internal and Beginning tags for each chunk type =>
    size of tagset is (2n + 1), where n is the number of
    chunk types (see the sketch below)
  • Find an annotated corpus
  • Select a feature set
  • Select and train a classifier

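For example, with the formula above and three chunk types there are 2*3 + 1 = 7 tags. A small sketch (the sentence and chunking follow the earlier slide; the relativizer "that" is assumed to fall outside any chunk, and the B-/I-/O tag names are the usual convention):

CHUNK_TYPES = ["NP", "VP", "PP"]                 # n = 3
TAGSET = ["O"] + [f"{p}-{t}" for t in CHUNK_TYPES for p in ("B", "I")]
assert len(TAGSET) == 2 * len(CHUNK_TYPES) + 1   # 2n + 1 = 7

# IOB encoding of "The HD box that you ordered from Shaw never arrived"
iob = [
    ("The", "B-NP"), ("HD", "I-NP"), ("box", "I-NP"),
    ("that", "O"),
    ("you", "B-NP"),
    ("ordered", "B-VP"),
    ("from", "B-PP"),
    ("Shaw", "B-NP"),
    ("never", "B-VP"), ("arrived", "I-VP"),
]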
38
Context window approach
  • Typical features
  • Current / previous / following words
  • Current / previous / following POS
  • Previous chunks

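A sketch of what the feature vector for one token might look like under this approach (feature names and padding symbols are illustrative assumptions):

def context_features(words, pos_tags, chunk_tags_so_far, i):
    """Features for classifying token i: surrounding words, POS tags, and the
    chunk tags already predicted for the preceding tokens."""
    def pad(seq, j, default):
        return seq[j] if 0 <= j < len(seq) else default
    return {
        "word-1": pad(words, i - 1, "<s>"),    "word0": words[i],    "word+1": pad(words, i + 1, "</s>"),
        "pos-1":  pad(pos_tags, i - 1, "<s>"), "pos0": pos_tags[i],  "pos+1": pad(pos_tags, i + 1, "</s>"),
        "chunk-1": pad(chunk_tags_so_far, i - 1, "<s>"),
        "chunk-2": pad(chunk_tags_so_far, i - 2, "<s>"),
    }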
39
Context window approach
  • The specific choice of machine learning approach does
    not seem to matter
  • F-measure in the 92-94 range
  • Common causes of errors:
  • POS tagger inaccuracies
  • Inconsistencies in the training corpus
  • Ambiguities involving conjunctions (e.g., "late
    arrivals and cancellations/departures are common
    in winter")

40
For Next Time
  • Read Chapter 14 (Probabilistic CFG and Parsing)
  • Speaker: Lydia Kavraki, Professor, Rice University
    Time: 3:30 - 4:50 pm
    Venue: Hugh Dempster Pavilion, 6245 Agronomy Rd., Room 310
    Title: Motion Planning for Physical Systems