Context-Free Grammars for English

Provided by: ralphgr
1
Context-Free Grammars for English
2
What are grammars used for in CL?
  • Grammar checking
  • Natural Language Understanding
  • (1) HP fired Carlini.
  • (2) Carlini was fired by HP.
  • (3) The most recent executive to be fired by HP
    was Carlini.
  • (4) Carlini was the most recent executive to be
    fired by HP.
  • (1)-(4) all convey the same basic fact, but you
    need to know the grammatical relationship among
    these sentences to capture that.
  • This sort of info is used in Machine Translation,
    Question-Answering, Information Extraction, etc.

3
Grammars, Recognition and Parsing
4
Sentence Recognition vs. Sentence Parsing
  • Recognizer
  • ?- recognize([the,girl,waved]).
  • yes
  • ?- recognize([waved,girl,the]).
  • no
  • Parser
  • ?- parse([the,girl,waved],Parse).
  • Parse = [det,n,vi]
  • yes

5
A Sentence Grammar vs. A Sentence Parser
  • Grammar - a systematic description of the
    structures that underlie the sentences of a
    language
  • Parser - a program for determining the structures
    for particular sentences
  • capitalizes on the fact that an infinite number
    of sentences can be captured with a finite number
    of structures

6
Grammar vs. Recognizer vs. Parser
  • A Grammar is a declarative description
  • It states the conditions for a string to be
    valid.
  • A Recognizer is a procedural process.
  • It determines whether a particular sentence is
    valid.
  • A Parser is a procedural process.
  • It determines what the structure of a
    particular sentence is.

7
The Chomsky Hierarchy
  • Chomsky (1959) identified four classes of
    grammars in terms of their constraints
  • Unrestricted phrase structure grammars (type 0)
  • Context-sensitive grammars (type 1)
  • Context-free grammars (type 2)
  • Regular grammars (type 3)
  • Higher numbered types are more constrained and so
    can generate a smaller set of languages.

8
Why look at grammar types?
  • More constrained grammars will be easier to write
    and compute with.
  • The question here: what is the most constrained
    grammar that can be used to describe a natural
    language?

9
Automata theory: formal languages and formal
grammars
10
We looked at Regular Grammars in Chapter 2
  • Left regular grammars have rules of the form
  • A → a (A is a non-terminal; a is a terminal)
  • A → Ba
  • A → ε (ε is the empty string)
  • Right regular grammars have rules of the form
  • A → a
  • A → aB
  • A → ε

11
Finite State Grammars
  • Regular grammars are also called finite state
    grammars, which we saw used for morphological
    analysis
  • In a regular grammar, or FSA, the only
    information we need to know to generate or
    recognize a sentence is the state we are in.
  • We do not need to know anything about what we
    have already traversed in order to finish the
    sentence.

12
A Finite State Grammar
[Figure: finite-state transition network with arcs labeled det, adj, n, vi, and Pn/pro]
13
One Configuration of a Finite State Automaton
  • Many buildings collapsed
[Figure: the pointer at state 2 of the network]
  • Pointer begins in an initial state.
  • It moves through a finite set of states.
  • As it moves, it usually
  • scans a new word, and
  • transitions from one state to the next.
  • It ends in a pre-designated final state.

14
A Finite State Grammar-based Recognizer
  • If the recognition procedure can find a det, adj,
    Pn, or pro, it can move from state 1 to
  • state 2 (if det)
  • state 3 (if adj)
  • state 4 (if Pn or pro)

15
Representing the FS Transitions: a Prolog
example
  • Grammar
  • arc(1,det,2).
  • arc(2,adj,3).
  • arc(1,adj,3).
  • arc(1,pro,4).
  • arc(1,pn,4).   % pn is lowercase: a capitalized Pn would be a Prolog variable
  • initial(1).
  • final(6).
  • Lexicon
  • word(the,det).
  • word(girl,n).
  • word(slept,vi).
  • word(she,pro).
  • word(saw,n).
  • word(saw,vi).
  • fstn wksht
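
The arc and word facts above can be driven by a very small recognizer. What follows is a minimal Python sketch rather than the presenter's Prolog program. The slide only lists arcs leaving states 1 and 2, so the three arcs marked as hypothetical are assumptions added here to make final state 6 reachable.

```python
# Arcs taken from the slide: (from_state, category, to_state).
ARCS = [
    (1, "det", 2),
    (2, "adj", 3),
    (1, "adj", 3),
    (1, "pro", 4),
    (1, "pn", 4),
    # Hypothetical arcs (not on the slide) so that final state 6 is reachable.
    (2, "n", 4),
    (3, "n", 4),
    (4, "vi", 6),
]
INITIAL = 1
FINAL = 6

# Lexicon from the slide; a word may carry several categories ("saw").
LEXICON = {
    "the": {"det"},
    "girl": {"n"},
    "slept": {"vi"},
    "she": {"pro"},
    "saw": {"n", "vi"},
}


def recognize(words):
    """Return True if the word list drives the network from INITIAL to FINAL."""
    states = {INITIAL}  # every state reachable so far (copes with ambiguous words)
    for word in words:
        categories = LEXICON.get(word, set())
        states = {
            to_state
            for (from_state, category, to_state) in ARCS
            if from_state in states and category in categories
        }
        if not states:
            return False  # no arc matches this word
    return FINAL in states
```

Note that the recognizer only ever consults the current set of states, never the words already consumed, which is exactly the finite-state property described earlier.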

16
Problem with FSA/Regular Grammars
  • Cannot handle long-distance dependencies
  • If S1 then S2
  • Either S1 or S2
  • The man who said that S1 is arriving tomorrow.
  • Do not recognize phrasal constituency
  • (Det N is an NP)
  • Chomsky, 1957.

17
Context-free Grammars
18
CFG Rules allow center embedding
  • S → a S b
  • S → ε
  • This grammar generates the language
  • $a^n b^n$, $n \ge 0$
  • which is not regular.
  • Example derivation: S ⇒ a S b ⇒ a a S b b ⇒ a a b b
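
To see why this grammar needs more than a finite-state device, it may help to look at a recognizer for it. The Python sketch below (the function name `matches_s` is made up for illustration) follows the two rules directly: each outermost a must be paired with an outermost b, and the recursion depth acts as the memory that an FSA lacks.

```python
def matches_s(text):
    """Recognize strings generated by S -> a S b | (empty string)."""
    if text == "":
        return True  # rule: S -> empty string
    if len(text) >= 2 and text[0] == "a" and text[-1] == "b":
        return matches_s(text[1:-1])  # rule: S -> a S b
    return False
```

For example, `matches_s("aabb")` succeeds by peeling off one a/b pair per recursive call, while `matches_s("aab")` fails because the pairs do not balance.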

19
Context-Free Grammars
  • Used extensively to describe both formal
    (programming) and natural languages
  • Also referred to as Phrase Structure Grammars
  • Presented with a BNF (Backus Naur Form)
    notation
  • <S> ::= <a> <S> <b>
  • <S> ::= ε
  • BNF is also referred to as Backus Normal Form

20
A CFG allows multiple sub-networks within S
Here the sub-networks are NP and VP
[Figure: the S network, with arcs labeled NP and VP]
21
Transitioning between S and NP
[Figure: the S network (Start, NP, VP, End) and the NP sub-network, which includes an Adj arc]
22
Transitioning between S and VP
[Figure: the S network (NP, VP) and the VP sub-network, with a Vt arc followed by an NP arc]
23
Representing the network for The girl slept.
[Figure: the VP network, with an arc labeled vi]
24
The Grammar for The girl slept.
  • Grammar
  • S network
  • initial(s,1).
  • final(s,3).
  • arc(s,1,np,2).
  • arc(s,2,vp,3).
  • NP network
  • initial(np,1).
  • final(np,3).
  • arc(np,1,det,2).
  • arc(np,2,n,3).
  • VP network
  • initial(vp,1).
  • final(vp,2).
  • arc(vp,1,vi,2).
  • Lexicon
  • word(the,det).
  • word(girl,n).
  • word(slept,vi).
  • Reading arc(s,1,np,2): in the S network, there is
    an arc from state 1 to state 2 labeled np.
  • wksht 1.1
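
One way to traverse these three networks is sketched below in Python. It mirrors the initial, final, arc, and word facts from the slide as plain data; the function names `traverse` and `recognize` are illustrative, not taken from the worksheet. An arc whose label names another network is followed by recursively traversing that network.

```python
# Networks from the slide: name -> initial state, final state, arcs.
NETWORKS = {
    "s": {"initial": 1, "final": 3, "arcs": [(1, "np", 2), (2, "vp", 3)]},
    "np": {"initial": 1, "final": 3, "arcs": [(1, "det", 2), (2, "n", 3)]},
    "vp": {"initial": 1, "final": 2, "arcs": [(1, "vi", 2)]},
}
LEXICON = {"the": "det", "girl": "n", "slept": "vi"}


def traverse(network, state, words):
    """Yield each remainder of `words` left after traversing `network` from `state`."""
    net = NETWORKS[network]
    if state == net["final"]:
        yield words
    for from_state, label, to_state in net["arcs"]:
        if from_state != state:
            continue
        if label in NETWORKS:
            # The arc names a sub-network: traverse it, then resume at to_state.
            for rest in traverse(label, NETWORKS[label]["initial"], words):
                yield from traverse(network, to_state, rest)
        elif words and LEXICON.get(words[0]) == label:
            # The arc names a word category: consume one matching word.
            yield from traverse(network, to_state, words[1:])


def recognize(words):
    """True if the S network can consume the whole word list."""
    start = NETWORKS["s"]["initial"]
    return any(rest == [] for rest in traverse("s", start, words))
```

Here the language's own call stack does the bookkeeping of where to return to. This simple recursion would loop forever on a left-recursive network, which the three networks above are not.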

25
Recursion is possible when arc labels refer to
other networks
[Figure: the NP network (Det, N, PP, and jump arcs) and the PP network (P and NP arcs)]
The fish [PP in [NP the pond [PP in [NP the
mountains]]]] (wksht 1.2)
26
CFG Processing
When the parser finishes processing the NP, it
pops back up to S2.
[Figure: the S network, with arcs labeled NP and VP]
How does the parser know how to do this?
27
Keeping track of where you are
  • A stack keeps track of where the processor should
    pop back to after successfully traversing a
    sub-network
  • When moving to a sub-network, the processor
    pushes the network node to return to onto the
    stack
  • When finished with the sub-network, the processor
    pops the network node from the stack

28
Computer Simulation of this holding process
  String            State   Stack   Comment
  The girl slept    S1      ---     Push to NP-network
  The girl slept    NP1     S2      Recognize Det
  girl slept        NP2     S2      Recognize N
  slept             NP3     S2      Pop to S-network
  slept             S2      ---     Push to VP-network
  slept             VP1     S3      Recognize V
  ---               VP2     S3      Pop to S-network
  ---               S3      ---     success
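
The trace above can be reproduced with an explicit stack. The Python sketch below is one possible simulation, assuming the three networks for "The girl slept" and at most one arc per state (so no backtracking is needed); the function name `trace` and the row layout are choices made here for illustration.

```python
# One outgoing arc per state: state -> (label, next_state).
NETWORKS = {
    "S": {"final": 3, "arcs": {1: ("NP", 2), 2: ("VP", 3)}},
    "NP": {"final": 3, "arcs": {1: ("det", 2), 2: ("n", 3)}},
    "VP": {"final": 2, "arcs": {1: ("vi", 2)}},
}
LEXICON = {"the": "det", "girl": "n", "slept": "vi"}


def trace(words):
    """Return (state, stack, action) rows, or None if recognition fails."""
    words = [w.lower() for w in words]
    net, state, stack, rows = "S", 1, [], []
    while True:
        here = f"{net}{state}"
        shown = [f"{n}{s}" for n, s in stack]  # printable copy of the stack
        if state == NETWORKS[net]["final"]:
            if stack:
                rows.append((here, shown, f"pop to {stack[-1][0]}-network"))
                net, state = stack.pop()  # resume where we left off
                continue
            if words:
                return None  # finished S with words left over
            rows.append((here, shown, "success"))
            return rows
        if state not in NETWORKS[net]["arcs"]:
            return None
        label, target = NETWORKS[net]["arcs"][state]
        if label in NETWORKS:
            rows.append((here, shown, f"push to {label}-network"))
            stack.append((net, target))  # remember the state to return to
            net, state = label, 1
        elif words and LEXICON.get(words[0]) == label:
            rows.append((here, shown, f"recognize {label}"))
            words, state = words[1:], target
        else:
            return None
```

Calling `trace(["The", "girl", "slept"])` visits S1, NP1, NP2, NP3, S2, VP1, VP2, and S3 in that order, pushing S2 before entering the NP network and S3 before entering the VP network.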

29
Complications with CFGs
30
Agreement causes the grammar to expand
considerably
  • S → NPsg VPsg.
  • S → NPpl VPpl.
  • NPsg → detsg nsg.
  • NPpl → detpl npl.
  • VPsg → visg.
  • VPpl → vipl.
  • arc(np,2,n,3). becomes
  • arc(np,2,nsg,3).
  • arc(np,2,npl,3).
  • arc(vp,1,vi,2). becomes
  • arc(vp,1,visg,2).
  • arc(vp,1,vipl,2).
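
The size of the expansion can be made concrete by generating the rules from templates. In the Python sketch below, the `{num}` placeholder notation and the `expand` function are assumptions introduced for illustration; they are not part of the slides. Each template is copied once per agreement value.

```python
# Rule templates: categories written with "{num}" must agree in number.
TEMPLATES = [
    ("S", ["NP{num}", "VP{num}"]),
    ("NP{num}", ["det{num}", "n{num}"]),
    ("VP{num}", ["vi{num}"]),
]
NUMBERS = ["sg", "pl"]


def expand(templates, values):
    """Copy every template once per agreement value."""
    rules = []
    for lhs, rhs in templates:
        for value in values:
            rules.append(
                (lhs.format(num=value), [sym.format(num=value) for sym in rhs])
            )
    return rules
```

With two number values, the three templates become the six rules listed above. Distinguishing person as well (six person/number combinations) would turn the same three templates into eighteen rules.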

31
Subcategorization also causes expansion
  • VP → vi.
  • VP → vt np.
  • VP → vl np.
  • VP → vl adjp.
  • VP → v2t np np.
  • . . .
  • arc(vp,1,vi,2).
  • arc(vp,1,vt,2,np,3).
  • arc(vp,1,vl,2,np,3).
  • arc(vp,1,vl,2,adjp,3).
  • arc(vp,1,v2t,2,np,3,np,4).
  • . . .
  • (COMLEX Manual)

32
Auxiliaries
  • Word order is not a problem
  • might have been being bothered
  • Selection is a problem, with the same consequences
    as agreement and subcategorization
  • have seen
  • *have seeing