Probabilistic Parsing - PowerPoint PPT Presentation

Slides: 18
Provided by: Har134

Transcript and Presenter's Notes



1
Probabilistic Parsing
  • and some other approaches

2
Probabilistic CFGs
  • also known as Stochastic Grammars
  • Date back to Booth (1969)
  • Have grown in popularity with the growth of
    Corpus Linguistics

3
Probabilistic CFGs
  • Essentially the same as ordinary CFGs except that
    each rule has a probability associated with it

S → NP VP        .80
S → aux NP VP    .15
S → VP           .05
NP → det n       .20
NP → det adj n   .35
NP → n           .20
NP → adj n       .15
NP → pro         .10
  • Notice that P for each set of rules with the same
    left-hand side sums to 1
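
A minimal Python sketch of the grammar on this slide, stored as a dictionary from rules to probabilities, with a check that the probabilities for each left-hand side sum to 1. The names PCFG, lhs_totals and is_proper are illustrative and not part of the slides.

```python
import math
from collections import defaultdict

# Rules from the slide: (left-hand side, right-hand side) -> probability
PCFG = {
    ("S", ("NP", "VP")): 0.80,
    ("S", ("aux", "NP", "VP")): 0.15,
    ("S", ("VP",)): 0.05,
    ("NP", ("det", "n")): 0.20,
    ("NP", ("det", "adj", "n")): 0.35,
    ("NP", ("n",)): 0.20,
    ("NP", ("adj", "n")): 0.15,
    ("NP", ("pro",)): 0.10,
}

def lhs_totals(grammar):
    """Sum the rule probabilities separately for each left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in grammar.items():
        totals[lhs] += p
    return dict(totals)

def is_proper(grammar, tol=1e-9):
    """True if every left-hand side's probabilities sum to 1 (within tol)."""
    return all(math.isclose(total, 1.0, abs_tol=tol)
               for total in lhs_totals(grammar).values())

print(is_proper(PCFG))
```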

4
Probabilistic CFGs
  • Probabilities are used to calculate the
    probability of a given derivation
  • Defined as the product of the Ps of the rules
    used in the derivation
  • Can be used to choose between competing
    derivations
  • As the parse progresses, as an efficiency measure
    (to determine which rules to try first)
  • Or at the end, as a way of disambiguating, or
    expressing confidence in the results
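
The product rule can be sketched in Python as follows. The S and NP probabilities are the ones given on the earlier slide; the VP rule and its probability are invented for illustration, and the two candidate derivations are made up to stand in for competing analyses of one sentence.

```python
import math

RULE_P = {
    "S -> NP VP": 0.80,
    "NP -> pro": 0.10,
    "NP -> det n": 0.20,
    "NP -> det adj n": 0.35,
    "VP -> v NP": 0.50,  # hypothetical: the slides give no VP rules
}

def derivation_probability(rules_used, rule_p=RULE_P):
    """Probability of a derivation: the product of its rules' probabilities."""
    return math.prod(rule_p[rule] for rule in rules_used)

def best_derivation(candidates, rule_p=RULE_P):
    """Choose the candidate derivation with the highest probability."""
    return max(candidates, key=lambda d: derivation_probability(d, rule_p))

# Two made-up candidate derivations, each listed as the rules it uses
d1 = ["S -> NP VP", "NP -> pro", "VP -> v NP", "NP -> det n"]
d2 = ["S -> NP VP", "NP -> pro", "VP -> v NP", "NP -> det adj n"]

print(derivation_probability(d1), derivation_probability(d2))
print(best_derivation([d1, d2]) is d2)
```

The same function could be called on partial derivations during parsing to rank which rule to try next, or once at the end to rank complete parses.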

5
Where do the probabilities come from?
  • Use a corpus of already parsed sentences: a
    treebank
  • Best known example is the Penn Treebank
  • Marcus et al. 1993
  • Available from Linguistic Data Consortium
  • Based on Brown corpus, 1m words of Wall Street
    Journal, Switchboard corpus
  • Count all occurrences of each variant of a rule
    (e.g. each NP expansion) and divide by the total
    number of occurrences of NP rules
  • Very laborious, so of course is done automatically
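
The counting step can be sketched in Python. This assumes trees are stored as nested tuples of the form (label, child, ...) with category names as leaves; that representation, the two toy trees and the function names are all illustrative and are not the Penn Treebank format.

```python
from collections import Counter

def rules_in(tree):
    """Yield (lhs, rhs) for every internal node of a nested-tuple tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules_in(child)

def estimate(treebank):
    """Relative frequency: count(rule) / count(all rules with the same lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules_in(tree))
    lhs_counts = Counter()
    for (lhs, _rhs), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Toy treebank of two parsed sentences (made up for illustration)
treebank = [
    ("S", ("NP", "pro"), ("VP", "v", ("NP", "det", "n"))),
    ("S", ("NP", "det", "n"), ("VP", "v")),
]
probs = estimate(treebank)
for rule, p in sorted(probs.items()):
    print(rule, round(p, 3))
```

Here NP is expanded three times in total, twice as det n and once as pro, so the estimates for those two rules are 2/3 and 1/3.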

6
Where do the probabilities come from?
  • Create your own treebank
  • Easy if all sentences are unambiguous: just count
    the (successful) rule applications
  • When there are ambiguities, rules which
    contribute to the ambiguity have to be counted
    separately and weighted
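
One possible reading of "counted separately and weighted" is sketched below in Python. It assumes, purely for illustration, that each of the k parses of an ambiguous sentence contributes a weight of 1/k to the counts of the rules it uses; the slides do not specify the weighting scheme, and the function name and toy data are made up.

```python
from collections import Counter

def weighted_rule_counts(parsed_sentences):
    """Count rule uses, splitting each sentence's weight evenly over its parses.

    parsed_sentences: one entry per sentence, each a list of alternative
    parses, each parse a list of the rules it uses.
    """
    counts = Counter()
    for parses in parsed_sentences:
        weight = 1.0 / len(parses)  # assumption: uniform weight per parse
        for rules in parses:
            for rule in rules:
                counts[rule] += weight
    return counts

corpus = [
    # unambiguous sentence: one parse, full weight
    [["S -> NP VP", "NP -> pro"]],
    # ambiguous sentence: two parses, half weight each
    [["S -> NP VP", "NP -> det n"], ["S -> NP VP", "NP -> n"]],
]
counts = weighted_rule_counts(corpus)
print(counts["S -> NP VP"], counts["NP -> pro"], counts["NP -> det n"])
# prints: 2.0 1.0 0.5
```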

7
Where do the probabilities come from?
  • Learn them as you go along
  • Again, assumes some way of identifying the
    correct parse in case of ambiguity
  • Each time a rule is successfully used, its
    probability is adjusted
  • You have to start with some estimated
    probabilities, e.g. all equal
  • Does need human intervention, otherwise rules
    become self-fulfilling prophecies
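
A rough Python sketch of learning as you go, assuming the probabilities are kept as running counts that start out equal and are renormalised per left-hand side. The class name OnlinePCFG, its methods and the initial count of 1 are assumptions made for illustration.

```python
class OnlinePCFG:
    """Rule probabilities kept as counts and updated after each confirmed use."""

    def __init__(self, rules, initial_count=1.0):
        # Equal starting counts give equal starting probabilities per lhs
        self.counts = {rule: initial_count for rule in rules}

    def probability(self, rule):
        """Current estimate: this rule's count over all counts for its lhs."""
        lhs = rule[0]
        total = sum(n for r, n in self.counts.items() if r[0] == lhs)
        return self.counts[rule] / total

    def record_use(self, rule):
        """Call only when the rule was used in a parse judged to be correct."""
        self.counts[rule] += 1.0

np_pro = ("NP", ("pro",))
np_det_n = ("NP", ("det", "n"))
np_n = ("NP", ("n",))

grammar = OnlinePCFG([np_pro, np_det_n, np_n])
start = grammar.probability(np_pro)   # 1/3: all NP rules start equal
grammar.record_use(np_pro)
grammar.record_use(np_pro)
after = grammar.probability(np_pro)   # 3/5 after two confirmed uses
print(start, after)
```

If record_use were called on every parse the system itself preferred, with no outside check, frequently chosen rules would only become more probable, which is the self-fulfilling effect the slide warns about.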

8
Problems with PCFGs
  • PCFGs assume that all rules are essentially
    independent
  • But, e.g. in English NP → pro is more likely when
    in subject position
  • Difficult to incorporate lexical information
  • Pre-terminal rules can inherit important
    information from words which help to make choices
    higher up the parse, e.g. lexical choice can help
    determine PP attachment

9
Probabilistic Lexicalised CFGs
  • One solution is to identify in each rule that one
    of the elements on the RHS (daughter) is more
    important: the head
  • This is quite intuitive, e.g. the n in an NP
    rule, though often controversial (from linguistic
    point of view)
  • Head must be a lexical item
  • Head value is percolated up the parse tree
  • Added advantage is that PS tree has the feel of a
    dependency tree
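
Head percolation can be sketched in Python as below. The tree format (nested tuples with (category, word) pre-terminals), the head table HEAD_CHILD and the example sentence are assumptions for illustration; which daughter counts as the head is exactly the controversial part mentioned above.

```python
# Hypothetical head table: for each phrase label, the category of its head daughter
HEAD_CHILD = {"S": "VP", "VP": "v", "NP": "n"}

def lexicalise(tree):
    """Return (label, head_word, children) with head words percolated upward."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # Pre-terminal such as ("n", "dog"): the word is its own head
        return (label, children[0], [])
    kids = [lexicalise(child) for child in children]
    # The head word of a phrase is the head word of its head daughter
    head = next(k[1] for k in kids if k[0] == HEAD_CHILD[label])
    return (label, head, kids)

tree = ("S",
        ("NP", ("det", "the"), ("n", "dog")),
        ("VP", ("v", "barks")))

label, head, kids = lexicalise(tree)
print(label, head)           # prints: S barks
print(kids[0][0], kids[0][1])  # prints: NP dog
```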

10
(No Transcript)
11
Dependency Parsing
  • Not much different from PSG parsing
  • Grammar rules still need to be stated as A → B c
  • except that one daughter is identified as the
    head, e.g. A → x h y
  • As structure is built, the trees are headed by
    h rather than A
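
A minimal Python sketch of building structure headed by h rather than A: each phrase is replaced by its head word, and every non-head daughter becomes a dependent of that word. The head table, the tree format and the example sentence are assumptions made for illustration.

```python
# Hypothetical head table: for each phrase label, the category of its head daughter
HEAD_CHILD = {"S": "VP", "VP": "v", "NP": "n"}

def dependencies(tree, arcs=None):
    """Return (head_word, arcs), where arcs is a list of (head, dependent) pairs."""
    if arcs is None:
        arcs = []
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return children[0], arcs  # pre-terminal: the word itself
    child_heads = [dependencies(child, arcs)[0] for child in children]
    head_index = next(i for i, child in enumerate(children)
                      if child[0] == HEAD_CHILD[label])
    head = child_heads[head_index]
    # Every non-head daughter's head word depends on this phrase's head word
    for i, word in enumerate(child_heads):
        if i != head_index:
            arcs.append((head, word))
    return head, arcs

tree = ("S",
        ("NP", ("det", "the"), ("n", "dog")),
        ("VP", ("v", "chased"),
               ("NP", ("det", "a"), ("n", "cat"))))

root, arcs = dependencies(tree)
print(root)  # prints: chased
print(arcs)
```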

12
Dependency grammar
  • Interest postdates PSG in CL circles
  • But dependency approach predates PSG
  • Tesnière, Helbig & Schenkel, Panini, ancient
    Greece

13
Some dependency formalisms
  • Constraint grammar (Karlsson)
  • Slot Grammar (McCord)
  • Link Grammars (Sleator & Temperley)
  • no name (Järvinen & Tapanainen)

14
Categorial grammars
  • Ironically named, because they do away with
    traditional categories
  • Lexicon contains syntactic and semantic
    information
  • No grammar as such, just combinatory rules
  • Categories are of two types: functors and
    arguments

15
Functors and arguments
  • Arguments have simple categories (taken from a
    small set of possible categories)
  • Functors are expressed as combinations of
    arguments
  • Two operators, X/Y and X\Y, express the
    possibility of combination

16
Combination operators
  • X/Y is something which combines with a Y to its
    right to form an X
  • e.g. a determiner is an NP/N
  • a transitive verb is a VP/NP
  • X\Y is something which combines with a Y to its
    left to form an X
  • These can be combined
  • e.g. a ditransitive verb as a (VP/NP)/NP
  • a VP is an S\NP
  • Parsing consists of applying combination rules,
    e.g. X/Y Y → X
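
The two application rules can be sketched in Python, using the notation on this slide (X/Y looks for a Y to its right, X\Y looks for a Y to its left). Categories are plain strings or (result, slash, argument) tuples; this encoding, the helper names and the small lexicon are assumptions for illustration.

```python
def fwd(result, arg):
    """X/Y: combines with a Y to its right to form an X."""
    return (result, "/", arg)

def bwd(result, arg):
    """X\\Y: combines with a Y to its left to form an X."""
    return (result, "\\", arg)

def combine(left, right):
    """Apply forward or backward application to two adjacent categories."""
    # Forward application:  X/Y  Y  ->  X
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    # Backward application:  Y  X\Y  ->  X
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None  # the two categories do not combine

VP = bwd("S", "NP")            # a VP is an S\NP
LEXICON = {
    "the": fwd("NP", "N"),     # a determiner is an NP/N
    "dog": "N",
    "cat": "N",
    "chased": fwd(VP, "NP"),   # a transitive verb is a VP/NP
}

subject = combine(LEXICON["the"], LEXICON["dog"])   # NP
obj = combine(LEXICON["the"], LEXICON["cat"])       # NP
verb_phrase = combine(LEXICON["chased"], obj)       # S\NP
sentence = combine(subject, verb_phrase)
print(sentence)  # prints: S
```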

17
Conclusion
  • Basic parsing approaches (without constraints)
    not practical in real applications
  • Whatever approach taken, bear in mind that the
    lexicon is the real bottleneck
  • There's a real trade-off between coverage and
    efficiency, so it's a good idea to sacrifice
    broad coverage (e.g. domain-specific parsers,
    controlled language), or use a scheme that
    minimizes the disadvantages (e.g. probabilistic
    parsing)