For Friday - PowerPoint PPT Presentation

Provided by: MaryE117
Transcript and Presenter's Notes

Title: For Friday


1
For Friday
  • Finish chapter 23
  • Homework
  • Chapter 22, exercise 9

2
Program 5
  • Any questions?

3
Syntactic Parsing
  • Given a string of words, determine if it is
    grammatical, i.e. if it can be derived from a
    particular grammar.
  • The derivation itself may also be of interest.
  • Normally want to determine all possible parse
    trees and then use semantics and pragmatics to
    eliminate spurious parses and build a semantic
    representation.

4
Parsing Complexity
  • Problem: Many sentences have many parses.
  • An English sentence with n prepositional phrases
    at the end has at least 2^n parses.
  • I saw the man on the hill with a telescope on
    Tuesday in Austin...
  • The actual number of parses is given by the
    Catalan numbers
  • 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796...
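
The sequence above can be reproduced from the closed form C(k) = (2k choose k) / (k + 1). The sketch below is only a check on the listed values; the function name catalan is illustrative, and how k lines up with the number of prepositional phrases depends on the grammar, which the slide does not state.

```python
from math import comb  # available in Python 3.8+


def catalan(k: int) -> int:
    """Return the k-th Catalan number, C(2k, k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)


# Print the values for k = 1..10 to compare against the list on the slide.
print([catalan(k) for k in range(1, 11)])
```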

5
Parsing Algorithms
  • Top Down: Search the space of possible
    derivations of S (e.g. depth-first) for one that
    matches the input sentence.
  • I saw the man.
  • S -> NP VP
  • NP -> Det Adj N
  • Det -> the
  • Det -> a
  • Det -> an
  • NP -> ProN
  • ProN -> I
  • VP -> V NP
  • V -> hit
  • V -> took
  • V -> saw
  • NP -> Det Adj N
  • Det -> the
  • Adj -> ε
  • N -> man
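
The depth-first, top-down search on this slide can be sketched as a small backtracking parser. This is a minimal illustration, not the course's implementation: the names GRAMMAR, expand, and parse are hypothetical, the "Adj" rule is read as an empty (ε) expansion, and the sentence is given without its final period.

```python
# Grammar taken from the rules listed on the slide; anything that is not
# a key of GRAMMAR is treated as a terminal (a word).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Adj", "N"], ["ProN"]],
    "VP": [["V", "NP"]],
    "Det": [["the"], ["a"], ["an"]],
    "Adj": [[]],  # assumption: Adj may expand to nothing (ε)
    "N": [["man"]],
    "ProN": [["I"]],
    "V": [["hit"], ["took"], ["saw"]],
}


def expand(symbols, words, pos):
    """Yield (trees, end) for each way `symbols` derives words[pos:end]."""
    if not symbols:
        yield [], pos
        return
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:  # terminal: must match the next input word
        if pos < len(words) and words[pos] == first:
            for trees, end in expand(rest, words, pos + 1):
                yield [first] + trees, end
        return
    for rhs in GRAMMAR[first]:  # try each rule in order, backtracking on failure
        for children, mid in expand(rhs, words, pos):
            for trees, end in expand(rest, words, mid):
                yield [(first, children)] + trees, end


def parse(sentence):
    """Return every parse tree of `sentence` rooted at S."""
    words = sentence.split()
    return [trees[0] for trees, end in expand(["S"], words, 0) if end == len(words)]


print(parse("I saw the man"))
```

The search fails on NP -> Det Adj N for "I" (after trying the, a, an), falls back to NP -> ProN, and then succeeds, which mirrors the order of rules shown above. A grammar with left-recursive rules would make this naive search loop forever.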
6
Parsing Algorithms (cont.)
  • Bottom Up: Search upward from words, finding
    larger and larger phrases until a sentence is
    found.
  • I saw the man.
  • ProN saw the man    (ProN -> I)
  • NP saw the man      (NP -> ProN)
  • NP N the man        (N -> saw; dead end)
  • NP V the man        (V -> saw)
  • NP V Det man        (Det -> the)
  • NP V Det Adj man    (Adj -> ε)
  • NP V Det Adj N      (N -> man)
  • NP V NP             (NP -> Det Adj N)
  • NP VP               (VP -> V NP)
  • S                   (S -> NP VP)

7
Bottom-up Parsing Algorithm
  • function BOTTOM-UP-PARSE(words, grammar) returns
    a parse tree
  • forest <- words
  • loop do
  •   if LENGTH(forest) = 1 and
      CATEGORY(forest[1]) = START(grammar) then
  •     return forest[1]
  •   else
  •     i <- choose from 1...LENGTH(forest)
  •     rule <- choose from RULES(grammar)
  •     n <- LENGTH(RULE-RHS(rule))
  •     subsequence <- SUBSEQUENCE(forest, i, i+n-1)
  •     if MATCH(subsequence, RULE-RHS(rule)) then
  •       forest[i...i+n-1] <- MAKE-NODE(RULE-LHS(rule),
          subsequence)
  •     else fail
  • end
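
The pseudocode above is nondeterministic: "choose" means some choice leads to success. One way to make it runnable is to try every choice and backtrack on failure, as in the hedged sketch below. The names RULES, category, and bottom_up_parse are illustrative, and the grammar is a simplified version of the earlier example with the empty Adj rule dropped (an assumption made here, since a rule with an empty right-hand side would match at every position and the search would never end).

```python
# (LHS, RHS) pairs. Assumption: NP -> Det N replaces NP -> Det Adj N so
# that no rule has an empty right-hand side.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("ProN",)),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("ProN", ("I",)),
    ("N", ("saw",)),  # tried first; leads to a dead end and is backtracked
    ("V", ("saw",)),
    ("N", ("man",)),
    ("Det", ("the",)),
]


def category(node):
    """A bare word is its own category; a tree node is (category, children)."""
    return node if isinstance(node, str) else node[0]


def bottom_up_parse(forest, start="S"):
    """Return one parse tree for `forest`, or None if every choice fails."""
    forest = tuple(forest)
    if len(forest) == 1 and category(forest[0]) == start:
        return forest[0]
    for i in range(len(forest)):          # "choose" a position
        for lhs, rhs in RULES:            # "choose" a rule
            n = len(rhs)
            subsequence = forest[i:i + n]
            if tuple(category(x) for x in subsequence) == rhs:
                node = (lhs, subsequence)
                result = bottom_up_parse(forest[:i] + (node,) + forest[i + n:], start)
                if result is not None:
                    return result
    return None                           # "fail": no choice worked from here


print(bottom_up_parse("I saw the man".split()))
```

This exhaustive search revisits the same forests many times, so it is only practical for tiny inputs.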

8
Augmented Grammars
  • Simple CFGs are generally insufficient; e.g.,
    they accept "The dogs bites the girl."
  • Could deal with this by adding rules.
  • What's the problem with that approach?
  • Could also augment the rules: add constraints
    to the rules that say number and person must
    match.
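
A number-agreement constraint of this kind can be sketched as follows. This is a toy illustration under stated assumptions: the tiny lexicon, the sg/pl feature values, and the function names parse_np and grammatical are all made up here, only third-person subjects are covered, and the only sentence shape handled is Det N V Det N.

```python
# Hypothetical lexicon: each noun and verb carries a number feature.
NOUNS = {"dog": "sg", "dogs": "pl", "girl": "sg", "girls": "pl"}
VERBS = {"bites": "sg", "bite": "pl"}  # third person only
DETS = {"the"}


def parse_np(words, pos):
    """NP(num) -> Det N(num). Return (num, next_pos), or None on failure."""
    if pos + 1 < len(words) and words[pos] in DETS and words[pos + 1] in NOUNS:
        return NOUNS[words[pos + 1]], pos + 2
    return None


def grammatical(sentence):
    """S -> NP(num) VP(num); VP(num) -> V(num) NP. The two nums must match."""
    words = sentence.lower().split()
    subject = parse_np(words, 0)
    if subject is None:
        return False
    num, pos = subject
    if pos >= len(words) or VERBS.get(words[pos]) != num:
        return False  # no verb here, or subject and verb disagree in number
    obj = parse_np(words, pos + 1)
    return obj is not None and obj[1] == len(words)


print(grammatical("The dogs bites the girl"))
print(grammatical("The dogs bite the girl"))
```

Without the shared num feature, the same effect needs separate singular and plural copies of every rule, which is the rule explosion the slide asks about.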

9
Verb Subcategorization

10
Semantics
  • Need a semantic representation
  • Need a way to translate a sentence into that
    representation.
  • Issues
  • Knowledge representation still a somewhat open
    question
  • Composition: "He kicked the bucket."
  • Effect of syntax on semantics

11
Dealing with Ambiguity
  • Types
  • Lexical
  • Syntactic ambiguity
  • Modifier meanings
  • Figures of speech
  • Metonymy
  • Metaphor

12
Resolving Ambiguity
  • Use what you know about the world, the current
    situation, and language to determine the most
    likely parse, using techniques for uncertain
    reasoning.

13
Discourse
  • More text = more issues
  • Reference resolution
  • Ellipsis
  • Coherence/focus

14
Survey of Some Natural Language Processing Research

15
Speech Recognition
  • Two major approaches
  • Neural Networks
  • Hidden Markov Models
  • A statistical technique
  • Tries to determine the probability of a certain
    string of words producing a certain string of
    sounds
  • Choose the most probable string of words
  • Both approaches are learning approaches
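
The "choose the most probable string of words" step can be illustrated with Bayes' rule: score each candidate word string by P(sounds | words) * P(words) and keep the best. The sketch below is a toy; the two candidate strings and all probabilities are invented for illustration, and a real recognizer would compute these scores with an HMM rather than look them up.

```python
# Invented numbers for one fixed sound sequence:
#   prior      ~ P(words)          (how likely the word string is)
#   likelihood ~ P(sounds | words) (how likely the words produce the sounds)
CANDIDATES = {
    "recognize speech": {"prior": 0.001, "likelihood": 0.30},
    "wreck a nice beach": {"prior": 0.00001, "likelihood": 0.40},
}


def most_probable(candidates):
    """Return the word string maximizing P(sounds | words) * P(words)."""
    return max(
        candidates,
        key=lambda w: candidates[w]["likelihood"] * candidates[w]["prior"],
    )


print(most_probable(CANDIDATES))
```

Here the second candidate fits the sounds slightly better, but its much lower prior means the first one wins.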

16
Syntax
  • Both hand-constructed approaches and data-driven
    or learning approaches
  • Multiple levels of processing and goals of
    processing
  • Most active area of work in NLP (maybe the
    easiest because we understand syntax much better
    than we understand semantics and pragmatics)

17
POS Tagging
  • Statistical approaches--based on probability of
    sequences of tags and of words having particular
    tags
  • Symbolic learning approaches
  • One of these, transformation-based learning
    (developed by Eric Brill), is perhaps the
    best-known tagger
  • Approaches are data-driven
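
The statistical approach (probabilities of tag sequences and of words given tags) is commonly implemented with the Viterbi algorithm over a bigram tag model. The sketch below assumes that setup; the three-tag tag set and every probability in START, TRANS, and EMIT are invented for illustration, where a real tagger would estimate them from a tagged corpus.

```python
TAGS = ["Det", "N", "V"]
START = {"Det": 0.6, "N": 0.3, "V": 0.1}  # P(first tag)
TRANS = {                                  # P(next tag | current tag)
    "Det": {"Det": 0.0, "N": 0.9, "V": 0.1},
    "N": {"Det": 0.1, "N": 0.2, "V": 0.7},
    "V": {"Det": 0.6, "N": 0.3, "V": 0.1},
}
EMIT = {                                   # P(word | tag)
    "Det": {"the": 1.0},
    "N": {"man": 0.5, "saw": 0.5},
    "V": {"saw": 1.0},
}


def viterbi(words):
    """Return the most probable tag sequence for a non-empty word list."""
    # best[t] = (probability, path) of the best tagging that ends in tag t
    best = {t: (START[t] * EMIT[t].get(words[0], 0.0), [t]) for t in TAGS}
    for w in words[1:]:
        best = {
            t: max(
                ((p * TRANS[prev][t] * EMIT[t].get(w, 0.0), path + [t])
                 for prev, (p, path) in best.items()),
                key=lambda item: item[0],
            )
            for t in TAGS
        }
    return max(best.values(), key=lambda item: item[0])[1]


print(viterbi("the man saw the man".split()))
```

The word "saw" is ambiguous between N and V; the transition probabilities decide in favor of V after a noun.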

18
Developing Parsers
  • Hand-crafted grammars
  • Usually some variation on CFG
  • Definite Clause Grammars (DCG)
  • A variation on CFGs that allows extensions like
    agreement checking
  • Built-in handling of these in most Prologs
  • Hand-crafted grammars follow the different types
    of grammars popular in linguistics
  • Since linguistics hasn't produced a perfect
    grammar, we can't code one

19
Efficient Parsing
  • Top down and bottom up both have issues
  • Also common is chart parsing
  • Basic idea is we're going to locate and store
    info about every string that matches a grammar
    rule
  • One area of research is producing more efficient
    parsing
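
One well-known form of the "store everything that matches" idea is the CKY algorithm, which fills a table of categories for every span of the input. The sketch below uses it as a stand-in for chart parsing in general (the slide does not name a specific algorithm). The names BINARY, LEXICAL, and cky_chart are illustrative, and the grammar is a simplified version of the earlier example, rewritten so every rule has either two categories or one word on its right-hand side (NP -> I replaces NP -> ProN -> I, and Adj is dropped).

```python
from collections import defaultdict

# Assumed grammar in the two-category / one-word form that CKY needs.
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}, ("Det", "N"): {"NP"}}
LEXICAL = {"I": {"NP"}, "saw": {"V", "N"}, "the": {"Det"}, "man": {"N"}}


def cky_chart(words):
    """Return chart where chart[(i, j)] holds the categories of words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):
        chart[(i, i + 1)] |= LEXICAL.get(word, set())
    for width in range(2, n + 1):             # shorter spans are filled first
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):         # split point between i and j
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY.get((left, right), set())
    return chart


chart = cky_chart("I saw the man".split())
print(chart[(0, 4)])
```

Each span is analyzed once and reused, so the dead-end work that the plain top-down and bottom-up searches repeat is avoided.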

20
Data-Driven Parsing
  • PCFG: Probabilistic Context-Free Grammars
  • Constructed from data
  • Parse by determining all parses (or many parses)
    and selecting the most probable
  • Fairly successful, but requires a LOT of work to
    create the data
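
Selecting the most probable parse can be sketched by scoring each tree as the product of the probabilities of the rules it uses. Everything specific below is an assumption for illustration: the rule probabilities are invented (a real PCFG estimates them from a treebank), the tree encoding is ad hoc, and part-of-speech tags and PP are treated as unexpanded leaves to keep the example short.

```python
from math import prod  # available in Python 3.8+

# Invented probabilities; the rules for each left-hand side sum to 1.
RULE_PROB = {
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("VP", "PP")): 0.4,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("Det", "N")): 0.8,
}


def tree_prob(tree):
    """Probability of a tree given as (label, child, ...); strings are leaves."""
    if isinstance(tree, str):
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(label, rhs)] * prod(tree_prob(c) for c in children)


# Two analyses of a verb phrase like "saw the man with a telescope".
verb_attach = ("VP", ("VP", "V", ("NP", "Det", "N")), "PP")
noun_attach = ("VP", "V", ("NP", ("NP", "Det", "N"), "PP"))

best = max([verb_attach, noun_attach], key=tree_prob)
print(best, tree_prob(best))
```

With these made-up numbers the verb-attachment reading scores 0.4 * 0.6 * 0.8 and the noun-attachment reading 0.6 * 0.2 * 0.8, so the first is selected.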

21
Applying Learning to Parsing
  • Basic problem is the lack of negative examples
  • Also, mapping complete string to parse seems not
    the right approach
  • Look at the operations of the parse and learn
    rules for the operations, not for the complete
    parse at once

22
Syntax Demos
  • http://nlp.cs.berkeley.edu/Main.html#research_overview
  • http://www2.lingsoft.fi/cgi-bin/engcg

23
Language Identification
  • http://rali.iro.umontreal.ca/

24
Semantics
  • Most work is probably on hand-constructed systems
  • Some more interested in developing the semantics
    than the mappings
  • Basic question: what constitutes a semantic
    representation?
  • Answer may depend on application???

25
Possible Semantic Representations
  • Logical representation
  • Database query
  • Case grammar

26
Distinguishing Word Senses
  • Use context to determine which sense of a word is
    meant
  • Probabilistic approaches
  • Rules
  • Issues
  • Obtaining sense-tagged corpora
  • What senses do we want to distinguish?
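
Using context to pick a sense can be illustrated with a simple overlap heuristic: choose the sense whose associated words appear most often in the surrounding text. The sketch below is a toy under stated assumptions; the two senses of "bank", their word lists in SENSES, and the function name disambiguate are all invented here, where a real system would learn such associations from a sense-tagged corpus.

```python
# Hypothetical sense inventory: each sense lists words that tend to
# appear near it.
SENSES = {
    "bank": {
        "financial": {"money", "loan", "deposit", "account"},
        "river": {"water", "shore", "fishing", "river"},
    }
}


def disambiguate(word, context):
    """Return the sense of `word` sharing the most words with `context`."""
    context_words = set(context.lower().split())
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context_words))


print(disambiguate("bank", "she opened an account at the bank to deposit money"))
print(disambiguate("bank", "they went fishing from the river bank"))
```

The choice of sense inventory is itself one of the issues listed above: the heuristic can only distinguish the senses someone decided to write down.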

27
Semantic Demos
  • http://www.cs.utexas.edu/users/ml/geo.html
  • http://www.cs.utexas.edu/users/ml/rest.html
  • http://www.ling.gu.se/lager/Mutbl/demo.html