1
Natural Language Processing
  • Lecture Notes 5

2
Part of Speech Tagging
  • HMM POS tagging
  • Transformation-Based Tagging
  • Methodology: evaluation and data

3
Noisy Channel
  • An influential metaphor in natural language
    processing is the noisy channel (of
    communication) model
  • The channel introduces noise that makes it hard
    to recognize the true word
  • Build a model of how the channel modifies the
    word

4
Noisy Channel
  • Obvious applications include
  • Speech recognition (pronunciation, etc.)
  • POS tagging
  • Spelling correction (spelling errors)
  • Not so obvious
  • Semantic analysis (intended meaning versus how
    the speaker says it and how the listener
    interprets it)
  • Machine translation
  • i.e., translating German to English is a matter
    of recovering the uncorrupted original signal

5
Conditional Probability
  • P(A|B): your belief in A given that you know B is
    true
  • AND B is all you know that is relevant to A

6
Conditionals Defined
  • Conditionals: P(A|B) = P(A, B) / P(B)
  • Rearranging: P(A, B) = P(A|B) P(B)
  • And also: P(A, B) = P(B|A) P(A)

7
Bayes Rule
  • Memorize this: P(A|B) = P(B|A) P(A) / P(B)
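A quick numeric check of the rule as a short Python sketch (all numbers are invented for illustration):

  # Hypothetical numbers: P(A) is the chance a word is a noun,
  # P(B) the chance it is preceded by "the", and P(B|A) the
  # chance that a noun is preceded by "the".
  p_a = 0.40
  p_b = 0.25
  p_b_given_a = 0.50

  p_a_given_b = p_b_given_a * p_a / p_b   # Bayes rule
  print(p_a_given_b)                      # 0.8 = P(noun | preceded by "the")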

8
Bayes and the Noisy Channel
  • In applying Bayes to the noisy channel we want to
    compute the most likely source given some
    observed (corrupt) output signal
  • argmax_i P(Source_i | Signal)
  • Often (not always) this is hard to get, so we
    apply Bayes

9
Bayes and Noisy Channel
  • So argmax this instead:
    argmax_i P(Signal | Source_i) P(Source_i) / P(Signal)

10
Argmax and Bayes
  • What does this mean?
  • Argmax: plug in each possible source, compute the
    corresponding probability, and pick the source
    with the highest one
  • Note the denominator is the same for each source
    candidate so we can ignore it for the purposes of
    the argmax

11
Argmax and Bayes
  • Ignoring the denominator leaves us with two
    factors: P(Source) and P(Signal | Source)

12
Bayesian Decoding
  • P(Source): This is often referred to as a
    language model. It encodes information about the
    likelihood of particular sequences (or
    structures) independent of the observed signal.
  • P(Signal | Source): This encodes specific
    information about how the channel tends to
    introduce noise: how likely is it that a given
    source would produce the observed signal?
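A minimal decoding sketch in Python. The observed "signal" here is a misspelling, and the candidates and all probabilities are invented for illustration; a real system would get P(Source) from a trained language model and P(Signal | Source) from a channel (error) model:

  # Observed (corrupt) signal: the misspelling "acress".
  # candidate source: (P(source), P(signal | source)) -- made-up numbers.
  candidates = {
      "actress": (0.0005, 0.0010),
      "across":  (0.0020, 0.0002),
      "acres":   (0.0010, 0.0003),
  }

  # P(signal) is the same for every candidate, so the argmax ignores it.
  best = max(candidates, key=lambda w: candidates[w][0] * candidates[w][1])
  print(best)   # "actress" under these made-up numbers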

13
Note
  • This framework is general: it makes minimal
    assumptions about the nature of the application,
    the source, or the channel.
  • Now, back to POS tagging

14
Hidden Markov Models
15
An example
Short for P(plane | N) = 0.2
16
Viterbi Algorithm
(Note: L1 should be Li in the figure on this slide.)
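A compact Viterbi sketch for HMM POS tagging. The two-tag state set and all probability tables below are toy values invented for illustration; a real tagger estimates them from a tagged corpus:

  # Toy HMM: states are tags, observations are words.
  states  = ["N", "V"]
  start_p = {"N": 0.6, "V": 0.4}                    # P(first tag)
  trans_p = {"N": {"N": 0.3, "V": 0.7},             # P(tag_i | tag_i-1)
             "V": {"N": 0.8, "V": 0.2}}
  emit_p  = {"N": {"plane": 0.20, "flies": 0.05},   # P(word | tag)
             "V": {"plane": 0.01, "flies": 0.30}}

  def viterbi(words):
      # V[i][s]: probability of the best tag sequence for words[:i+1]
      # ending in state s; back[i][s] remembers the previous state.
      V = [{s: start_p[s] * emit_p[s][words[0]] for s in states}]
      back = [{}]
      for i in range(1, len(words)):
          V.append({})
          back.append({})
          for s in states:
              prev = max(states, key=lambda p: V[i-1][p] * trans_p[p][s])
              V[i][s] = V[i-1][prev] * trans_p[prev][s] * emit_p[s][words[i]]
              back[i][s] = prev
      # Follow the back-pointers from the best final state.
      tags = [max(states, key=lambda s: V[-1][s])]
      for i in range(len(words) - 1, 0, -1):
          tags.append(back[i][tags[-1]])
      return list(reversed(tags))

  print(viterbi(["plane", "flies"]))   # ['N', 'V'] with these toy numbers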
17
Viterbi: an example
18
Fall 2006
  • We did not cover the rest of these notes in
    class, to leave time for other topics.

19
The Brill tagger
  • An example of TRANSFORMATION-BASED LEARNING
  • Very popular (freely available, works fairly
    well)
  • A SUPERVISED method: requires a tagged corpus
  • Basic idea: do a quick job first (using
    frequency), then revise it using contextual rules

20
An example
  • Examples
  • It is expected to race tomorrow.
  • The race for outer space.
  • Tagging algorithm:
  • Tag all uses of "race" as NN (the most likely tag
    in the Brown corpus):
  • It is expected to race/NN tomorrow
  • the race/NN for outer space
  • Use a transformation rule to replace the tag NN
    with VB for all uses of "race" preceded by the
    tag TO:
  • It is expected to race/VB tomorrow
  • the race/NN for outer space
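A tiny Python sketch of this two-stage process. The unigram tag table and the single transformation are hard-coded for illustration:

  # Stage 1: tag every word with its most likely (unigram) tag.
  # Stage 2: apply the contextual transformation NN -> VB after TO.
  unigram_tag = {"it": "PRP", "is": "VBZ", "expected": "VBN",
                 "to": "TO", "race": "NN", "tomorrow": "NN"}

  def tag(words):
      tags = [unigram_tag[w] for w in words]
      for i in range(1, len(tags)):
          if tags[i] == "NN" and tags[i-1] == "TO":
              tags[i] = "VB"
      return list(zip(words, tags))

  print(tag(["it", "is", "expected", "to", "race", "tomorrow"]))
  # ... ('to', 'TO'), ('race', 'VB'), ('tomorrow', 'NN')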

21
Transformation-based learning in the Brill tagger
  1. Tag the corpus with the most likely tag for
     each word.
  2. Choose a TRANSFORMATION that replaces an
     existing tag with a new one such that the
     resulting tagged corpus has the lowest error
     rate.
  3. Apply that transformation to the training
     corpus.
  4. Repeat steps 2-3.
  5. Return a tagger that:
  • first tags using unigrams
  • then applies the learned transformations in order
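The greedy learning loop itself might be sketched as follows. The rule representation (change from_tag to to_tag when the previous tag is prev_tag) covers only one template, and the brute-force scoring is simplified for illustration:

  def errors(tags, gold):
      # Number of tags that disagree with the gold-standard corpus.
      return sum(t != g for t, g in zip(tags, gold))

  def apply_rule(tags, rule):
      f, t, prev = rule
      return [t if tg == f and i > 0 and tags[i-1] == prev else tg
              for i, tg in enumerate(tags)]

  def learn(tags, gold, candidate_rules, max_rules=10):
      learned = []
      for _ in range(max_rules):
          # Pick the transformation that most reduces training error.
          best = min(candidate_rules,
                     key=lambda r: errors(apply_rule(tags, r), gold))
          if errors(apply_rule(tags, best), gold) >= errors(tags, gold):
              break                      # no candidate helps any more
          tags = apply_rule(tags, best)  # apply it and keep going
          learned.append(best)
      return learned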

22
Templates
Change tag a to tag b when, for example (typical
templates in Brill's tagger):
  • the preceding (following) word is tagged z
  • the word two before (after) is tagged z
  • one of the two preceding (following) words is
    tagged z
  • the preceding word is tagged z and the following
    word is tagged w
23
Examples of learned transformations
  • e.g., replace NN with VB when the previous tag is
    TO (as in the race example above)
24
Available Taggers
  • Quite a few taggers are freely available
  • Brill (TBL)
  • QTAG (HMM; can be trained for other languages)
  • LT POS (part of the Edinburgh LTG suite of tools)
  • See Chris Manning's Statistical NLP resources web
    page