Title: HMM for POS Tagging
Slide 1: HMM for POS Tagging
- Heng Ji
- hengji_at_cs.qc.cuny.edu
- Feb 4, 2008
- Acknowledgement: some slides from Ralph Grishman and Nicolas Nicolov
Slide 2: Outline
- HMMs and the Viterbi algorithm
Slide 3: Machine Learning based POS Tagging
- Statistical approaches
- Machine learning of rules
- Role of corpus
  - No corpus: hand-written rules
  - No machine learning: hand-written rules
  - Unsupervised learning from raw data
  - Supervised learning from annotated data
Slide 4: The Basic Idea
- For a string of words
  W = w1 w2 w3 ... wn
- find the string of POS tags
  T = t1 t2 t3 ... tn
- which maximizes P(T | W)
- i.e., the probability of tag string T given that the word string was W
- i.e., that W was tagged T
Slide 5: But, the Sparse Data Problem ...
- Rich models often require vast amounts of data
- Naive approach: count up instances of the string "heat oil in a large pot" in the training corpus, and pick the most common tag assignment to the string ...
- Too many possible combinations
Slide 6: POS Tagging as Sequence Classification
- We are given a sentence (an "observation" or sequence of observations)
  - Secretariat is expected to race tomorrow
- What is the best sequence of tags that corresponds to this sequence of observations?
- Probabilistic view
  - Consider all possible sequences of tags
  - Out of this universe of sequences, choose the tag sequence which is most probable given the observation sequence of n words w1 ... wn
Slide 7: Getting to HMMs
- We want, out of all sequences of n tags t1 ... tn, the single tag sequence such that P(t1 ... tn | w1 ... wn) is highest
- The hat (^) means "our estimate of the best one"
- argmax_x f(x) means "the x such that f(x) is maximized"
Slide 8: Getting to HMMs
- This equation is guaranteed to give us the best tag sequence
- But how to make it operational? How to compute this value?
- Intuition of Bayesian classification
  - Use Bayes' rule to transform this equation into a set of other probabilities that are easier to compute
Slide 9: Goal of POS Tagging
- We want the best set of tags for a sequence of words (a sentence)
  - W = a sequence of words
  - T = a sequence of tags
- Our goal: argmax_T P(T | W)
- Example
  - P((NN NN P DET ADJ NN) | (heat oil in a large pot))
Slide 10: Reminder: Apply Bayes' Theorem (1763)

  P(T | W) = P(W | T) P(T) / P(W)

  posterior = likelihood × prior / marginal likelihood

- Our goal: to maximize the posterior!
- Reverend Thomas Bayes, Presbyterian minister (1702-1761)
Slide 11: How to Count
- P(W | T) and P(T) can be counted from a large hand-tagged corpus, then smoothed to get rid of the zeroes
Slide 12: Count P(W | T) and P(T)
- Assume each word in the sequence depends only on its corresponding tag:

  P(W | T) ≈ ∏ᵢ P(wᵢ | tᵢ)
Slide 13: Count P(T)
- Make a Markov assumption and use N-grams over tags: each tag depends only on a short history of preceding tags ...
- P(T) is then a product of the probabilities of the N-grams that make it up; with bigrams:

  P(T) ≈ ∏ᵢ P(tᵢ | tᵢ₋₁)
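The bigram version of this count can be sketched in a few lines of Python. The tiny tagged corpus below is invented for illustration; the estimator itself is the plain maximum-likelihood ratio count(t_{i-1}, t_i) / count(t_{i-1}) described above:

```python
# MLE tag-bigram model: P(t_i | t_{i-1}) = count(t_{i-1}, t_i) / count(t_{i-1}).
# The tagged sentences here are made up for illustration.
from collections import Counter

tagged_sentences = [['DET', 'NN', 'VB'], ['DET', 'ADJ', 'NN', 'VB'],
                    ['NN', 'VB', 'DET', 'NN']]

unigrams, bigrams = Counter(), Counter()
for tags in tagged_sentences:
    seq = ['<s>'] + tags                      # pad with a start-of-sentence tag
    unigrams.update(seq[:-1])                 # counts of the conditioning tag
    bigrams.update(zip(seq[:-1], seq[1:]))    # counts of adjacent tag pairs

def p(tag, prev):
    """MLE bigram probability P(tag | prev); 0 for unseen histories."""
    return bigrams[(prev, tag)] / unigrams[prev] if unigrams[prev] else 0.0

# P(T) for T = DET NN VB is the product of its bigram probabilities
tags = ['DET', 'NN', 'VB']
prob = 1.0
for prev, tag in zip(['<s>'] + tags, tags):
    prob *= p(tag, prev)
print(prob)   # = P(DET|<s>) * P(NN|DET) * P(VB|NN) = 2/3 * 2/3 * 1
```

In practice these raw counts would still need smoothing, as slide 11 notes, to avoid zero probabilities for unseen bigrams.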
Slide 14: Example: a Moore Machine
- Goal: What is the most probable sequence of animals if you hear "Moo, Hello, Quack"?
Slide 15: A Hidden Markov Model (HMM)
Slide 16: The State Space of a Moore Machine
Slide 17: Viterbi Decoding of a Moore Machine
- Trellis for the observation sequence moo, hello, quack (reconstructed from the slide):

  state | t0 | t1 (moo) | t2 (hello) | t3 (quack) | t4 (end)
  START |  1 |    0     |     0      |     0      |    0
  COW   |  0 |   0.9    |   0.045    |     0      |    0
  DUCK  |  0 |    0     |   0.108    |   0.0324   |    0
  END   |  0 |    0     |     0      |     0      | 0.00648

- COW at t1: 1 × 1 × 0.9 = 0.9
- COW at t2: 0.9 × 0.5 × 0.1 = 0.045
- DUCK at t2: 0.9 × 0.3 × 0.4 = 0.108
- DUCK at t3: max(0.108 × 0.5 × 0.6 = 0.0324, 0.045 × 0.3 × 0.6 = 0.0081) = 0.0324
- END at t4: 0.0324 × 0.2 × 1 = 0.00648
Slide 18: Computing Probabilities
- viterbi[s, t] = max over s' of ( viterbi[s', t-1] × transition probability P(s | s') × emission probability P(token_t | s) )
- for each (s, t), record which (s', t-1) contributed the maximum (the back pointer)
Slide 19: Analyzing
Slide 20: A Simple POS HMM
Slide 21: Word Emission Probabilities P(word | state)
- A two-word language: "fish" and "sleep"
- Suppose in our training corpus,
  - "fish" appears 8 times as a noun and 4 times as a verb
  - "sleep" appears twice as a noun and 6 times as a verb
- Emission probabilities
  - Noun
    - P(fish | noun) = 0.8
    - P(sleep | noun) = 0.2
  - Verb
    - P(fish | verb) = 0.4
    - P(sleep | verb) = 0.6
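The counts above turn into these probabilities by simple maximum-likelihood estimation, P(word | tag) = count(word, tag) / count(tag); a minimal sketch:

```python
# Emission probabilities from the slide's counts:
# P(word | tag) = count(word, tag) / count(tag).
from collections import Counter

counts = Counter({('fish', 'noun'): 8, ('fish', 'verb'): 4,
                  ('sleep', 'noun'): 2, ('sleep', 'verb'): 6})

tag_totals = Counter()
for (word, tag), c in counts.items():
    tag_totals[tag] += c          # noun: 10, verb: 10

emit = {(word, tag): c / tag_totals[tag] for (word, tag), c in counts.items()}
print(emit[('fish', 'noun')])     # 0.8
print(emit[('sleep', 'verb')])    # 0.6
```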
Slide 22: Viterbi Probabilities
Slides 23-24: Token 1: fish
Slide 25: Token 1: fish
Slide 26: Token 2: sleep (if "fish" is a verb)
Slide 27: Token 2: sleep (if "fish" is a verb)
Slide 28: Token 2: sleep (if "fish" is a noun)
Slide 29: Token 2: sleep (if "fish" is a noun)
Slide 30: Token 2: sleep - take maximum, set back pointers
Slide 31: Token 2: sleep - take maximum, set back pointers
Slide 32: Token 3: end
Slide 33: Token 3: end - take maximum, set back pointers
Slide 34: Decode: fish = noun, sleep = verb
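The token-by-token walkthrough above can be replayed step by step in Python using the slide-21 emission probabilities. The transition probabilities of the POS HMM are only shown as a figure in the slides and are not recoverable from the text, so the values below are invented for illustration:

```python
# Replaying the fish/sleep trellis by hand, one token at a time.
# Emission probabilities come from slide 21; the transition probabilities
# are ASSUMED (illustrative values, not the ones in the slides' figure).

emit = {('fish', 'noun'): 0.8, ('sleep', 'noun'): 0.2,
        ('fish', 'verb'): 0.4, ('sleep', 'verb'): 0.6}
trans = {('start', 'noun'): 0.8, ('start', 'verb'): 0.2,   # assumed
         ('noun', 'noun'): 0.1, ('noun', 'verb'): 0.8,     # assumed
         ('verb', 'noun'): 0.4, ('verb', 'verb'): 0.1,     # assumed
         ('noun', 'end'): 0.1, ('verb', 'end'): 0.5}       # assumed

# Token 1: fish
v1 = {t: trans[('start', t)] * emit[('fish', t)] for t in ('noun', 'verb')}
# Token 2: sleep -- take the maximum over predecessors, set back pointers
v2, back = {}, {}
for t in ('noun', 'verb'):
    prev = max(('noun', 'verb'), key=lambda p: v1[p] * trans[(p, t)])
    v2[t] = v1[prev] * trans[(prev, t)] * emit[('sleep', t)]
    back[t] = prev
# Token 3: end -- take the maximum and follow the back pointer
last = max(('noun', 'verb'), key=lambda t: v2[t] * trans[(t, 'end')])
decode = [back[last], last]
print(decode)   # -> ['noun', 'verb']: fish = noun, sleep = verb
```

With these assumed transitions the decode matches slide 34: fish is tagged noun, sleep is tagged verb.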