1
Statistical Machine Translation
  • Translation without Understanding
  • Colin Cherry

2
Who is this guy?
  • One of Dr. Lin's PhD students
  • Did my Masters degree at U of A
  • Research area: Machine Translation
  • Home town: Halifax, Nova Scotia
  • Please ask questions!

3-4
Machine Translation
  • Translation is easy for (bilingual) people
  • Process:
  • Read the text in English
  • Understand it
  • Write it down in French
  • Hard for computers
  • The human process is invisible, intangible

5
One approach: Babelfish
  • A rule-based approach to machine translation
  • A 30-year-old feat in software engineering
  • Hand-coding translation knowledge is difficult and expensive

6
Alternate approach: Statistics
  • What if we had a model for P(F|E)?
  • We could use Bayes' rule, as shown below
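In symbols (standard Bayes' rule, stated here for completeness): the translation we want is

  argmax_E P(E|F) = argmax_E P(F|E) P(E) / P(F) = argmax_E P(F|E) P(E)

where P(F) drops out of the argmax because it is constant for the fixed input F.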

7
Why Bayes' rule at all?
  • Why not model P(E|F) directly?
  • The P(F|E)P(E) decomposition allows us to be sloppy
  • P(E) worries about good English
  • P(F|E) worries about French that matches the English
  • The two can be trained independently

8
Crime Scene Analogy
  • F is a crime scene. E is a person who may have
    committed the crime
  • P(E|F) - look at the scene - who did it?
  • P(E) - who had a motive? (Profiler)
  • P(F|E) - could they have done it? (CSI - transportation, access to weapons, alibi)
  • Some people might have great motives, but no
    means - you need both!

9
On voit Jon à la télévision ("We see Jon on television")
(Table borrowed from Jason Eisner: candidate English translations of this sentence, scored under P(E) and P(F|E).)
10
Where will we get P(F|E)?
(Diagram: books in English, plus the same books in French, go through machine learning magic to produce a P(F|E) model.)
We call collections stored in two languages parallel corpora or parallel texts. Want to update your system? Just add more text!
11
Our Inspiration
  • The Canadian Parliamentary Debates!
  • Stored electronically in both French and English
    and available over the Internet

12
Problem
  • How are we going to generalize from examples of
    translations?
  • I'll spend the rest of this lecture telling you:
  • What makes a useful P(F|E)
  • How to obtain the statistics needed for P(F|E) from parallel texts

13
Strategy: Generative Story
  • When modeling P(X|Y):
  • Assume you start with Y
  • Decompose the creation of X from Y into some number of operations
  • Track statistics of individual operations
  • For a new example (X,Y), P(X|Y) can be calculated based on the probability of the operations needed to get X from Y

14
What if?
15
New Information
  • Call this new info a word alignment (A)
  • With A, we can make a good story

The quick fox jumps over the lazy dog
Le renard rapide saute par - dessus le chien paresseux
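One convenient way to represent A in code (an illustrative assumption on my part, not something shown in the deck) is an array mapping each French position to the English position that generated it:

    # Illustrative alignment for the example above; a[j] gives the index
    # (into english) of the word that generated french[j], with 0 = NULL.
    english = ["NULL", "The", "quick", "fox", "jumps", "over", "the", "lazy", "dog"]
    french = ["Le", "renard", "rapide", "saute", "par", "-", "dessus", "le", "chien", "paresseux"]
    a = [1, 3, 2, 4, 5, 5, 5, 6, 8, 7]  # e.g., french[1] "renard" <- english[3] "fox"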
16-20
P(F,A|E) Story
(Build slides: start from the English sentence with a NULL word prepended, "null The quick fox jumps over the lazy dog"; open slots f1, f2, f3, ..., f10 for the French words; then fill the slots according to the alignment, yielding "Le renard rapide saute par - dessus le chien paresseux".)
21
Getting Pt(f|e)
  • We need numbers for Pt(f|e)
  • Example: Pt(le|the)
  • Count lines in a large collection of aligned text (a counting sketch follows)
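A minimal sketch of that counting step, assuming we already have word-aligned sentence pairs (the data layout and names here are my own, not the lecture's):

    from collections import defaultdict

    def estimate_t(aligned_pairs):
        # aligned_pairs: iterable of (e_tokens, f_tokens, links), where links
        # is a set of (i, j) pairs linking e_tokens[i] to f_tokens[j].
        counts = defaultdict(float)  # count of (e, f) links
        totals = defaultdict(float)  # count of links out of e
        for e_toks, f_toks, links in aligned_pairs:
            for i, j in links:
                counts[(e_toks[i], f_toks[j])] += 1.0
                totals[e_toks[i]] += 1.0
        # Pt(f|e) = count(e -> f) / count(e -> anything)
        return {(e, f): c / totals[e] for (e, f), c in counts.items()}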

22
Where do we get the lines?
  • That sure looked like a lot of monkeys
  • Remember POS tagging with HMMs?
  • You didn't need a tagged corpus to train a tagger
  • We'll get alignments out of unaligned text by treating the alignment as a hidden variable
  • A generalization of the ideas in HMM training, called EM

23-24
Where's heaven in Vietnamese?
English: In the beginning God created the heavens and the earth.
Vietnamese: Ban đầu Đức Chúa Trời dựng nên trời đất.
English: God called the expanse heaven.
Vietnamese: Đức Chúa Trời đặt tên khoảng không là trời.
English: ... you are this day like the stars of heaven in number.
Vietnamese: ... các ngươi đông như sao trên trời.
Example borrowed from Jason Eisner
25
EM: Expectation Maximization
  • Assume a probability distribution (weights) over hidden events
  • Take counts of events based on this distribution
  • Use counts to estimate new parameters
  • Use parameters to re-weight the examples
  • Rinse and repeat (a code sketch follows below)
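A minimal, self-contained sketch of this loop for a Model-1-style translation table (my own illustrative code; the NULL word is omitted for brevity, and the names are assumptions):

    from collections import defaultdict

    def em(corpus, iterations=10):
        # corpus: list of (e_tokens, f_tokens) sentence pairs.
        t = defaultdict(lambda: 1.0)  # t[(e, f)] = Pt(f|e), uniform start
        for _ in range(iterations):
            counts = defaultdict(float)
            totals = defaultdict(float)
            for e_toks, f_toks in corpus:
                for f in f_toks:
                    # E-step: distribute one count for f over every English
                    # word that could have generated it, by current weight.
                    z = sum(t[(e, f)] for e in e_toks)
                    for e in e_toks:
                        c = t[(e, f)] / z
                        counts[(e, f)] += c
                        totals[e] += c
            # M-step: normalize the expected counts into new parameters.
            t = defaultdict(float,
                            {(e, f): c / totals[e]
                             for (e, f), c in counts.items()})
        return t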

26
Alignment Hypotheses
27
Weighted Alignments
  • What we'll do is:
  • Consider every possible alignment
  • Give each alignment a weight - indicating how good it is
  • Count weighted alignments as normal (the weight formula is spelled out below)
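Spelled out (my phrasing, consistent with the generative story above): an alignment's weight is its share of the sentence pair's total probability,

  weight(A) = P(F,A|E) / Σ_A' P(F,A'|E)

where P(F,A|E) is proportional to the product of Pt(f_j | e_a(j)) over the French positions j.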

28
Good grief! We forgot about P(F|E)!
  • No worries, a little more stats gets us what we need: just sum the joint model over all alignments, P(F|E) = Σ_A P(F,A|E)

29
Big Example Corpus
Pair 1: fast car / voiture rapide
Pair 2: fast / rapide
30
Possible Alignments
1a: fast-voiture, car-rapide
1b: fast-rapide, car-voiture
2: fast-rapide
31
Parameters
Start uniform: Pt(voiture|fast) = Pt(rapide|fast) = 1/2, and Pt(voiture|car) = Pt(rapide|car) = 1/2
32
Weight Calculations
With uniform parameters, alignments 1a and 1b score equally, so each gets weight 1/2; alignment 2 is the only alignment for pair 2 and gets weight 1.
33-35
Count Lines
Weights: 1a = 1/2, 1b = 1/2, 2 = 1
Weighted counts of links: c(voiture|fast) = 1/2, c(rapide|fast) = 1/2 + 1 = 3/2, c(rapide|car) = 1/2, c(voiture|car) = 1/2
Normalize
36
Parameters
After normalizing the counts: Pt(rapide|fast) = 3/4, Pt(voiture|fast) = 1/4, Pt(voiture|car) = Pt(rapide|car) = 1/2
37
Weight Calculations
1a scores (1/4)(1/2) = 1/8 and 1b scores (3/4)(1/2) = 3/8; renormalized over pair 1, that is 1a = 1/4, 1b = 3/4. Alignment 2 keeps weight 1.
38-40
Count Lines
Weights: 1a = 1/4, 1b = 3/4, 2 = 1
Weighted counts of links: c(voiture|fast) = 1/4, c(rapide|fast) = 3/4 + 1 = 7/4, c(rapide|car) = 1/4, c(voiture|car) = 3/4
Normalize
41
After many iterations
Weights converge: 1a = 0, 1b = 1, 2 = 1
The surviving links are fast-rapide and car-voiture.
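For what it's worth, the EM sketch from slide 25 converges the same way on this tiny corpus (again illustrative code, not the lecture's):

    corpus = [(["fast", "car"], ["voiture", "rapide"]),
              (["fast"], ["rapide"])]
    t = em(corpus, iterations=50)
    # Pt(rapide|fast) and Pt(voiture|car) both approach 1.

The intermediate weights differ slightly from the slides, since the sketch lets several French words share one English word, but the end point is the same.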
42
Seems too easy?
  • What if you have no 1-word sentence?
  • Words in shorter sentences will get more weight -
    fewer possible alignments
  • Weight is additive throughout the corpus: if a word e shows up frequently with some other word f, P(f|e) will go up

43
Some things I skipped
  • Enumerating all possible alignments
  • Very easy with this model: the independence assumptions save us
  • The model could be a lot better:
  • Word positions
  • Multiple f's generated by the same e
  • Can actually use an HMM!

44
The Final Product
  • Now we have a model for P(F|E)
  • Test it by aligning a corpus!
  • I.e., find argmax_A P(A|F,E) (a small sketch follows after this list)
  • Use it for translation:
  • Combine with your favorite model for P(E)
  • Search the space of English sentences for one that maximizes P(E)P(F|E) for a given F
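Under a Model-1-style table, the best alignment factors into an independent choice per French word, so aligning a sentence pair is short to write (illustrative; best_alignment is my own name):

    def best_alignment(e_toks, f_toks, t):
        # Each French word independently picks the English position whose
        # Pt(f|e) is highest; the independence assumptions make this exact
        # for a Model-1-style model.
        return [max(range(len(e_toks)), key=lambda i: t.get((e_toks[i], f), 0.0))
                for f in f_toks]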

45
Questions?
  • ?