Title: Automating Post-Editing To Improve MT Systems
1 Automating Post-Editing To Improve MT Systems
Ariadna Font Llitjós and Jaime Carbonell
APE Workshop, AMTA, Boston, August 12, 2006
2 Outline
- Introduction
- Problem and Motivation
- Goal and Solution
- Related Work
- Theoretical Space
- Online Error Elicitation
- Rule Refinement Framework
- MT Evaluation Framework
- Conclusions
3 Problem and Motivation
- MT output: Assassinated a diplomat Russian and kidnapped other four in Bagdad
4 Problem and Motivation
- MT output: Assassinated a diplomat Russian and kidnapped other four in Bagdad
- Could hire post-editors to correct machine translations
6 Problem and Motivation
- MT output: Assassinated a diplomat Russian and kidnapped other four in Bagdad
- Not feasible for large amounts of data (Google, Yahoo, etc.)
- Does not generalize to new sentences
7 Ultimate Goal
- Automatically Improve MT Quality
8 Two Alternatives
- Automatic learning of post-editing rules (APE)
  - system independent
  - several thousands of sentences might need to be corrected for the same error
- Automatic Refinement of MT System
  - attacks the core of the problem
  - system dependent
9 Our Approach
- Automatically Improve MT Quality by Recycling Post-Editing Information Back to MT Systems
10 Our Solution
SL: Asesinado un diplomático ruso y secuestrados otros cuatro en Bagdad
TL: Assassinated a diplomat Russian and kidnapped other four in Bagdad
- Get non-expert bilingual speakers to provide correction feedback online
- Make correcting translations easy and fun
- 5-10 minutes a day → large amounts of correction data
11 Our Solution
SL: Asesinado un diplomático ruso y secuestrados otros cuatro en Bagdad
TL: Assassinated a diplomat Russian and kidnapped other four in Bagdad
- Get non-expert bilingual speakers to provide correction feedback online
- Make correcting translations easy and fun
- 5-10 minutes a day → large amounts of correction data
- Feed corrections back into the MT system, so that they can be generalized
- → System will translate new sentences better
12 Bilingual Post-editing
- In addition to what traditional post-editing offers:
  - the TL sentence, and
  - context information, when available
- it also provides post-editors with:
  - the Source Language sentence
  - alignment information
Allegedly a harder task, but we can now get much more data for free
13 MT Approaches
[Diagram: the MT pyramid. Direct approaches (SMT, EBMT) map the source (e.g. English) straight to the target (e.g. Spanish); transfer-based MT goes up through syntactic parsing and transfer rules and down through text generation; higher levels add semantic analysis and sentence planning.]
14 Related Work
- Fixing Machine Translation by post-editing: Allen 2003; Allen & Hogan, 2000; Knight & Chander, 1994
- Rule adaptation: Nishida et al. 1988; Corston-Oliver & Gammon, 2003; Imamura et al. 2003; Menezes & Richardson, 2001; Su et al. 1995; Brill, 1993; Gavaldà, 2000; Callison-Burch, 2004
- This work: non-expert user feedback, provides relevant reference translations, generalizes over unseen data
15 System Architecture
[Architecture diagram: INPUT TEXT through the MT system to OUTPUT TEXT.]
16 Main Technical Challenge
Mapping between simple user edits to MT output and improved translation rules, via blame assignment, rule modifications, and lexical expansions.
17 Outline
- Introduction
- Theoretical Space
- Error classification
- Limitations
- Online Error Elicitation
- Rule Refinement Framework
- MT Evaluation Framework
- Conclusions
18 Error Typology for Automatic Rule Refinement (simplified)
- Missing word
- Extra word
- Wrong word order: local vs. long distance; word vs. phrase
- Incorrect word: word change (sense, form, selectional restrictions, idiom), missing constraint, extra constraint
20 Limitations of Approach
- The SL sentence needs to be fully parsed by the translation grammar
- The current approach needs rules to refine
- Depends on bilingual speakers' ability to detect MT errors
21 Outline
- Introduction
- Theoretical Space
- Online Error Elicitation
- Translation Correction Tool
- Rule Refinement Framework
- MT Evaluation Framework
- Conclusions
22 TCTool (Demo)
Interactive elicitation of error information
Actions:
- Add a word
- Delete a word
- Modify a word
- Change word order
23 Expanding the lexicon
- 1. OOV words/idioms
  - Hamas cabinet calls a truce to avoid Israeli retaliation
  - → El gabinete de Hamas llama un TRUCE para evitar la venganza israelí → acuerda una tregua
- 2. New Word Form
  - The children fell → los niños cayeron → los niños se cayeron
- 3. New Word Sense
  - The girl plays the guitar → la muchacha juega la guitarra → la muchacha toca la guitarra
24 Refining the grammar
- 4. Add Agreement Constraints
  - I see the red car → veo el coche roja → veo el coche rojo
- 5. Add Word or Constituent
  - You saw the woman → viste la mujer → viste a la mujer
- 6. Move Constituent Order
  - I saw you → yo vi te → yo te vi
25 [TCTool screenshot]
SL: Gaudí was a great artist
TL: Gaudí era un artista grande
User actions: Change Word Order (change constituent order); Edit Word (adding a new form to the lexicon)
CTL: Gaudí era un gran artista
26 Eng2Spa User Study
Interactive elicitation of error information
- LREC 2004
- MT error classification → 9 linguistically-motivated classes [Flanagan, 1994; White et al., 1994]
- word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation
27 Outline
- Introduction
- Theoretical Space
- Online Error Elicitation
- Rule Refinement Framework
- RR operations
- Formalizing Error information
- Refinement Steps (Example)
- MT Evaluation Framework
- Conclusions
28 Types of Refinement Operations
Automatic Rule Adaptation
- 1. Refine a translation rule
  - R0 → R1 (change R0 to make it more specific or more general)
29 Types of Refinement Operations
Automatic Rule Adaptation
- 1. Refine a translation rule
  - R0 → R1 (change R0 to make it more specific or more general)
- R0: a nice house → una casa bonito
- R1 = R0 + ((N gender) = (ADJ gender)): a nice house → una casa bonita
30 Types of Refinement Operations (2)
Automatic Rule Adaptation
- 2. Bifurcate a translation rule
  - R0 → R0 (same, general rule)
- R0: a nice house → una casa bonita
31 Types of Refinement Operations (2)
Automatic Rule Adaptation
- 2. Bifurcate a translation rule
  - R0 → R0 (same, general rule)
  - → R1 (add a new, more specific rule)
- R0: a nice house → una casa bonita
32 Formalizing Error Information
Automatic Rule Adaptation
- Wi = error word
- Wi′ = correction
- Wc = clue word
33 Triggering Feature Detection
Automatic Rule Adaptation
- Comparison at the feature level to detect triggering feature(s): delta function Δ(Wi, Wi′); see the sketch below
- Examples
  - Δ(bonito, bonita) = {gender} (bonito: gen = masc; bonita: gen = fem)
  - Δ(comíamos, comía) = {person, number}
  - Δ(mujer, guitarra) = ?
- If the Δ set is empty, need to postulate a new binary feature (feat_i = +/-)
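To make the Δ comparison concrete, here is a minimal Python sketch (ours, not the AVENUE implementation); the flat feature dictionaries and names like LEXICON and feat1 are illustrative:

```python
# Toy lexicon: each entry is a flat feature dictionary (illustrative values).
LEXICON = {
    "bonito":   {"pos": "adj", "gen": "masc", "num": "sg"},
    "bonita":   {"pos": "adj", "gen": "fem",  "num": "sg"},
    "comiamos": {"pos": "v", "person": "1", "num": "pl", "tense": "past"},
    "comia":    {"pos": "v", "person": "3", "num": "sg", "tense": "past"},
    "grande":   {"pos": "adj", "num": "sg"},
    "gran":     {"pos": "adj", "num": "sg"},
}

def delta(wi, wi_prime):
    """Delta(Wi, Wi'): features whose values differ between the two entries."""
    f1, f2 = LEXICON[wi], LEXICON[wi_prime]
    return {f for f in f1.keys() | f2.keys() if f1.get(f) != f2.get(f)}

def triggering_features(wi, wi_prime):
    d = delta(wi, wi_prime)
    if not d:  # empty Delta: postulate a new binary feature (feat_i = +/-)
        LEXICON[wi]["feat1"] = "-"        # e.g. grande: post-nominal
        LEXICON[wi_prime]["feat1"] = "+"  # e.g. gran: pre-nominal
        d = {"feat1"}
    return d

print(triggering_features("bonito", "bonita"))    # {'gen'}
print(triggering_features("comiamos", "comia"))   # {'person', 'num'}
print(triggering_features("grande", "gran"))      # {'feat1'} (newly postulated)
```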
34 Refinement Steps
1. Error Correction Elicitation
2. Finding Triggering Features
3. Blame Assignment
4. Rule Refinement
36 1. Error Correction Elicitation
SL: Gaudí was a great artist
TL: Gaudí era un artista grande
CTL: Gaudí era un gran artista
- Edit Word: Wi = grande → Wi′ = gran
- Change Word Order
37 Refinement Steps
1. Error Correction Elicitation
  - Edit: Wi = grande → Wi′ = gran
  - Change Word Order: artista gran → gran artista
2. Finding Triggering Features
3. Blame Assignment
4. Rule Refinement
39 2. Finding Triggering Features
Delta function: difference at the feature level? Δ(grande, gran) = ?
40 2. Finding Triggering Features
Delta function: difference at the feature level? Δ(grande, gran) = ∅ → need to postulate a new binary feature feat1
41 2. Finding Triggering Features
Delta function: difference at the feature level? Δ(grande, gran) = ∅ → need to postulate a new binary feature feat1
- gran: feat1 = + (type: pre-nominal)
- grande: feat1 = - (type: post-nominal)
42 2. Finding Triggering Features
Delta function: difference at the feature level? Δ(grande, gran) = ∅ → new binary feature feat1
REFINE
43 Refinement Steps
1. Error Correction Elicitation
  - Edit: Wi = grande → Wi′ = gran
  - Change Word Order: artista gran → gran artista
2. Finding Triggering Features: grande: feat1 = -; gran: feat1 = +
3. Blame Assignment
4. Rule Refinement
45 3. Blame Assignment
- (from transfer and generation tree; a toy walkthrough follows below)
- tree: <(S,1 (NP,2 (N,51 "GAUDI"))
         (VP,3 (VB,2 (AUX,172 "ERA"))
               (NP,8 (DET,03 "UN")
                     (N,45 "ARTISTA")
                     (ADJ,54 "GRANDE"))))>
46 Refinement Steps
1. Error Correction Elicitation
  - Edit: Wi = grande → Wi′ = gran
  - Change Word Order: artista gran → gran artista
2. Finding Triggering Features: grande: feat1 = -; gran: feat1 = +
3. Blame Assignment: NP,8 (N ADJ → ADJ N)
4. Rule Refinement
48 4. Rule Refinement
NP,8: a great artist → un artista gran
BIFURCATE, then REFINE the new rule with (ADJ feat1) =c +
49 4. Rule Refinement
NP,8: a great artist → un artista gran
50 Refinement Steps
1. Error Correction Elicitation
  - Edit: Wi = grande → Wi′ = gran
  - Change Word Order: artista gran → gran artista
2. Finding Triggering Features: grande: feat1 = -; gran: feat1 = +
3. Blame Assignment: NP,8 (N ADJ → ADJ N)
4. Rule Refinement: NP,8: ADJ N → N ADJ; NP,8′: ADJ N → ADJ N, with (ADJ feat1) =c +
51 Correct Translation Output
- NP,8: ADJ(great → grande), feat1 = -
- NP,8′: ADJ(great → gran), (ADJ feat1) =c +
Candidates: Gaudi era un artista grande / Gaudi era un gran artista / Gaudi era un grande artista
52 Done? Not yet
- NP,8 (R0): ADJ(grande), feat1 = -
- NP,8′ (R1): ADJ(gran), (ADJ feat1) =c +
- Need to restrict application of the general rule (R0) to just post-nominal ADJ
Candidates: un artista grande / un artista gran / un gran artista / un grande artista
53 Add Blocking Constraint
- NP,8 (R0): ADJ(grande), feat1 = -; add blocking constraint (ADJ feat1) = -
- NP,8′ (R1): ADJ(gran), (ADJ feat1) =c +
- Can we also eliminate incorrect translations automatically?
Candidates: un artista grande / un artista gran / un gran artista / un grande artista
54 Making the grammar tighter
- If Wc = artista
  - Add feat1 to N(artista)
  - Add agreement constraint to NP,8 (R0) between N and ADJ: ((N feat1) = (ADJ feat1))
Candidates: un artista grande / un artista gran / un gran artista / un grande artista
55 Generalization Power: abstract feature (feat_i)
- Irina is a great friend → Irina es una gran amiga (instead of Irina es una amiga grande); ADJ: feat1 = +
- Juan is a great person → Juan es una gran persona (instead of Juan es una persona grande); ADJ: feat1 = +
NP,8′: a great person → una gran persona
56 Generalization Power
- When the triggering feature already exists in the grammar/lexicon (POS, gender, number, etc.)
- I see the red car → veo un auto roja → veo un auto rojo
  ((N gender) = (ADJ gender)); roja: gender = fem, rojo: gender = masc, auto: gender = masc
- Refinements generalize to all lexical entries that have that feature (gender)
- The yellow houses are his → las casas amarillas son suyas (before: las casas amarillos son suyas); casas: gender = fem
- We need to go to a dark cave → tenemos que ir a una cueva oscura (before: cueva oscuro); cueva: gender = fem
57 Outline
- Introduction
- Theoretical Space
- Online Error Elicitation
- Rule Refinement Framework
- MT Evaluation Framework
- Relevant (and free) human reference translations
- Not punished by BLEU, NIST and METEOR.
- Conclusions
58 MT Evaluation Framework
- Multiple human reference translations, relevant to specific MT system errors, produced by bilingual speakers
- MT systems are not punished by BLEU and METEOR for picking a different synonym or morpho-syntactic variation
- Similar to what Snover et al. 2006 propose (HTER)
59 MT Evaluation continued
- Did the refined MT system generate the translation as corrected by the user (CTL)?
  → Simple recall: 0 = CTL not in output, 1 = CTL in output
- Did the number of bad translations (implicitly identified by users) generated by the system decrease?
  → Precision at rank k (k = 5 in TCTool)
- Did the system successfully manage to reduce ambiguity (the number of alternative translations)?
  → Reduction ratio (see the sketch below)
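An illustrative rendering of the three measures in Python (names and data shapes are ours, not the authors' code):

```python
def simple_recall(ctl, candidates):
    """1 if the user-corrected translation (CTL) is in the output, else 0."""
    return int(ctl in candidates)

def precision_at_k(ctl, candidates, k=5):
    """Binary tp over the size of the top-k list (cf. the candidate-list
    precision defined in the backup slides)."""
    top = candidates[:k]
    return int(ctl in top) / len(top)

def reduction_ratio(n_before, n_after):
    """How much refinement shrank the number of alternative translations."""
    return n_after / n_before

before = ["Gaudi era un artista grande", "Gaudi era un grande artista",
          "Gaudi era un gran artista"]
after = ["Gaudi era un gran artista", "Gaudi era un artista grande"]
ctl = "Gaudi era un gran artista"
print(simple_recall(ctl, after))                           # 1
print(precision_at_k(ctl, after))                          # 0.5
print(round(reduction_ratio(len(before), len(after)), 2))  # 0.67
```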
60 Outline
- Introduction
- Theoretical Space
- Online Error Elicitation
- Rule Refinement Framework
- MT Evaluation Framework
- Conclusions
- Impact on MT systems
- Work in Progress and Contributions
- Future Work
61 Impact on Transfer-Based MT
[Architecture diagram: Rule Learning and other Resources (Learning Module, handcrafted rules, morphological analyzer) produce the Transfer Rules and Lexical Resources used by the Run-Time System, where the Transfer System turns INPUT TEXT into a Translation Candidate Lattice and then OUTPUT TEXT.]
62-65 Impact on Transfer-Based MT
[The same diagram, built up incrementally: the Online Translation Correction Tool collects corrections on the output, and the Rule Refinement module feeds them back into the Transfer Rules and Lexical Resources.]
66 TCTool can help improve
- Rule-based MT (grammar, lexicon, LM)
- EBMT (examples, lexicon, alignments)
- Statistical MT (lexicon and alignments)
- relevant annotated data → develop smarter training algorithms
Panel this afternoon, 3:45-4:45pm
67 Work in Progress
- Finalizing regression and test sets to perform a rigorous evaluation of the approach
- Handling incorrect Correction Instances
  - Have multiple users correct the same set of sentences → filter out noise (threshold: 90% of users agree; see the sketch below)
- User study with multiple users → evaluate improvement after refinements
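A minimal sketch of that noise-filtering step, assuming corrections are grouped per source sentence (the data layout and names are ours):

```python
from collections import Counter

def filter_corrections(corrections_by_sentence, threshold=0.9):
    """Keep a correction only when >= threshold of its users agree on it."""
    kept = {}
    for sl, corrections in corrections_by_sentence.items():
        (best, count), = Counter(corrections).most_common(1)
        if count / len(corrections) >= threshold:
            kept[sl] = best
    return kept

logs = {"Gaudi was a great artist":
        ["Gaudi era un gran artista"] * 9 + ["Gaudi fue un gran artista"]}
print(filter_corrections(logs))   # kept: 9/10 = 90% agreement
```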
68 Contributions so far
- An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users
- A new framework to improve MT quality: an expandable set of rule refinement operations
- An MT Evaluation Framework, which provides relevant reference translations
69 Future work
- Explore other ways to make the interaction with users more fun
- Games with a purpose [von Ahn and Blum, 2004, 2006]
70 Future work
- Explore other ways to make the interaction with users more fun
- Games with a purpose [von Ahn and Blum, 2004, 2006]
- Second Language Learning
71 Questions?
73 Backup slides
- TCTool next step
- What if users' corrections are different (user noise)?
- More than one correction per sentence?
- Wc example
- Data set
- Where is it appropriate to refine vs. bifurcate?
- Lexical bifurcate
- Is the refinement process invariable?
74 Backup slides (ii)
- TCTool Demo Simulation
- RR operation patterns
- Automatic Evaluation feasibility study
- AMTA paper results
- User studies map
- Precision, recall, F1
- NIST, BLEU, METEOR
75 Backup slides (iii)
- Done? Not yet (blocking constraints)
- Minimal pair example
- Batch mode implementation
- Interactive mode implementation
- User studies
- Evaluation of refined output
- Another Generalization Example
- Avenue Architecture
- Technical Challenges
- Rules Formalism
76 Player: The Teacher
- Goal: teach the MT system to translate correctly
- Different levels of expertise
  - Beginner: language learning and improving (labeled data)
  - Intermediate-Advanced: provide labeled data to improve the system and for beginner levels
77 Two-player cooperation
- In English-to-Spanish MT
  - Player 1: Spanish native speaker learning English
  - Player 2: English native speaker learning Spanish
back to main
78 Translation rule example
NP,8
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
  (X1::Y1) (X2::Y3) (X3::Y2)
  ((x0 def) = (x1 def))
  (x0 = x3)
  (y2 = x3)
  ((y1 agr) = (y2 agr))
  ((y3 agr) = (y2 agr))
)
79 Translation rule example
Rule ID: NP,8; SL side (English): X0 = X1 X2 X3; TL side (Spanish): Y0 = Y1 Y2 Y3
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
  (X1::Y1) (X2::Y3) (X3::Y2)
  ((x0 def) = (x1 def))
  (x0 = x3)
  (y2 = x3)
  ((y1 agr) = (y2 agr))
  ((y3 agr) = (y2 agr))
)
Constraint types: analysis (x-side), transfer (x-y), generation (y-side)
80 Translation rule example
NP,8
X0 = X1 X2 X3; Y0 = Y1 Y2 Y3
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
  (X1::Y1) (X2::Y3) (X3::Y2)
  ((x0 def) = (x1 def))   ; passing definiteness up
  (x0 = x3)               ; X3 is the head of X0
  (y2 = x3)               ; pass features of head to Y2
  ((y1 agr) = (y2 agr))   ; det-noun agreement
  ((y3 agr) = (y2 agr))   ; adj-noun agreement
)
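To make the constraint semantics concrete, here is a toy Python rendering of NP,8 (our illustration, not the transfer engine): '=' copies or unifies feature values, and the agreement constraints require matching values between target constituents.

```python
# Toy encoding of rule NP,8; field names are illustrative.
RULE_NP8 = {
    "id": "NP,8",
    "sl": ["DET", "ADJ", "N"],        # X1 X2 X3
    "tl": ["DET", "N", "ADJ"],        # Y1 Y2 Y3
    "align": {1: 1, 2: 3, 3: 2},      # (X1::Y1) (X2::Y3) (X3::Y2)
}

def agrees(a, b, feat="agr"):
    """((y1 agr) = (y2 agr))-style check: same agr value on both sides."""
    return a.get(feat) == b.get(feat)

# Target-side constituents for "una casa bonita" (Y1=DET, Y2=N, Y3=ADJ):
y1 = {"word": "una", "agr": "fem-sg"}
y2 = {"word": "casa", "agr": "fem-sg"}
y3 = {"word": "bonita", "agr": "fem-sg"}

assert agrees(y1, y2) and agrees(y3, y2)  # det-noun and adj-noun agreement
print("una casa bonita satisfies NP,8's agreement constraints")
```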
back to main
81 Done? Not yet
Automatic Rule Adaptation
- NP,8 (R0): ADJ(grande), feat1 = -
- NP,8′ (R1): ADJ(gran), (ADJ feat1) =c +
- Need to restrict application of the general rule (R0) to just post-nominal ADJ
Candidates: un artista grande / un artista gran / un gran artista / un grande artista
82 Add Blocking Constraint
Automatic Rule Adaptation
- NP,8 (R0): ADJ(grande), feat1 = -; add blocking constraint (ADJ feat1) = -
- NP,8′ (R1): ADJ(gran), (ADJ feat1) =c +
- Can we also eliminate incorrect translations automatically?
Candidates: un artista grande / un artista gran / un gran artista / un grande artista
83 Making the grammar tighter
Automatic Rule Adaptation
- If Wc = artista
  - Add feat1 to N(artista)
  - Add agreement constraint to NP,8 (R0) between N and ADJ: ((N feat1) = (ADJ feat1))
Candidates: un artista grande / un artista gran / un gran artista / un grande artista
back to main
84 Example Requiring Minimal Pair
Automatic Rule Adaptation
Proposed Work
- 1. Run the SL sentence through the transfer engine
  - I see them → veo los; correct TL: los veo
- 2. Wi = los, but no Wi′ nor Wc
  - Need a minimal pair to determine the appropriate refinement: I see cars → veo autos
- 3. Triggering feature(s): Δ(veo los, veo autos) → Δ(los, autos) = {pos}
  - PRON(los): pos = pron; N(autos): pos = n
back to main
85 Avenue Architecture
[Architecture diagram: the Elicitation side (Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus) feeds Rule Learning and Morphology; the Run-Time System (Run Time Transfer System, Decoder, Lexical Resources) translates INPUT TEXT into OUTPUT TEXT; the Translation Correction Tool and Rule Refinement Module close the feedback loop.]
back to main
86 Technical Challenges
- Automatic evaluation of the refinement process
- Elicit minimal MT information from non-expert users
back to main
87 Batch Mode Implementation
Automatic Rule Adaptation
Proposed Work
- Given a set of user corrections, apply the refinement module
- For Refinement Operations of errors that can be refined fully automatically using:
  - 1. Correction information only
  - 2. Correction and error information (error type, clue word)
88 Rule Refinement Operations
89 1. Correction info only
Rule Refinement Operations
It is a nice house: Es una casa bonito → Es una casa bonita
90 2. Correction and Error info
Rule Refinement Operations
PP → PREP NP
I am proud of you: Estoy orgullosa de tu → Estoy orgullosa de ti
back to main
91 Interactive Mode Implementation
Automatic Rule Adaptation
Proposed Work
- Extra error information is required to determine the triggering context automatically
  → Need to give other relevant sentences to the user at run-time (minimal pairs)
- For Refinement Operations of errors that can be refined fully automatically but
  - 3. require further user interaction
92 3. Further user interaction
Rule Refinement Operations
I see them: Veo los → Los veo
back to main
93 Refining and Adding Constraints
Proposed Work
- VP,3: VP NP → VP NP (veo los, veo autos)
- VP,3′: VP NP → NP VP, with (NP pos) =c pron (los veo, autos veo)
- Percolate triggering features up to the constituent level:
  NP: PRON → PRON, with (NP pos) = (PRON pos)
- Block application of the general rule (VP,3): VP NP → VP NP, with (NP pos) = (NOT pron)
  → veo autos and los veo (veo los and autos veo are ruled out)
94 User Studies
Proposed Work
- TCTool: new MT classification (Eng2Spa)
- Different language pair: Mapudungun or Quechua → Spanish
- Batch vs. interactive mode
- Amount of information elicited: just corrections vs. error information
back to main
95 Evaluation of Refined MT Output
- 1. Evaluate best translation → automatic evaluation metrics (BLEU, NIST, METEOR)
- 2. Evaluate translation candidate list size → precision (includes parsimony)
96 1. Evaluate Best Translation
- Hypothesis file (translations to be evaluated automatically):
  - Raw MT output
  - Best sentence (picked by the user as correct or requiring the least amount of correction)
  - Refined MT output
- Use the METEOR score at sentence level to pick the best candidate from the list (sketched below)
- → Run all automatic metrics on the new hypothesis file, using user corrections as reference translations
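A small sketch of that candidate-selection step; `overlap_score` is a stand-in for a sentence-level METEOR scorer, and all names are ours:

```python
def overlap_score(hypothesis, reference):
    """Stand-in for sentence-level METEOR: unigram-overlap (Jaccard) score."""
    h, r = set(hypothesis.split()), set(reference.split())
    return len(h & r) / max(len(h | r), 1)

def best_candidate(candidates, reference, score=overlap_score):
    """Pick the candidate closest to the user-corrected translation."""
    return max(candidates, key=lambda c: score(c, reference))

cands = ["Gaudi era un artista grande", "Gaudi era un gran artista"]
ctl = "Gaudi era un gran artista"   # user correction used as reference
print(best_candidate(cands, ctl))   # Gaudi era un gran artista
```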
97 2. Evaluate Translation Candidate List
- Precision = tp / (tp + fp), where tp is binary {0, 1} (1 = the list contains the user correction) and tp + fp = total number of TLs
[Diagram: five source sentences with candidate lists of sizes 5, 3, 5, 3 and 2; the user correction appears in every list but the second, giving precisions 1/5, 0/3, 1/5, 1/3 and 1/2.]
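The same fractions, recomputed in a few lines (illustrative only):

```python
def list_precision(contains_correction, list_size):
    """Binary tp (correction in the list?) over the list's size."""
    return int(contains_correction) / list_size

# (correction in list?, list size) for the five example sentences:
cases = [(True, 5), (False, 3), (True, 5), (True, 3), (True, 2)]
print([round(list_precision(c, n), 2) for c, n in cases])
# [0.2, 0.0, 0.2, 0.33, 0.5]  i.e. 1/5, 0/3, 1/5, 1/3, 1/2
```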
back to main
98 Generalization Power
- When the triggering feature already exists in the feature language (pos, gender, number, etc.)
- I see them → veo los → los veo
- I love him → lo amo (before: amo lo)
- They called me yesterday → me llamaron ayer (before: llamaron me ayer)
- Mary helps her with her homework → Maria le ayuda con sus tareas (before: Maria ayuda le con sus tareas)
back to main
99 Precision, Recall and F1
- Precision = tp / (tp + fp)   (fp: selected, incorrect)
- Recall = tp / (tp + fn)   (fn: correct, not selected)
- F1 = 2PR / (P + R)
back to main
100 Automatic Evaluation Metrics
- BLEU averages the precision for unigrams, bigrams and up to 4-grams, and applies a length penalty [Papineni, 2001]
- NIST: instead of n-gram precision, the information gain from each n-gram is taken into account [NIST, 2002]
- METEOR assigns most of the weight to recall instead of precision, and uses stemming [Lavie, 2004]
back to main
101 Data Set
Proposed Work
- Split the development set (400 sentences) into:
  - Dev set → run user studies, develop the Refinement Module, validate functionality
  - Test set → evaluate the effect of Refinement operations
- Wild test set (from naturally occurring text)
  - Requirement: needs to be fully parsed by the grammar
back to questions
102 Refine vs Bifurcate
- Batch mode → bifurcate (no way to tell if the original rule should never apply)
- Interactive mode → refine (change the original rule) if we can get enough evidence that the original rule never applies
- Corrections involving agreement constraints seem to hold for all cases → refine (open research question)
back to questions
103 More than one correction/sentence
[Diagram: a TL sentence with two corrected words, A and B; correction A is processed first, then correction B.]
- Tetris approach to Automatic Rule Refinement
- Assumption: different corrections to different words → different errors
back to questions
104 Exception: structural divergences
- He danced her out of the room
  → La bailó fuera de la habitación (her he-danced out of the room)
  → La sacó de la habitación bailando (her he-took-out of the room dancing)
- We have no way of knowing that these corrections are related
- Do one error at a time; if TQ decreases over the test set, hypothesize that it's a divergence
- Feed to the Rule Learner as a new (manually corrected) training example
back to questions
105 Constituent order change
- I gave him the tools
  → di a él las herramientas (I-gave to him the tools)
  → le di las herramientas a él (him I-gave the tools to him)
- Desired refinement: VP → VP PP(a PRON) NP becomes VP → VP NP PP(a PRON)
- → Can extract constituent information from the MT output (parse tree) and treat it as one error/correction
106 More than one correction/error
[Diagram: one TL word receives two corrections, A and B.]
- Example: edit and move the same word (like gran, bailó)
- Occam's razor
- Assumption: both corrections are part of the same error
back to questions
107 Wc Example
- I am proud of you
  → Estoy orgullosa de tu (I-am proud of you-nom); Wc = de, tu → ti
  → Estoy orgullosa de ti (I-am proud of you-obl)
- Without Wc information, we would need to increase the ambiguity of the grammar significantly!
  - you → ti: I love you → ti quiero (vs. te quiero, …)
  - you read → ti lees (vs. tu lees, …)
back to questions
108 Lexical bifurcate
- Should the system copy all the features to the new entry?
  - A good starting point
  - Might want to copy just a subset of features (possibly POS-dependent)
- → Open Research Question
back to questions
109-114 Automatic Rule Adaptation
[TCTool demo simulation screenshots: the SL sentence and the best TL picked by the user; changing word order; changing grande into gran; a final screen with callouts 1-3.]
back to main
115 Input to RR module
Automatic Rule Adaptation
- User correction log file
- Transfer engine output (+ parse tree)
sl: I see them  tl: VEO LOS  tree: <((S,0 (VP,3 (VP,1 (V,12 "VEO")) (NP,0 (PRON,23 "LOS")))))>
sl: I see cars  tl: VEO AUTOS  tree: <((S,0 (VP,3 (VP,1 (V,12 "VEO")) (NP,2 (N,13 "AUTOS")))))>
back to main
116 Types of RR Operations
Automatic Rule Adaptation
Completed Work
- Grammar
  - Bifurcate: R0 → {R0, R1 = R0 + constr}; Cov[R0] → Cov[R0, R1]
  - Refine: R0 → R1 = R0 + constr = -, R2 = R0 + constr =c +; Cov[R0] → Cov[R1, R2]
  - Refine: R0 → R1 = R0 + constr; Cov[R0] → Cov[R1]
- Lexicon
  - Bifurcate: Lex0 → {Lex0, Lex1 = Lex0 + constr}
  - Refine: Lex0 → Lex1 = Lex0 + constr
  - Bifurcate: Lex0 → {Lex0, Lex1 = Lex0 with a different TL word}
  - ∅ → Lex1 (adding a lexical item)
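A schematic of the two grammar operations in Python (rule and constraint shapes are our illustration, not the RR module): refine replaces R0 with a constrained R1, while bifurcate keeps R0 and adds a more specific R1.

```python
import copy

def refine(r0, constraint):
    """R0 -> R1 = R0 + constr; Cov[R0] -> Cov[R1]."""
    r1 = copy.deepcopy(r0)
    r1["id"] += "'"
    r1["constraints"].append(constraint)
    return [r1]

def bifurcate(r0, constraint, new_tl):
    """R0 -> {R0, R1 = R0 + constr}; Cov[R0] -> Cov[R0, R1]."""
    r1 = copy.deepcopy(r0)
    r1["id"] += "'"
    r1["tl"] = new_tl
    r1["constraints"].append(constraint)
    return [r0, r1]

np8 = {"id": "NP,8", "sl": ["DET", "ADJ", "N"], "tl": ["DET", "N", "ADJ"],
       "constraints": []}
# The gran/grande case: keep the general N-ADJ rule, add a pre-nominal one.
for rule in bifurcate(np8, "(ADJ feat1) =c +", new_tl=["DET", "ADJ", "N"]):
    print(rule["id"], rule["tl"], rule["constraints"])
# NP,8 ['DET', 'N', 'ADJ'] []
# NP,8' ['DET', 'ADJ', 'N'] ['(ADJ feat1) =c +']
```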
back to main
117 Manual vs Learned Grammars
Automatic Rule Adaptation
- Manual inspection
- Automatic MT Evaluation
back to main
118 Human Oracle experiment
Automatic Rule Adaptation
Completed Work
- As a feasibility experiment, compared raw MT output with manually corrected MT
- The difference was statistically significant (confidence interval test)
- This is an upper bound on how much difference we should expect any refinement approach to make
back to main
119 Invariable process?
- The Rule Refinement process is not invariable → the order of the corrected sentences input to the system matters
- Example
  - 1st gran artista → bifurcate (2 rules); 2nd casa bonito → add agr constraint to only 1 rule (the original, general rule); the specific rule is still incorrect (missing agr constraint)
  - 1st casa bonito → add agr constraint; 2nd gran artista → bifurcate → both rules have the agreement constraint (optimal order)
back to questions
120 User noise?
- Solution
  - Have several users evaluate and correct the same test set → threshold: 90% agreement
    - on the correction
    - on the error information (type, clue word)
  - Only modify the grammar if there is enough evidence of an incorrect rule
back to questions
121 User Studies Map
Proposed Work
[Table: planned user studies organized by mode (batch vs. interactive, plus active learning), information elicited (only corrections vs. corrections + error info), grammar (manual vs. learned, with the RR module), and language pair (Eng2Spa, Mapu2Spa); X marks the planned combinations.]
back to main
122 Recycle corrections of Machine Translation output
back into the system by refining and expanding
existing translation rules
123 1. Correction info only
Rule Refinement Operations
It is a nice house: Es una casa bonito → Es una casa bonita
John and Mary fell: Juan y Maria cayeron → Juan y Maria se cayeron
124 1. Correction info only
Rule Refinement Operations
J y M cayeron → J y M se cayeron
Es una casa bonito → Es una casa bonita
Gaudi was a great artist: Gaudi era un artista grande → Gaudi era un gran artista
I will help him fix the car: Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
125 1. Correction info only
Rule Refinement Operations
I would like to go: Me gustaría que ir → Me gustaría ir
I will help him fix the car: Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
126 2. Correction and Error info
Rule Refinement Operations
PP → PREP NP
I am proud of you: Estoy orgullosa tu → Estoy orgullosa de ti
127 Focus
Rule Refinement Operations
Wally plays the guitar: Wally juega la guitarra → Wally toca la guitarra
I saw the woman: Vi la mujer → Vi a la mujer
I see them: Veo los → Los veo
128 Outside Scope of Thesis
Rule Refinement Operations
John read the book: A Juan leyó el libro → Juan leyó el libro
Where are you from?: Donde eres tu de? → De donde eres tu?