The Translation Correction Tool: English-Spanish user studies
Ariadna Font Llitjós and Jaime Carbonell
aria,jgc_at_cs.cmu.edu
Language Technologies Institute, CMU
Abstract
Machine translation systems should improve with feedback from post-editors, but none do beyond statistical and example-based MT improving marginally when the corrected translation is added to the parallel training data. Rule-based systems to date improve only via manual debugging. In contrast, we introduce a largely automated method for capturing more information from the human post-editor, so that corrections may be performed automatically on translation grammar rules and lexical entries. This paper focuses on the information capture phase and reports on an experiment with English-Spanish translation.
Actual user statistics
- 29 users completed all 32 sentences
- 83% of users were from Spain
- 2/3 had no background in Linguistics
- 75% had a graduate degree and 25% a Bachelor's degree
- Average translations fixed: 26.6 (out of 32)
- Average duration: 130 min (3 minutes per translation)
- Duration range: 28 min to 418 hours

Measuring user accuracy
Gold standard: 10 users' log files (300 files). We are interested in high precision at the expense of lower recall. User corrections were not always consistent with those of other users. Most of the time, when the final translations differed from the gold standard, they were still correct. On average, users produced only 2.5 translations that were worse than the gold standard (out of 26.6). Users got most alignments right.

Usability questionnaire
- 82% said the TCTool is user-friendly
- 100% said it is easy to determine whether a sentence translation is correct, but only 88% felt that determining the source of errors is easy
- Users did not read most of the 23-page tutorial

Conclusions
The TCTool is an online tool that elicits guided and structured user feedback on translations generated by a transfer-based MT system, with the ultimate goal of automatically improving the translation rules. The first English-Spanish user study shows that users can detect errors with high accuracy (89%) but have a harder time classifying errors given the MT error classification above (72%). In general, most of the problems users had were due to not having read the instructions and tutorial.
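The detection and classification accuracy figures above can be computed from user log files with a sketch like the following. The per-sentence log format and field layout here are assumptions for illustration, not the TCTool's actual log format.

```python
# Sketch: scoring one user's judgments against a gold standard.
# Each entry maps a sentence id to (error detected?, error label);
# this schema is hypothetical.

GOLD = {
    1: (True, "agreement"),
    2: (False, None),
    3: (True, "word_order"),
}

USER_LOG = {
    1: (True, "agreement"),
    2: (False, None),
    3: (True, "missing_word"),  # error detected, but misclassified
}

def accuracy(gold, log):
    """Return (detection accuracy, classification accuracy)."""
    detect_hits = classify_hits = 0
    for sid, (has_err, label) in gold.items():
        u_has_err, u_label = log[sid]
        detect_hits += (u_has_err == has_err)
        classify_hits += (u_label == label)
    n = len(gold)
    return detect_hits / n, classify_hits / n

detection, classification = accuracy(GOLD, USER_LOG)
```

In this toy run the user detects all three errors correctly but misclassifies one, mirroring the study's finding that detection accuracy exceeds classification accuracy.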
Goal
Recycle non-expert post-editing efforts to:
- Refine translation rules automatically
- Improve overall translation quality

Proposed approach
- User-friendly online GUI, the Translation Correction Tool, for non-expert bilingual speakers (abstracts away from MT system details)
- MT error classification specifically tailored to elicit the most information possible with the least linguistics terminology
- Active Learning to obtain minimal pairs and do feature detection
- Rule Refinement operations to automatically modify translation rules

AVENUE System
Rule-based MT system for rapid development of MT for resource-poor languages. Requirements: a small number of non-expert bilingual speakers to translate and align an elicitation corpus (Probst et al. 2001). Goal: learn and refine translation rules automatically.

The Translation Correction Tool v0.1

MT error classification
A radically different approach to MT evaluation: instead of being aimed at end-users, translation experts, or developers, it needs to be tailored to non-expert bilingual users.

Hypothesis
Non-expert bilingual users can accurately detect an error in the machine-translated sentence, given the source language sentence and, optionally, some context. They can also probably indicate which other word(s) in the target sentence give us the clue about why there is an error. Example: in agreement errors, what is the word it needs to agree with?

English-Spanish User Studies
Purpose threefold:
- test naïve users' ability to detect and classify MT errors
- assess GUI usefulness and user-friendliness
- assess appropriateness of the MT error classification

32 English sentences were extracted from the AVENUE elicitation corpus. The transfer MT system included a hand-crafted grammar with 12 rules and 442 lexical entries.

Correction Example with the TCTool
Output from the MT system: users need to correct the Spanish translation so that words are in the right form and in the right order. Note that an alignment is missing from "I" to "vi", so users should also add an alignment between these two words.
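The missing-alignment fix described above can be represented minimally as a set of (source index, target index) pairs. The source and target sentences below are assumptions for illustration; only the "I" to "vi" link comes from the example itself.

```python
# Sketch: word alignments as (source_index, target_index) pairs.
# The sentences are hypothetical; only the missing "I" -> "vi"
# link is taken from the correction example above.

src = ["I", "saw", "you"]      # assumed English source
tgt = ["te", "vi"]             # assumed Spanish translation

# Alignments as produced by the MT system; the "I" link is missing.
alignments = {(1, 1), (2, 0)}  # saw -> vi, you -> te

def add_alignment(alignments, src_i, tgt_i):
    """User action: link a source word to a target word."""
    alignments.add((src_i, tgt_i))
    return alignments

# The user adds the missing alignment between "I" and "vi".
add_alignment(alignments, src.index("I"), tgt.index("vi"))
```

Representing alignments as a plain set makes the two alignment actions (add, delete) trivial set insertions and removals.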
Version 0.1 has 5 CGI scripts in Perl and 1 JavaScript, which together produce a total of 8 different HTML pages. This simplified data flow diagram shows how the core of the TCTool works.

Set of possible actions to correct a sentence using the TCTool:
- modify a word (each word has a set of error types associated with it)
- add a word
- delete a word
- drag a word into a different position (change word order)
- add an alignment
- delete an alignment
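The action set above can be logged as structured records so that user corrections are later analyzable for rule refinement. The record schema and field names below are an assumed minimal design, not the TCTool's actual log format.

```python
# Sketch: recording TCTool correction actions as structured events.
# Field names are a hypothetical minimal schema for illustration.

from dataclasses import dataclass
from typing import Optional

@dataclass
class CorrectionAction:
    kind: str                         # one of the six action types above
    word: Optional[str] = None        # word being acted on
    position: Optional[int] = None    # target position for reordering
    error_type: Optional[str] = None  # only for "modify_word"

# One hypothetical correction session.
session = [
    CorrectionAction("modify_word", word="visto", error_type="wrong_form"),
    CorrectionAction("move_word", word="yo", position=0),
    CorrectionAction("add_alignment", word="I"),
]

kinds = [a.kind for a in session]
```

Keeping each action as an explicit record preserves exactly the structured feedback the rule refinement stage needs.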
Future Work
- Interactive dynamic tutorial
- Need higher precision in error classification
- Refine the MT error classification as shown in the snapshot on the right: examples added, drop-down menu added
- Analyze all user feedback to see how we can automate the rule refinement process

Acknowledgements
This research was funded in part by NSF grant number IIS-0121631. We would also like to thank Kenneth Sim and Patrick Milholl for the implementation of the JavaScript.
References
Flanagan, M. 1994. Error Classification for MT Evaluation. Proceedings of AMTA 94, pp. 65-72.
Imamura, K., Sumita, E. and Matsumoto, Y. 2003. Feedback Cleaning of Machine Translation Rules Using Automatic Evaluation. ACL-03, 41st Annual Meeting of the Association for Computational Linguistics, pp. 447-454.
Menezes, A. and Richardson, S. 2001. A Best-First Alignment Algorithm for Automatic Extraction of Transfer Mappings from Bilingual Corpora. Workshop on Example-Based Machine Translation, MT Summit VIII, pp. 35-42.
Papineni, K., Roukos, S. and Ward, T. 1998. Maximum Likelihood and Discriminative Training of Direct Translation Models. Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP-98), pp. 189-192.
Probst, K., Brown, R., Carbonell, J., Lavie, A., Levin, L. and Peterson, E. 2001. Design and Implementation of Controlled Elicitation for Machine Translation of Low-density Languages. Proceedings of the MT2010 Workshop at MT Summit 2001.
Probst, K., Levin, L., Peterson, E., Lavie, A. and Carbonell, J. 2002. MT for Resource-Poor Languages Using Elicitation-Based Learning of Syntactic Transfer Rules. Machine Translation, Special Issue on Embedded MT, 17(4).
Su, K., Chang, J. and Una Hsu, Y. 1995. A Corpus-Based Statistics-Oriented Two-Way Design for Parameterized MT Systems: Rationale, Architecture and Training Issues. TMI-95, 6th Theoretical and Methodological Issues in Machine Translation, pp. 334-353.
White, J.S., O'Connell, T. and O'Mara, F. 1994. The ARPA MT Evaluation Methodologies: Evaluation, Lessons, and Future Approaches. Proceedings of AMTA 94, pp. 193-205.