Title: Introduction to Machine Translation
1Introduction to Machine Translation
- Mitch Marcus
- CIS 530
- Some slides adapted from slides by John Hutchins, Bonnie Dorr, Martha Palmer
2Why use computers in translation?
- Too much translation for humans
- Technical materials too boring for humans
- Greater consistency required
- Need results more quickly
- Not everything needs to be top quality
- Reduce costs
- Any one of these may justify machine translation
or computer aids
3The Early History of NLP (Hutchins): MT in the 1950s and 1960s
- Sponsored by government bodies in USA and USSR (also CIA and KGB)
- assumed goal was fully automatic quality output (i.e. of publishable quality): dissemination
- actual need was translation for information gathering: assimilation
- Survey by Bar-Hillel of MT research
- criticised assumption of FAHQT (fully automatic high-quality translation) as goal
- demonstrated non-feasibility of FAHQT (without unrealisable encyclopedic knowledge bases)
- advocated man-machine symbiosis, i.e. HAMT (human-aided machine translation) and MAHT (machine-aided human translation)
- ALPAC 1966, set up by disillusioned funding agencies
- compared latest systems with early unedited MT output (IBM-GU demo, 1954), criticised for still needing post-editing
- advocated machine aids, and no further support of MT research
- but failed to identify the actual needs of funders (assimilation)
- therefore failed to see that output of IBM-USAF Translator and Georgetown systems was used and appreciated
4Consequences of ALPAC
- MT research virtually ended in US
- identification of actual needs
- assimilation vs. dissemination
- recognition that perfectionism (FAHQT) had neglected
- operational factors and requirements
- expertise of translators
- machine aids for translators
- henceforth three strands of MT
- translation tools (HAMT, MAHT)
- operational systems (post-editing, controlled languages, domain-specific systems)
- research (new approaches, new methods)
- computational linguistics born in the aftermath
5Machine Translation (Pass 0 From Intro Lectures)
6Why use computers in translation?
- Too much translation for humans
- Technical materials too boring for humans
- Greater consistency required
- Need results more quickly
- Not everything needs to be top quality
- Reduce costs
- Any one of these may justify machine translation or computer aids
- (next several slides adapted from Language Weaver)
7Statistical Machine Translation Technology
[Diagram: Spanish/English bilingual text and English text each feed a statistical analysis step; example input: "Que hambre tengo yo"]
8How A Statistical MT System Learns
9Translating a New Document
10Language Weaver
Source: Aljazeera, January 8, 2005
11 Translingual Chat Instant Messaging
Original
Translation
12Language Weaver (Al Jazeera 8/2007)
LanguageWeaver Demo Website
13Language Weaver Hybrid Translation Technology
- Chinese Source Text, Sample 1: [Chinese characters lost in extraction]
- Language Weaver Experimental Syntax MT, Sample 1:
The motor show, has always been the' barometer' of a national car consumption and market potential. The Beijing International Auto Show has more than 1,200 exhibitors from 24 countries and 8 days of receiving more than 40 million visitors, setting a new record in China's auto show, are deeply aware of the automobile market signals.
"China is one of the largest automobile market in the world. Over the years, this phrase implies more auto businessmen. But now, more and more facts indicates that it is to become a reality.
Data from the Motor Show is very convincing. The Beijing Qingnian Bao Report on-the-spot investigation showed that about 35 percent of 35-year-old visitors, 62.1 percent of the respondents said that the truck was mainly to buy a car in the near future to collect information, even at the exhibition may purchase or suitable products 76 of respondents indicated in the past two years to buy private cars.
Since the beginning of this year, the strong growth of the domestic car market. According to the figures released by the National Bureau of Statistics, in the first four months, the country produced 267,900 vehicles, up 27.6 percent in particular, in April, the production of 90,000 vehicles, an increase of 50.5 over the same period last year, setting a record high for the monthly output growth over the past 10-odd years. In terms of sales in the first quarter, manufacturing enterprises in the country sold 188,000 cars, up 22 percent over the same period of last year, up 10.5 percent 11,000 vehicles, dropping by nearly 25 percent lower than the beginning of the year.
14Broadcast Monitoring BBN MAPS Language Weaver
MT
16Three MT Approaches: Direct, Transfer, Interlingual (Vauquois triangle)
17Examples of Three Approaches
- Direct
- I checked his answers against those of the teacher → Yo comparé sus respuestas a las de la profesora
- Rule: check X against Y → comparar X a Y
- Transfer
- Ich habe ihn gesehen → I have seen him
- Rule: clause agt aux obj pred → clause agt aux pred obj
- Interlingual
- I like Mary → Mary me gusta a mí
- Rep: BeIdent (I ATIdent (I, Mary) Likeingly)
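The transfer rule on this slide (clause agt aux obj pred → clause agt aux pred obj) can be read as a reordering over role-labelled constituents. The following is a minimal sketch under stated assumptions: the flat four-word analyzer and the tiny German-English lexicon are hypothetical and exist only to run the slide's one example.

```python
# Toy transfer step: reorder role-labelled constituents, then translate words.
# Role names follow the slide; the lexicon and analyzer are illustrative assumptions.

# Hypothetical word-for-word lexicon covering only the example sentence.
LEXICON = {"ich": "I", "habe": "have", "ihn": "him", "gesehen": "seen"}

SOURCE_ORDER = ("agt", "aux", "obj", "pred")  # German clause order on the slide
TARGET_ORDER = ("agt", "aux", "pred", "obj")  # English clause order on the slide


def analyze(words):
    """Label each source word with a role, assuming the fixed source order."""
    if len(words) != len(SOURCE_ORDER):
        raise ValueError("toy analyzer only handles four-word clauses")
    return dict(zip(SOURCE_ORDER, words))


def transfer(clause):
    """Emit the roles in target order, translating each word via the lexicon."""
    return [LEXICON[clause[role].lower()] for role in TARGET_ORDER]


def translate(sentence):
    return " ".join(transfer(analyze(sentence.split())))


print(translate("Ich habe ihn gesehen"))
```

A real transfer system would apply such rules to parse trees rather than to fixed word positions; the sketch only shows where the reordering knowledge lives.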
18Direct MT Pros and Cons
- Pros
- Fast
- Simple
- Inexpensive
- Cons
- Unreliable
- Not powerful
- Rule proliferation
- Requires too much context
- Major restructuring after lexical substitution
19Transfer MT Pros and Cons
- Pros
- Don't need to find a language-neutral representation
- No translation rules hidden in lexicon
- Relatively fast
- Cons
- N² sets of transfer rules: difficult to extend
- Proliferation of language-specific rules in lexicon and syntax
- Cross-language generalizations lost
20Interlingual MT Pros and Cons
- Pros
- Portable (avoids N² problem)
- Lexical rules and structural transformations stated more simply on normalized representation
- Explanatory Adequacy
- Cons
- Difficult to deal with terms on primitive level: universals?
- Must decompose and reassemble concepts
- Useful information lost (paraphrase)
- (Is thought really language neutral??)
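The N² argument can be made concrete with a little arithmetic. This sketch assumes one transfer rule set per ordered language pair and, for the interlingual design, one analyzer and one generator per language; the function names are illustrative only.

```python
def transfer_modules(n_languages):
    """Ordered language pairs, each needing its own set of transfer rules."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages):
    """One analyzer into and one generator out of the interlingua per language."""
    return 2 * n_languages


for n in (2, 5, 10):
    print(n, transfer_modules(n), interlingua_modules(n))
```

The transfer count grows quadratically while the interlingual count grows linearly, which is the portability advantage listed above.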
21MT Challenges Ambiguity
- Syntactic Ambiguity: I saw the man on the hill with the telescope
- Lexical Ambiguity
- E: book
- S: libro, reservar
- Semantic Ambiguity
- Homography: ball (E) = pelota, baile (S)
- Polysemy: kill (E) = matar, acabar (S)
- Semantic granularity: esperar (S) = wait, expect, hope (E); be (E) = ser, estar (S); fish (E) = pez, pescado (S)
22MT Challenges Divergences
- Meaning of two translationally equivalent phrases is distributed differently in the two languages
- Example
- English: RUN INTO ROOM
- Spanish: ENTER IN ROOM RUNNING
23Spanish/Arabic Divergences
24Divergence Frequency
- 32% of sentences in UN Spanish/English Corpus (5K)
- 35% of sentences in TREC El Norte Corpus (19K)
- Divergence Types
- Categorial (X tener hambre → X have hunger): 98%
- Conflational (X dar puñaladas a Z → X stab Z): 83%
- Structural (X entrar en Y → X enter Y): 35%
- Head Swapping (X cruzar Y nadando → X swim across Y): 8%
- Thematic (X gustar a Y → Y like X): 6%
25MT Lexical Choice- WSD
- Iraq lost the battle.
- Ilakuka centwey ciessta.
- Iraq battle lost.
- John lost his computer.
- John-i computer-lul ilepelyessta.
- John computer misplaced.
26WSD with Source Language Semantic Class Constraints
- lose1 (Agent, Patient: competition) → ciessta
- lose2 (Agent, Patient: physobj) → ilepelyessta
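The two sense entries on this slide can be sketched as a lookup keyed on the semantic class of the Patient. This is a minimal sketch, not a description of any particular system: the noun-to-class table is an assumption inferred from the "battle" and "computer" examples on the previous slide.

```python
# Sense entries from the slide: each sense of "lose" constrains the semantic
# class of its Patient and maps to a different Korean verb.
SENSES = [
    ("lose1", "competition", "ciessta"),
    ("lose2", "physobj", "ilepelyessta"),
]

# Hypothetical noun classes, inferred from the two example sentences.
SEMANTIC_CLASS = {"battle": "competition", "computer": "physobj"}


def choose_translation(patient_noun):
    """Return (sense, Korean verb) whose Patient constraint matches, else None."""
    noun_class = SEMANTIC_CLASS.get(patient_noun)
    for sense, required_class, korean in SENSES:
        if noun_class == required_class:
            return sense, korean
    return None


print(choose_translation("battle"))
print(choose_translation("computer"))
```

Nouns missing from the class table yield `None`, which is where a real system would need a fallback strategy.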
27Lexical Gaps English to Chinese
- ?
- da po - irregular pieces
- da sui - small pieces
- pie duan - line segments
28A Gentle Introduction to Statistical MT: 1949 to 1988
29Warren Weaver 1949 Memorandum I
- Proposes Local Word Sense Disambiguation!
- If one examines the words in a book, one at a time through an opaque mask with a hole in it one word wide, then it is obviously impossible to determine, one at a time, the meaning of words. "Fast" may mean "rapid" or it may mean "motionless" and there is no way of telling which.
- But, if one lengthens the slit in the opaque mask, until one can see not only the central word in question but also say N words on either side, then, if N is large enough one can unambiguously decide the meaning. . .
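Weaver's widening slit can be illustrated with a context window of N words on either side of the ambiguous word. The sketch below is only an illustration of the idea: the two senses of "fast" come from the quotation, but the cue-word lists and the overlap-counting rule are assumptions made up for this example.

```python
import re

# Hypothetical cue words for each sense of "fast"; not from Weaver's memo.
CUES = {
    "rapid": {"car", "runner", "train", "drove", "ran"},
    "motionless": {"held", "stuck", "tied", "anchored", "stood"},
}


def window(tokens, index, n):
    """Tokens within n positions on either side of tokens[index]."""
    return tokens[max(0, index - n):index] + tokens[index + 1:index + 1 + n]


def disambiguate(sentence, target, n):
    """Pick the sense whose cue words overlap the window most, or None."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    index = tokens.index(target)
    context = set(window(tokens, index, n))
    scores = {sense: len(cues & context) for sense, cues in CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


sentence = "The rope held the boat fast"
for n in (0, 1, 3):
    print(n, disambiguate(sentence, "fast", n))
```

With N = 0 the mask shows only the word itself and no decision is possible; the decision appears only once the window is wide enough to reach an informative neighbour.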
30Warren Weaver 1949 Memorandum II
- Proposes Interlingua for Machine Translation!
- Thus it may be true that the way to translate from Chinese to Arabic, or from Russian to Portuguese, is not to attempt the direct route, shouting from tower to tower. Perhaps the way is to descend, from each language, down to the common base of human communication, the real but as yet undiscovered universal language, and then re-emerge by whatever particular route is convenient.
31Warren Weaver 1949 Memorandum III
- Proposes Machine Translation using Information Theory!
- It is very tempting to say that a book written in Chinese is simply a book written in English which was coded into the "Chinese code." If we have useful methods for solving almost any cryptographic problem, may it not be that with proper interpretation we already have useful methods for translation?
- Weaver, W. (1949): Translation. Repr. in: Locke, W.N. and Booth, A.D. (eds.) Machine translation of languages: fourteen essays (Cambridge, Mass.: Technology Press of the Massachusetts Institute of Technology, 1955), pp. 15-23.
32IBM Adopts Statistical MT Approach I (early
1990s)
- In 1949, Warren Weaver proposed that statistical
techniques from the emerging field of information
theory might make it possible to use modern
digital computers to translate text from one
natural language to another automatically.
Although Weaver's scheme foundered on the rocky
reality of the limited computer resources of the
day, a group of IBM researchers in the late
1980's felt that the increase in computer power
over the previous forty years made reasonable a
new look at the applicability of statistical
techniques to translation. Thus the "Candide"
project, aimed at developing an experimental
machine translation system, was born at IBM TJ
Watson Research Center.
33IBM Adopts Statistical MT Approach II
- The Candide group adopted an information-theoretic perspective on the MT problem, which goes as follows. In speaking a French sentence F, a French speaker originally thought up a sentence E in English, but somewhere in the noisy channel between his brain and mouth, the sentence E got "corrupted" to its French translation F. The task of an MT system is to discover $E = \arg\max_{E'} p(F \mid E')\, p(E')$, that is, the MAP-optimal English sentence, given the observed French sentence. This approach involves constructing a model of likely English sentences, and a model of how English sentences translate to French sentences. Both these tasks are accomplished automatically with the help of a large amount of bilingual text.
- As wacky as this perspective might sound, it's no stranger than the view that an English sentence gets corrupted into an acoustic signal in passing from the person's brain to his mouth, and this perspective is now essentially universal in automatic speech recognition.
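The MAP formulation quoted above follows from Bayes' rule in one step. The short derivation below is added for clarity and uses the quotation's symbols (F for the observed French sentence, E' for a candidate English sentence); the denominator p(F) can be dropped because it does not depend on E'.

```latex
\begin{align*}
E &= \arg\max_{E'} \; p(E' \mid F) \\
  &= \arg\max_{E'} \; \frac{p(F \mid E')\, p(E')}{p(F)} \\
  &= \arg\max_{E'} \; p(F \mid E')\, p(E')
\end{align*}
```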
34The Channel Model for Machine Translation
this and following 3 out of 4 slides from
original 1990 IBM MT paper
35Noisy Channel - Why useful?
- Word reordering in translation handled by P(S)
- P(S) factor frees P(T | S) from worrying about word order in the Source language
- Word choice in translation handled by P(T | S)
- P(T | S) factor frees P(S) from worrying about picking the right translation
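The division of labour between P(S) and P(T | S) can be seen in a toy decoder that scores each candidate S by their product. This is a sketch under loud assumptions: the candidate sentences and every probability below are invented for illustration, and a real system would search a huge candidate space rather than enumerate three strings.

```python
# All probabilities below are invented for illustration only.
# P(S): language model over candidate source-language (English) sentences.
P_S = {
    "what hunger have i": 0.0001,
    "i am so hungry": 0.01,
    "hungry i am so": 0.00001,
}

# P(T | S): translation model, keyed by (observed T, candidate S).
OBSERVED = "que hambre tengo yo"
P_T_GIVEN_S = {
    (OBSERVED, "what hunger have i"): 0.3,
    (OBSERVED, "i am so hungry"): 0.1,
    (OBSERVED, "hungry i am so"): 0.1,
}


def decode(target, use_language_model=True):
    """Return the candidate S maximizing P(T | S) * P(S)."""

    def score(source):
        channel = P_T_GIVEN_S.get((target, source), 0.0)
        prior = P_S[source] if use_language_model else 1.0
        return channel * prior

    return max(P_S, key=score)


print(decode(OBSERVED))
print(decode(OBSERVED, use_language_model=False))
```

With the language model switched off, the decoder prefers the literal word-for-word candidate; with it on, the fluent candidate wins even though the translation model cannot tell the two word orders of the same words apart.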
36An Alignment
[Figure labels: distortion, fertility]
37Fertilities and Lexical Probabilities for not
38Fertilities and Lexical Probabilities for hear
39Schematic of Translation Model
[Diagram stages: fertility, null cepts, translation, distortion]
(from What's New in Statistical Machine Translation, Kevin Knight and Philipp Koehn, Tutorial at HLT/NAACL 2003)
40How do we evaluate MT?
- Human-based Metrics
- Semantic Invariance
- Pragmatic Invariance
- Lexical Invariance
- Structural Invariance
- Spatial Invariance
- Fluency
- Accuracy: Number of Human Edits required
- HTER: Human-targeted Translation Edit Rate
- Do you get it?
- Automatic Metrics: Bleu
41BiLingual Evaluation Understudy (BLEU; Papineni, 2001)
- Automatic Technique, but ...
- Requires the pre-existence of Human (Reference) Translations
- Compare n-gram matches between candidate translation and 1 or more reference translations
42Bleu Metric
Chinese-English Translation Example
Candidate 1: It is a guide to action which ensures that the military always obeys the commands of the party.
Candidate 2: It is to insure the troops forever hearing the activity guidebook that party direct.
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
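The n-gram matching behind Bleu can be tried on the candidates and references from this slide. The sketch below computes only the clipped ("modified") n-gram precision; the full Bleu score also combines several n-gram orders and applies a brevity penalty, which are omitted here. The lowercase, letters-only tokenizer is an assumption made for simplicity.

```python
import re
from collections import Counter
from fractions import Fraction

CANDIDATE_1 = ("It is a guide to action which ensures that the military "
               "always obeys the commands of the party.")
CANDIDATE_2 = ("It is to insure the troops forever hearing the activity "
               "guidebook that party direct.")
REFERENCES = [
    "It is a guide to action that ensures that the military will forever "
    "heed Party commands.",
    "It is the guiding principle which guarantees the military forces "
    "always being under the command of the Party.",
    "It is the practical guide for the army always to heed the directions "
    "of the party.",
]


def ngrams(text, n):
    """Count n-grams over lowercase alphabetic tokens (simplified tokenizer)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clip each candidate n-gram count by its maximum count in any reference."""
    cand = ngrams(candidate, n)
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return Fraction(clipped, sum(cand.values()))


for name, cand in (("Candidate 1", CANDIDATE_1), ("Candidate 2", CANDIDATE_2)):
    print(name, modified_precision(cand, REFERENCES, 1),
          modified_precision(cand, REFERENCES, 2))
```

Candidate 1 matches 17 of its 18 unigrams and Candidate 2 matches 8 of 14, so the metric ranks the more fluent and accurate candidate higher, and the gap widens for bigrams.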