Title: 15-381 Artificial Intelligence
1 15-381 Artificial Intelligence
- Machine Translation and Beyond
- Jaime Carbonell
- 18-March-2003
- OUTLINE
- Types of Machine Translation
- Example-Based MT
- Multi-Engine MT
- Other NLP Challenges
2 Typical NLP System
[Diagram: Natural Language input → parsing → Internal representation → Inference/retrieval → generation → Natural Language output]
- NL Data-Base Query
- Parsing: Question → SQL query (via ATN, CF, ...)
- Inference/retrieval: DBMS executes the SQL → table of records
- Generation: no-op (just print the retrieved records)
- Machine Translation
- Parsing: Source Language text → Representation
- Inference/retrieval: no-op
- Generation: Representation → Target Language text (a minimal code sketch of this shared pipeline follows below)
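The same skeleton underlies both examples above; only the middle stage differs. Here is a minimal sketch, with hypothetical function names and placeholder bodies (not from the lecture):

```python
# Minimal sketch of the generic pipeline in the diagram above.
# For an NL database query, infer() would run SQL and generate() is a no-op;
# for MT, infer() is the no-op and generate() emits target-language text.

def parse(nl_input: str) -> dict:
    """Parsing: natural-language input -> internal representation."""
    return {"repr": nl_input}                 # placeholder

def infer(representation: dict) -> dict:
    """Inference/retrieval, e.g. execute an SQL query against a DBMS."""
    return representation                     # no-op in the MT case

def generate(representation: dict) -> str:
    """Generation: internal representation -> natural-language output."""
    return representation["repr"]             # placeholder

def nlp_system(nl_input: str) -> str:
    return generate(infer(parse(nl_input)))
```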
3 What Makes MT Hard?
- Word Sense
- Comer (Spanish) → eat, capture, overlook
- Banco (Spanish) → bank, bench
- Specificity
- Reach (up) → atteindre (French)
- Reach (down) → baisser (French)
- 14 words for snow in Inupiaq
- Lexical holes
- Schadenfreude (German) → happiness at the misery of others; no such English word
- Syntactic Ambiguity (as discussed earlier)
4 Bar Hillel's Argument
- Text must be (minimally) understood before translation can proceed effectively.
- Computer understanding of text is too difficult.
- Therefore, Machine Translation is infeasible.
- (Bar Hillel, 1960)
- Premise 1 is accurate
- Premise 2 was accurate in 1960
- Some forms of text comprehension are becoming possible with present AI technology, but we have a long way to go. Hence, Bar Hillel's conclusion is losing its validity, but only gradually.
5 Types of Machine Translation
[Diagram: the MT pyramid. Source (Arabic) → Syntactic Parsing → Semantic Analysis → Sentence Planning → Text Generation → Target (English); Transfer Rules bridge the analysis and generation sides at intermediate levels; Direct approaches (SMT, EBMT) map source to target at the bottom.]
6 Transfer Grammars: N(N-1)
7 Interlingua Paradigm for MT (2N)
[Diagram: languages L1, L2, L3, L4 each connect to a central Semantic Representation, aka interlingua, rather than pairwise to each other.]
For N = 72 languages: transfer grammars → 72 × 71 = 5112; interlingua → 2 × 72 = 144
8 Interlingua-Based MT
- Requires an Interlingua (language-neutral KR)
- Philosophical debate: Is there an interlingua?
- FOL is not totally language neutral (predicates, functions are expressed in a language)
- Other near-interlinguas (Conceptual Dependency)
- Requires a fully-disambiguating parser
- Domain model of legal objects, actions, relations
- Requires a NL generator (KR → text)
- Applicable only to well-defined technical domains
- Produces high-quality MT in those domains
9 Conceptual Dependency (CD)
- Language-Neutral Knowledge Representation
- All languages reflect the same basic human thought
- Atomic Theory of Language
- Finite number of elemental concepts (acts, relations) → atoms in CD
- Virtually infinite combinations → molecules
- History
- Invented by Roger Schank in the 1970s
- Never completed (best developed for verbs)
- Inspired practical domain-specific interlinguas
10 Conceptual Dependency Examples
"John gave Mary a ball"
  ATRANS
    rel:       POSSESSION
    actor:     JOHN
    object:    BALL
    source:    JOHN
    recipient: MARY
"Mary took the ball from John"
  ATRANS
    rel:       POSSESSION
    actor:     MARY
    object:    BALL
    source:    JOHN
    recipient: MARY
"John sold an apple to Mary for 25 cents"
  ATRANS
    rel:       OWNERSHIP
    actor:     JOHN
    object:    APPLE
    source:    JOHN
    recipient: MARY
  CAUSE
  ATRANS
    rel:       OWNERSHIP
    actor:     MARY
    object:    25 CENTS
    source:    MARY
    recipient: JOHN
11 Conceptual Dependency
- Other conceptual dependency primitive actions include
- PTRANS -- Physical transfer of location
- MTRANS -- Mental transfer of information
- MBUILD -- Create a new idea/conclusion from other info
- INGEST -- Bring any substance into the body
- PROPEL -- Apply a force to an object
- States and causal relations are also part of the representation
- ENABLE (State enables an action)
- RESULT (An action results in a state change)
- INITIATE (State or action initiates mental state)
- REASON (Mental state is the internal reason for an action)
"John broke the window with a hammer"
  PROPEL
    actor:     JOHN
    object:    HAMMER
    direction: WINDOW
  CAUSE
  STATECHANGE
    state:    PHYSICAL-INTEGRITY
    object:   WINDOW
    endpoint: -10
12 Example-Based MT (EBMT)
- Can we use previously translated text to learn how to translate new texts?
- Yes! But it's not so easy
- Two paradigms: statistical MT and EBMT
- Requirements
- Aligned large parallel corpus of translated sentences: S_source → S_target
- Bilingual dictionary for intra-S alignment
- Generalization patterns (names, numbers, dates)
13 EBMT Approaches
- Simplest: Translation Memory
- If S_new = some S_source in the corpus, output the aligned S_target
- Otherwise output the S_target aligned with argmax Sim(S_new, S_source)
- Compositional EBMT
- If a fragment of S_new matches a fragment of S_source, output the corresponding fragment of the aligned S_target
- Prefer maximal-length fragments
- Maximize grammatical compositionality
- Via a target-language grammar,
- Or via an N-gram statistical language model (both approaches are sketched in code below)
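A toy sketch of the two strategies above, assuming a tiny corpus of aligned sentence pairs and a simple word-overlap similarity. The corpus entries and function names are illustrative, and the step that maps a matched source fragment to its target-side fragment (which needs the bilingual dictionary for intra-S alignment) is omitted.

```python
# Toy EBMT sketch: translation memory plus naive fragment matching.

corpus = [
    ("i would like to meet her", "ayükefun trawüael fey engu"),
    ("the tallest man is my father", "chi doy fütra chi wentru fey ta inche ñi chaw"),
]

def similarity(a, b):
    """Word-overlap (Jaccard) similarity between two sentences."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

def translation_memory(s_new):
    """Exact or nearest match: return the target aligned with the best source."""
    best_src, best_tgt = max(corpus, key=lambda pair: similarity(s_new, pair[0]))
    return best_tgt

def matching_fragments(s_new, s_src, min_len=2):
    """For each starting position in s_new, yield the longest word n-gram
    (length >= min_len) that also occurs in s_src."""
    words = s_new.split()
    for i in range(len(words)):
        for j in range(len(words), i + min_len - 1, -1):
            fragment = " ".join(words[i:j])
            if fragment in s_src:
                yield fragment
                break                     # keep only the maximal match at i

print(translation_memory("i would like to meet the tallest man"))
print(list(matching_fragments("i would like to meet the tallest man",
                              "i would like to meet her")))
```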
14 EBMT Example
English:              I would like to meet her.
Mapudungun:           Ayükefun trawüael fey engu.

English:              The tallest man is my father.
Mapudungun:           Chi doy fütra chi wentru fey ta inche ñi chaw.

English:              I would like to meet the tallest man.
Mapudungun (new):     Ayükefun trawüael Chi doy fütra chi wentru
Mapudungun (correct): Ayüken ñi trawüael chi doy fütra wentruengu.
15 Multi-Engine Machine Translation
- MT systems have different strengths
- Rapidly adaptable: statistical, example-based
- Good grammar: rule-based (linguistic) MT
- High precision in narrow domains: interlingua
- Minority-language MT: learnable from an informant
- Combine results of parallel-invoked MT engines
- Select best of multiple translations
- Selection based on optimizing a combination of
- Target-language joint-exponential model
- Confidence scores of individual MT engines (a minimal selection sketch follows below)
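A minimal sketch of the selection step: each engine proposes a translation with a confidence score, and the system keeps the hypothesis that maximizes a weighted combination of target-language model score and engine confidence. The scoring function, weights, and placeholder language model below are illustrative, not the actual joint-exponential model.

```python
import math

# Toy multi-engine selection: combine a target-language model score with
# each engine's own confidence. lm_logprob is a stand-in for a real LM.

def lm_logprob(sentence):
    """Placeholder target-language model score (log-probability)."""
    return -0.5 * len(sentence.split())            # illustrative only

def select_best(hypotheses, lm_weight=1.0, conf_weight=1.0):
    """hypotheses: list of (translation, engine_confidence) pairs."""
    def score(hyp):
        text, confidence = hyp
        return lm_weight * lm_logprob(text) + conf_weight * math.log(confidence)
    return max(hypotheses, key=score)[0]

print(select_best([("the cat sat", 0.7), ("cat the sit on", 0.4)]))
```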
16 Illustration of Multi-Engine MT
17 Statistical Machine Translation (SMT)
- Requires parallel text as training corpus (S, T)
- Requires a large monolingual T-language text
- Builds a statistical translation model (next slides)
- Builds a T-language statistical n-gram model
GOAL: For every new S sentence, compute the maximum-probability translation, given the translation model and the T-language model
18 The Three Ingredients
- English Language Model: p(E)
- English-French Translation Model: p(F|E)
- Decoder: given French F, find Ê = argmax_E p(E|F) = argmax_E p(E) p(F|E) (a toy decoding sketch follows below)
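A toy illustration of the noisy-channel argmax above: enumerate candidate English sentences E and score each by p(E) · p(F|E). Real decoders search an enormous space; here both models and the candidate set are hand-filled stand-ins, purely for illustration.

```python
# Toy noisy-channel decoding: E_hat = argmax_E p(E) * p(F | E).
# Both tables below are illustrative stand-ins, not trained models.

p_E = {                       # English language model
    "the dog barks": 0.6,
    "dog the barks": 0.1,
}
p_F_given_E = {               # translation model p(F | E) for F = "le chien aboie"
    "the dog barks": 0.5,
    "dog the barks": 0.5,
}

def decode(candidates):
    return max(candidates, key=lambda e: p_E.get(e, 0.0) * p_F_given_E.get(e, 0.0))

print(decode(list(p_E)))      # -> "the dog barks"
```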
19 Alignments
Example: English "The proposal will not now be implemented" aligned to French "Les propositions ne seront pas mises en application maintenant"
Translation models are built in terms of hidden alignments: p(F|E)
Each English word generates zero or more French words.
20 Breaking Things Up
- One way of conditioning joint probs (not an approximation); the standard chain-rule form is sketched below
- The modeling problem is how to approximate these terms to achieve
- Statistically robust estimates
- Computationally efficient training
- An effective and accurate model
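The equation behind "conditioning joint probs" did not survive extraction; in the standard IBM-model notation the exact, approximation-free decomposition is (my reconstruction, not a quote from the slide):

p(F \mid E) = \sum_{A} p(F, A \mid E)

p(f_1^m, a_1^m \mid E) = p(m \mid E) \prod_{j=1}^{m} p(a_j \mid a_1^{j-1}, f_1^{j-1}, m, E)\, p(f_j \mid a_1^{j}, f_1^{j-1}, m, E)

The models that follow approximate the conditional terms in this product.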
21 The Gang of Five Models
- A series of five models is trained to bootstrap a detailed translation model. Basic ingredients:
- Word-by-word translation. Parameters p(f|e) between pairs of words.
- Local alignments. Adds alignment probabilities.
- Fertilities and distortions. Allow an English word to generate zero or more French words.
- Tablets. Group words into phrases.
- Non-deficient alignments. Don't ask...
22 Model 3 Example
[Alignment diagram: English "Peter does beat the dog" plus <null> (positions 1-6) aligned to French "Le chien est battu par Pierre" (positions 1-6).]
23 Model 3 Example (cont.)
[Continuation: the same sentence pair, "Peter does beat the dog" plus <null> and "Le chien est battu par Pierre", with its alignment diagram.]
24 Training Mechanics
- Each model is trained using the EM algorithm
- Log-likelihood for Model 1 is concave
- Model 2 counts are accumulated with O(lm) work
- Updates for Model 3 involve an exponential sum; a form of Viterbi training is used, summing over all alignments near the most probable
- The parameters for each model are seeded with those from the previous model (a minimal Model 1 EM sketch follows below)
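As a concrete illustration of the first step in this bootstrapping, here is a minimal, didactic Model 1 EM trainer (word-by-word p(f|e) only; the NULL word and all Model 2-5 refinements are omitted, and none of the names below come from the lecture):

```python
from collections import defaultdict

# Didactic IBM Model 1 EM sketch: estimate word-translation probabilities
# p(f | e) from sentence pairs. NULL word and later-model refinements omitted.

def train_model1(pairs, iterations=10):
    """pairs: list of (english_tokens, french_tokens) pairs."""
    f_vocab = {f for _, fs in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))      # uniform init of p(f | e)
    for _ in range(iterations):
        count = defaultdict(float)                   # expected counts c(f, e)
        total = defaultdict(float)                   # expected counts c(e)
        for es, fs in pairs:                         # E-step
            for f in fs:
                norm = sum(t[(f, e)] for e in es)    # sum over possible alignments
                for e in es:
                    p = t[(f, e)] / norm
                    count[(f, e)] += p
                    total[e] += p
        for (f, e), c in count.items():              # M-step: re-estimate p(f | e)
            t[(f, e)] = c / total[e]
    return t

pairs = [("the dog".split(), "le chien".split()),
         ("the house".split(), "la maison".split()),
         ("a dog".split(), "un chien".split())]
t = train_model1(pairs)
print(t[("chien", "dog")], t[("le", "dog")])   # "chien" should dominate for "dog"
```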
25 Example Parameters
[Table of learned translation parameters for the English word "should"; not recoverable from the extracted text.]
26 Example Parameters (cont.)
[Table of learned translation parameters for the English word "former"; not recoverable from the extracted text.]
27 Best of Alignments
English: What is the anticipated cost of administering and collecting fees under the new proposal?
French: En vertu de les nouvelles propositions, quel est le coût prévu de administration et de perception de les droits?
[The original figure showed the most probable word alignment between the two sentences.]
28 Beyond Parsing, Generation and MT
- Anaphora and Ellipsis Resolution
- Mary got a nice present from Cindy.
- It was her birthday.
- John likes oranges and Mary apples.
- John likes oranges and MacIntosh apples.
- Dialog Processing
- Speech Acts (literal → intended message)
- Do you have the time?
- Social Role context → speech act selection
- General context sometimes needed
29 Social Role Determines Interpretation
10-year-old: I want a juicy hamburger!
Mother:      Not today, perhaps tomorrow.
General:     I want a juicy hamburger!
Aide:        Yes, sir!!
Prisoner 1:  I want a juicy hamburger!
Prisoner 2:  Wouldn't that be nice for once!
30 Flavors of Metaphors
- Fully Frozen Metaphors: Fossilized
- No, not Joe, he's a bull in a china shop
- My old desktop 486 finally kicked the bucket.
- Port Wine Metaphors: Well Aged
- John is a walking encyclopedia
- The Cadillac of stereo VCRs
- Wild Raspberry Metaphors: Freshly Picked
- The stock market? What a roller coaster ride!
- The Won plummets
- The stock options are underwater
- Rum-Chantilly Creative Concoctions
- Tony Blair really pulled a Clinton today
- No, not Joe, he's a bull in a china shop. Worse than that: how about a T-rex in a theme park?
- MERIT low-tar cigarettes break the taste barrier!
31 Merit Cigarette Advertisement
- "Merit Smashes Taste Barrier." (National Smoker Study)
- Majority of smokers confirm 'Enriched Flavor' cigarette matches taste of leading high-tar brands.
- Why do we interpret barrier-smashing as good?
- Metaphors, Metonymy, other hard stuff
32 More NLP Challenges
- Automated Grammar Induction
- Supervised learning from a treebank
- Feasible, and a current research focus
- Unsupervised learning from a corpus
- Still infeasible; a holy grail
- Automated MT Transfer-Rule Induction
- Semi-supervised learning from a word-aligned bilingual corpus
- At the horizon of present research
- Unsupervised learning from a sentence-aligned corpus
- Indefinite-future holy grail