Title: An Overview of the AVENUE Project
1. An Overview of the AVENUE Project
- Presented by
- Lori Levin
- Language Technologies Institute
- School of Computer Science
- Carnegie Mellon University
- Pittsburgh, PA USA
2. AVENUE Project
- Dr. Jaime Carbonell, PI
- Dr. Alon Lavie, Co-PI
- Dr. Lori Levin, Co-PI
- Dr. Robert Frederking
- Dr. Ralf Brown
- Dr. Rodolfo Vega
- Mapudungun
- Dr. Eliseo Cañulef
- Rosendo Huisca
- and others
- Erik Peterson
- Christian Monson
- Ariadna Font Llitjós
- Alison Alvarez
- Roberto Aranovich
- Dr. Jeff Good
- Dr. Katharina Probst
- Hebrew
- Dr. Shuly Wintner
- student
This research was funded in part by NSF grant
number IIS-0121-631.
3. MT Approaches
[Vauquois-triangle diagram: Direct approaches (SMT, EBMT) map the source "Mi chiamo Lori" straight to the target "My name is Lori"; Transfer Rules operate on the output of syntactic parsing (Pronoun-acc-1-sg chiamare-1sg N -> np poss-1sg name BE-pres N); Interlingua approaches go up through semantic analysis to an interlingua (introduce-self) and back down through sentence planning and text generation. AVENUE automates rule learning at the transfer level.]
4. Approaches to MT
- Direct
- Works best with large parallel corpora
- Millions of words
- Can be done without linguistic resources
- Interlingua
- Useful when you are translating between more than two languages
- Requires linguistic knowledge
- Transfer
- Requires linguistic knowledge
5. Useful Resources for MT
- Parallel corpus
- Monolingual corpus
- Lexicon
- Morphological Analyzer (lemmatizer)
- Human Linguist
- Human non-linguist
6. Low-Resource Situations
- Indigenous languages
- May lack large corpora
- May lack a computational linguist
- Strategic Languages
- Aside from standard written Arabic and Chinese
- Resource-rich languages, limited domain
- Most of the large parallel corpora are newspaper text, parliamentary proceedings, or broadcast news
- Fewer resources for conversation related to humanitarian aid
7. Why Machine Translation for Languages with Limited Resources?
- We are in the age of the information explosion
- The internet, the web, and Google: anyone can get the information they want, anytime
- But what about the text in all those other languages?
- How do they read all this English stuff?
- How do we read all the stuff that they put online?
- MT for these languages would enable:
- Better government access to native, indigenous, and minority communities
- Better minority and native community participation in information-rich activities (health care, education, government) without giving up their languages
- Civilian and military applications (disaster relief)
- Language preservation
8. Mixed-Resource Situations
- Some resources are available and others aren't.
9. Omnivorous MT
- Eat whatever resources are available
- Eat large or small amounts of data
10. AVENUE's Inventory
- Resources
- Parallel corpus
- Monolingual corpus
- Lexicon
- Morphological Analyzer (lemmatizer)
- Human Linguist
- Human non-linguist
- Techniques
- Rule based transfer system
- Example Based MT
- Morphology Learning
- Rule Learning
- Interactive Rule Refinement
- Multi-Engine MT
11. The AVENUE Low-Resource Scenario
[System diagram: the Elicitation Tool, applied to an Elicitation Corpus, yields a Word-Aligned Parallel Corpus; Morphology and Rule Learning modules build Lexical Resources and transfer rules; the Run-Time Transfer System and Decoder turn INPUT TEXT into OUTPUT TEXT; the Translation Correction Tool and Rule Refinement Module feed user corrections back into the rules.]
12-14. The AVENUE Low-Resource Scenario (the same system diagram, repeated across animation builds)
15. AVENUE
- Rules can be written by hand or learned automatically
- Hybrid:
- Rule-based transfer
- Statistical decoder
- Multi-engine combinations with SMT and EBMT
16. AVENUE Systems (small and experimental, but tested on unseen data)
- Hebrew-to-English
- Alon Lavie, Shuly Wintner, Katharina Probst
- Hand-written and automatically learned rules
- Automatic rules trained on 120 sentences perform slightly better than about 20 hand-written rules
- Hindi-to-English
- Lavie, Peterson, Probst, Levin, Font, Cohen, Monson
- Automatically learned rules
- Performs better than SMT when training data is limited to 50K words
17. AVENUE Systems (small and experimental, but tested on unseen data)
- English-to-Spanish
- Ariadna Font Llitjos
- Hand-written, automatically corrected
- Mapudungun-to-Spanish
- Roberto Aranovich and Christian Monson
- Hand-written
- Dutch-to-English
- Simon Zwarts
- Hand-written
18. The AVENUE Low-Resource Scenario (diagram repeated)
19. Elicitation
- Get data from someone who is
- Bilingual
- Literate
- With consistent spelling
- Not experienced with linguistics
20. English-Hindi Example
[Screenshot of the Elicitation Tool, by Erik Peterson]
21. English-Chinese Example
Note: the translator has to insert spaces between words in Chinese.
22. English-Arabic Example
23. Purpose of Elicitation
- srcsent Tú caíste
- tgtsent eymi ütrünagimi
- aligned ((1,1),(2,2))
- context tú = Juan (masculino, 2a persona del singular)
- comment You (John) fell

- srcsent Tú estás cayendo
- tgtsent eymi petu ütünagimi
- aligned ((1,1),(2 3,2 3))
- context tú = Juan (masculino, 2a persona del singular)
- comment You (John) are falling

- srcsent Tú caíste
- tgtsent eymi ütrunagimi
- aligned ((1,1),(2,2))
- context tú = María (femenino, 2a persona del singular)
- comment You (Mary) fell

- Provide a small but highly targeted corpus of hand-aligned data
- To support machine learning from a small data set
- To discover basic word order
- To discover how syntactic dependencies are expressed
- To discover which grammatical meanings are reflected in the morphology or syntax of the language
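Records in this field-per-line format are easy to process mechanically. A minimal sketch of loading one record (the parsing code is mine, not AVENUE's; the block-alignment notation "(2 3,2 3)" is expanded to a cross product as one plausible reading):

```python
# Hypothetical loader for one elicitation record of the form shown above.
import re

def parse_record(text):
    rec = {}
    for line in text.strip().splitlines():
        field, _, value = line.partition(" ")  # e.g. "srcsent Tú caíste"
        rec[field] = value.strip()
    # Expand "((1,1),(2 3,2 3))" into (source, target) index pairs;
    # a group like "(2 3,2 3)" is read as a many-to-many block.
    pairs = []
    for src, tgt in re.findall(r"\((\d+(?: \d+)*),(\d+(?: \d+)*)\)", rec["aligned"]):
        for s in src.split():
            for t in tgt.split():
                pairs.append((int(s), int(t)))
    rec["aligned"] = pairs
    return rec
```

The structured output can then feed the rule learner directly.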
24. Languages
- The set of feature structures with English sentences has been delivered to the Linguistic Data Consortium as part of the Reflex program
- Translated (by LDC) into:
- Thai
- Bengali
- Plans to translate into:
- Seven strategic languages per year for five years
- As one small part of a language pack (BLARK) for each language
25. Languages
- Spanish version in progress at New Mexico State University (Helmreich and Cowie)
- Plans to translate into Guarani
- Portuguese version in progress in Brazil (Marcello Modesto)
- Plans to translate into Karitiana (200 speakers)
- Plans to translate into Inupiaq (Kaplan and MacLean)
26. Previous Elicitation Work
- Pilot corpus
- Around 900 sentences
- No feature structures
- Mapudungun
- Two partial translations
- Quechua
- Three translations
- Aymara
- Seven translations
- Hebrew
- Hindi
- Several translations
- Dutch
27. The AVENUE Low-Resource Scenario (diagram repeated)
28. AVENUE Machine Translation System

SL: the old man; TL: ha-ish ha-zaqen

NP::NP [DET ADJ N] -> [DET N DET ADJ]
(
  (X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)
  ((X1 AGR) = 3-SING)
  ((X1 DEF) = DEF)
  ((X3 AGR) = 3-SING)
  ((X3 COUNT) = +)
  ((Y1 DEF) = DEF)
  ((Y3 DEF) = DEF)
  ((Y2 AGR) = 3-SING)
  ((Y2 GENDER) = (Y4 GENDER))
)

- Type information
- Synchronous context-free rules
- Alignments
- x-side constraints
- y-side constraints
- xy-constraints, e.g. ((Y1 AGR) = (X1 AGR))

Jaime Carbonell (PI), Alon Lavie (Co-PI), Lori Levin (Co-PI); rule learning: Katharina Probst
29. Rule Learning: Overview
- Goal: acquire syntactic transfer rules
- Use available knowledge from the major-language side (grammatical structure)
- Three steps:
- Flat Seed Generation: first guesses at transfer rules; flat syntactic structure
- Compositionality Learning: use previously learned rules to learn hierarchical structure
- Constraint Learning: refine rules by learning appropriate feature constraints
30. Flat Seed Rule Generation
31. Flat Seed Rule Generation
- Create a flat transfer rule specific to the sentence pair, partially abstracted to POS
- Words that are aligned word-to-word and have the same POS in both languages are generalized to their POS
- Words that have complex alignments (or not the same POS) remain lexicalized
- One seed rule for each translation example
- No feature constraints associated with seed rules (but mark the example(s) from which each was learned)
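The abstraction step above can be sketched as follows (the data representation and example are mine, not the AVENUE implementation):

```python
# Sketch of flat seed rule generation: words with a one-to-one
# alignment and matching POS are abstracted to their POS tag;
# everything else stays lexicalized.
def flat_seed_rule(src, tgt, alignment):
    """src/tgt: lists of (word, pos); alignment: (i, j) pairs, 1-based."""
    src_counts, tgt_counts = {}, {}
    for i, j in alignment:
        src_counts[i] = src_counts.get(i, 0) + 1
        tgt_counts[j] = tgt_counts.get(j, 0) + 1
    x_side = [w for w, _ in src]
    y_side = [w for w, _ in tgt]
    for i, j in alignment:
        _, pos_s = src[i - 1]
        _, pos_t = tgt[j - 1]
        # one-to-one alignment and same POS => generalize to POS
        if src_counts[i] == 1 and tgt_counts[j] == 1 and pos_s == pos_t:
            x_side[i - 1] = pos_s
            y_side[j - 1] = pos_t
    return {"x": x_side, "y": y_side, "align": alignment}
```

For "a nice house" / "una casa bonita" this yields the flat rule [DET ADJ N] -> [DET N ADJ] with the crossing alignment preserved.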
32. Compositionality Learning
33. Compositionality Learning
- Detection: traverse the c-structure of the English sentence, adding compositional structure for translatable chunks
- Generalization: adjust constituent sequences and alignments
- Two implemented variants:
- Safe Compositionality: there exists a transfer rule that correctly translates the sub-constituent
- Maximal Compositionality: generalize the rule if supported by the alignments, even in the absence of an existing transfer rule for the sub-constituent
34. Constraint Learning
35. Constraint Learning
- Goal: add appropriate feature constraints to the acquired rules
- Methodology:
- Preserve general structural transfer
- Learn specific feature constraints from the example set
- Seed rules are grouped into clusters of similar transfer structure (type, constituent sequences, alignments)
- Each cluster forms a version space: a partially ordered hypothesis space with a specific and a general boundary
- The seed rules in a group form the specific boundary of a version space
- The general boundary is the (implicit) transfer rule with the same type, constituent sequences, and alignments, but no feature constraints
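One elementary move in such a version space can be sketched as follows (the constraint representation is invented for illustration; the actual Constraint Learning search is more involved): generalize a cluster by keeping only the feature constraints that every seed rule shares, stepping from the specific boundary toward the general boundary.

```python
# Sketch: each seed rule carries a set of (feature-path, value)
# constraints; dropping any constraint not shared by all seeds is
# one generalization step in the version space.
def generalize(seed_constraints):
    """seed_constraints: list of sets of (path, value) constraints."""
    shared = set(seed_constraints[0])
    for constraints in seed_constraints[1:]:
        shared &= set(constraints)
    return shared

# Two hypothetical seeds that agree on agreement but not on gender:
seeds = [
    {(("Y2", "AGR"), "3-SING"), (("Y2", "GENDER"), "F")},
    {(("Y2", "AGR"), "3-SING"), (("Y2", "GENDER"), "M")},
]
```

Here only the AGR constraint survives, so the learned rule keeps agreement but does not over-commit to a gender value.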
36. Transfer and Decoding
[Low-resource scenario diagram, repeated]
37. The Transfer Engine
38. Symbolic Decoder
- The system rarely finds a full parse/transfer for the complete input sentence
- The XFER engine produces a comprehensive lattice of segment translations
- The decoder selects the best combination of translation segments
- Searches for the optimal-scoring path of partial translations, based on multiple features:
- Target language model scores
- XFER rule scores
- Path fragmentation
- Other features
- Symbolic decoding is essential for scenarios where there is insufficient data for training a large target LM
- Effective rule scoring is crucial
39. The AVENUE Low-Resource Scenario (diagram repeated)
40. Rule Refinement
[Low-resource scenario diagram, repeated]
41. Interactive and Automatic Refinement of Translation Rules
- Problem: improve machine translation quality
- Proposed solution: put bilingual speakers back into the loop; use their corrections to detect the source of an error and automatically improve the lexicon and the grammar
- Approach: automate post-editing efforts by feeding them back into the MT system
- Automatic refinement of the translation rules that caused an error goes beyond post-editing
- Goal: improve MT coverage and overall quality
42. Technical Challenges
- Automatic evaluation of the refinement process
- Eliciting minimal MT information from non-expert users
43. Error Typology for Automatic Rule Refinement (simplified)
- Missing word
- Extra word
- Wrong word order
- Incorrect word
- Wrong agreement
44. TCTool (Demo)
Interactive elicitation of error information. Actions:
- Add a word
- Delete a word
- Modify a word
- Change word order
45. Types of Refinement Operations (Automatic Rule Adaptation)
- 1. Refine a translation rule: R0 -> R1 (change R0 to make it more specific or more general)
- Example: R0 produces "una casa bonito" for "a nice house"; R1 adds the constraint N gender = ADJ gender and produces "una casa bonita".
46. Types of Refinement Operations (Automatic Rule Adaptation)
- 2. Bifurcate a translation rule: R0 -> R0 (keep the same, general rule) plus R1 (add a new, more specific rule)
- Example: R0 still produces "una casa bonita" for "a nice house"; the new R1, with ADJ type = pre-nominal, produces "un gran artista" for "a great artist".
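The two operations can be sketched on a toy grammar (the rule representation is hypothetical, not the AVENUE rule format): refining edits a rule in place, while bifurcating keeps the general rule and adds a more constrained copy.

```python
# Sketch of the two refinement operations on a toy rule table.
def refine(grammar, rule_id, new_constraints):
    """R0 -> R1: replace the rule's constraints in place."""
    grammar[rule_id]["constraints"] = new_constraints
    return grammar

def bifurcate(grammar, rule_id, extra_constraint):
    """R0 -> R0 + R1: keep R0, add a more specific copy."""
    specific = dict(grammar[rule_id])
    specific["constraints"] = grammar[rule_id]["constraints"] + [extra_constraint]
    grammar[rule_id + "'"] = specific
    return grammar

grammar = {"R0": {"pattern": "DET ADJ N", "constraints": []}}
```

Bifurcation preserves coverage (the general rule still fires) while letting the specific rule win where its extra constraint holds.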
47. Automatic Rule Adaptation: A Concrete Example
- Error information elicitation:
- SL: Gaudí was a great artist
- MT system output (TL): Gaudí era un artista grande
- User correction: Gaudí era un artista grande -> Gaudí era un gran artista
- The correction and the clue word feed the refinement operation typology (here: change word order).
48. Mapudungun
- Indigenous Language of Chile and Argentina
- 1 Million Mapuche Speakers
49. Mapudungun Language
- 900,000 Mapuche people
- At least 300,000 speakers of Mapudungun
- Polysynthetic
- sl: pe-rke-fi-ñ Maria
- gloss: ver(see)-REPORT-3pO-1pSgS/IND
- tl: DICEN QUE LA VI A MARÍA
- "(They say that) I saw Maria."
50. AVENUE Mapudungun
- Joint project between Carnegie Mellon University,
the Chilean Ministry of Education, and
Universidad de la Frontera.
51. Mapudungun-to-Spanish Resources
- Initially:
- Large team of native speakers at the Universidad de la Frontera, Temuco, Chile
- Some knowledge of linguistics
- No knowledge of computational linguistics
- No corpus
- A few short word lists
- No morphological analyzer
- Later: computational linguists with non-native knowledge of Mapudungun
- Other considerations:
- Produce something that is useful to the community, especially for bilingual education
- Experimental MT systems are not useful
52. Mapudungun
[Low-resource scenario diagram, annotated for Mapudungun: a corpus of 170 hours of spoken Mapudungun, Example-Based MT, a spelling checker, and Spanish morphology from UPC, Barcelona.]
53. Mapudungun Products
- http://www.lenguasamerindias.org/
- Click "traductor mapudungún"
- Dictionary lookup (Mapudungun to Spanish)
- Morphological analysis
- Example Based MT (Mapudungun to Spanish)
54. I Didn't See Maria
[Parallel parse trees: Mapudungun "pe-la-fi-ñ Maria" (V pe, VSuff la, VSuff fi, VSuff ñ, N Maria) and Spanish "no vi a María" (no, V vi, a, N María).]
55. Transfer to Spanish: Top-Down
[The same pair of parse trees, with the VP-level transfer rule that inserts the Spanish object marker "a":]

VP::VP [VBar NP] -> [VBar "a" NP]
(
  (X1::Y1) (X2::Y3)
  ((X2 type) = (NOT personal))
  ((X2 human) =c +)
  (X0 = X1)
  ((X0 object) = X2)
  (Y0 = X0)
  ((Y0 object) = (X0 object))
  (Y1 = Y0)
  (Y3 = (Y0 object))
  ((Y1 objmarker person) = (Y3 person))
  ((Y1 objmarker number) = (Y3 number))
  ((Y1 objmarker gender) = (Y3 gender))
)
56. Mapudungun
- Indigenous Language of Chile and Argentina
- 1 Million Mapuche Speakers
57. Collaboration
- Mapuche language experts: Eliseo Cañulef, Rosendo Huisca, Hugo Carrasco, Hector Painequeo, Flor Caniupil, Luis Caniupil Huaiquiñir, Marcela Collio Calfunao, Cristian Carrillan Anton, Salvador Cañulef
- Universidad de la Frontera (UFRO)
- Instituto de Estudios Indígenas (IEI) / Institute for Indigenous Studies
- Chilean funding:
- Chilean Ministry of Education (Mineduc)
- Bilingual and Multicultural Education Program: Carolina Huenchullan Arrúe, Claudio Millacura Salas
58. Accomplishments
- Corpora Collection
- Spoken Corpus
- Collected Luis Caniupil Huaiquiñir
- Medical Domain
- 3 of 4 Mapudungun Dialects
- 120 hours of Nguluche
- 30 hours of Lafkenche
- 20 hours of Pwenche
- Transcribed in Mapudungun
- Translated into Spanish
- Written Corpus
- 200,000 words
- Bilingual Mapudungun-Spanish
- Historical and newspaper text
nmlch-nmjm1_x_0405_nmjm_00
M: <SPA> no pütokovilu kay ko
C: no, si me lo tomaba con agua
M: chumgechi pütokoki femuechi pütokon pu <Noise>
C: como se debe tomar, me lo tomé pués
nmlch-nmjm1_x_0406_nmlch_00
M: Chengewerkelafuymiürke
C: ¡Ya no estabas como gente entonces!
59. Accomplishments
- Developed At UFRO
- Bilingual Dictionary with Examples
- 1,926 entries
- Spelling Corrected Mapudungun Word List
- 117,003 fully-inflected word forms
- Segmented Word List
- 15,120 forms
- Stems translated into Spanish
60. Accomplishments
- Developed at LTI using Mapudungun language resources from UFRO:
- Spelling Checker
- Integrated into OpenOffice
- Hand-built Morphological Analyzer
- Prototype Machine Translation Systems
- Rule-Based
- Example-Based
- Website LenguasAmerindias.org
61. AVENUE Hebrew
- Joint project of Carnegie Mellon University and
University of Haifa
62. Hebrew Language
- Native language of about 3-4 million people in Israel
- Semitic language, closely related to Arabic and with similar linguistic properties
- Root+Pattern word-formation system
- Rich verb and noun morphology
- Particles attach as prefixes to the following word: definite article (H), prepositions (B, K, L, M), coordinating conjunction (W), relativizers ($, K)
- Unique alphabet and writing system
- 22 letters represent (mostly) consonants
- Vowels represented (mostly) by diacritics
- Modern texts omit the diacritic vowels, adding a level of ambiguity: bare word -> word
- Example: MHGR -> mehager, mhagar, mhger
63. Hebrew Resources
- Morphological analyzer developed at the Technion
- Constructed our own Hebrew-to-English lexicon, based primarily on the existing Dahan H-to-E and E-to-H dictionary
- Human computational linguists
- Native speakers
64. Hebrew (low-resource scenario diagram, repeated)
65. Flat Seed Rule Generation
66. Compositionality Learning
67. Constraint Learning
68. Challenges for Hebrew MT
- Paucity of existing language resources for Hebrew:
- No publicly available broad-coverage morphological analyzer
- No publicly available bilingual lexicons or dictionaries
- No POS-tagged corpus or parse treebank for Hebrew
- No large Hebrew/English parallel corpus
- A scenario well suited to the CMU transfer-based MT framework for languages with limited resources
69. Hebrew Morphology Example
- Input word: BWRH
- Character positions: 0 1 2 3 4
- Possible segmentations: BWRH | B + WR + H | B + H + WRH
70. Hebrew Morphology Example
Lattice arcs for BWRH (one feature structure per analysis):

Y0: ((SPANSTART 0) (SPANEND 4) (LEX BWRH) (POS N) (GEN F) (NUM S) (STATUS ABSOLUTE))
Y1: ((SPANSTART 0) (SPANEND 2) (LEX B) (POS PREP))
Y2: ((SPANSTART 1) (SPANEND 3) (LEX WR) (POS N) (GEN M) (NUM S) (STATUS ABSOLUTE))
Y3: ((SPANSTART 3) (SPANEND 4) (LEX LH) (POS POSS))
Y4: ((SPANSTART 0) (SPANEND 1) (LEX B) (POS PREP))
Y5: ((SPANSTART 1) (SPANEND 2) (LEX H) (POS DET))
Y6: ((SPANSTART 2) (SPANEND 4) (LEX WRH) (POS N) (GEN F) (NUM S))
Y7: ((SPANSTART 0) (SPANEND 4) (LEX BWRH) (POS LEX))
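The analyses form a lattice over character positions. A small sketch (arc spans taken from the feature structures above; the enumeration code is mine, not the AVENUE engine) that lists every complete segmentation of the span 0-4:

```python
# Morphology lattice as arcs (start, end, lexeme, POS), keyed by name.
arcs = {
    "Y0": (0, 4, "BWRH", "N"),
    "Y1": (0, 2, "B", "PREP"),
    "Y2": (1, 3, "WR", "N"),
    "Y3": (3, 4, "LH", "POSS"),
    "Y4": (0, 1, "B", "PREP"),
    "Y5": (1, 2, "H", "DET"),
    "Y6": (2, 4, "WRH", "N"),
    "Y7": (0, 4, "BWRH", "LEX"),
}

def segmentations(lattice, start=0, end=4):
    """Enumerate arc sequences that exactly tile the span [start, end)."""
    if start == end:
        return [[]]
    paths = []
    for name, (s, e, _, _) in lattice.items():
        if s == start:
            for rest in segmentations(lattice, e, end):
                paths.append([name] + rest)
    return paths
```

The transfer engine then works over all of these competing analyses at once rather than committing to one up front.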
71. Sample Output (dev-data)
- maxwell anurpung comes from ghana for israel four
years ago and since worked in cleaning in hotels
in eilat - a few weeks ago announced if management club
hotel that for him to leave israel according to
the government instructions and immigration
police - in a letter in broken english which spread among
the foreign workers thanks to them hotel for
their hard work and announced that will purchase
for hm flight tickets for their countries from
their money
72. Quechua-to-Spanish MT
- V-Unit-funded summer project in Cusco (Peru)
- June-August 2005; preparations and data collection started earlier
- Intensive Quechua course at the Centro Bartolome de las Casas (CBC)
- Worked with two native and one non-native Quechua speakers on developing infrastructure (correcting elicited translations, segmenting and translating a list of the most frequent words)
73. Quechua-to-Spanish Prototype MT System
- Stem lexicon (semi-automatically generated): 753 lexical entries
- Suffix lexicon: 21 suffixes (out of about 150 in Cusihuaman)
- Quechua morphology analyzer
- 25 translation rules
- Spanish morphology generation module
- User studies: 10 sentences, 3 users (2 native, 1 non-native)
74. Quechua Facts
- Agglutinative language
- A stem often takes 10 to 12 suffixes, and can take up to 28
- Supposedly clear-cut morpheme boundaries, but in reality several suffixes change when followed by certain other suffixes
- No irregular verbs, nouns, or adjectives
- Does not mark gender
- No adjective agreement
- No definite or indefinite articles (topic and focus markers perform a task similar to that of articles and intonation in English or Spanish)
75. Quechua Examples
- takini (also written takiniy)
- sing + 1sg ("I sing") -> canto
- takishani (takishaniy)
- sing + progressive + 1sg ("I am singing") -> estoy cantando
- takipakuqchu?
- taki: sing
- -paku: to join a group to do something
- -q: agentive
- -chu: interrogative
- -> ¿(para) cantar con la gente (del pueblo)? ("(to) sing with the people (of the village)?")
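The suffix chains in these examples can be peeled off mechanically. A sketch (tiny toy lexicon from the glosses above; greedy longest-match is my simplification, since the real analyzer must also handle the suffix alternations mentioned earlier):

```python
# Illustrative greedy segmenter for the example verbs, not the
# AVENUE Quechua analyzer.
STEMS = {"taki": "sing"}
SUFFIXES = {"paku": "joint-action", "sha": "progressive",
            "chu": "interrogative", "ni": "1sg", "q": "agentive"}

def segment(word):
    """Return (stem, [suffixes]) or None if the word cannot be analyzed."""
    for stem in STEMS:
        if word.startswith(stem):
            rest, morphs = word[len(stem):], []
            while rest:
                for suf in sorted(SUFFIXES, key=len, reverse=True):
                    if rest.startswith(suf):
                        morphs.append(suf)
                        rest = rest[len(suf):]
                        break
                else:
                    return None  # unanalyzable residue
            return stem, morphs
    return None
```

For example, segment("takipakuqchu") recovers the stem plus the three suffixes glossed above.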
76. Quechua Resources
- A few native speakers, not linguists
- A computational linguist learning Quechua
- Two fluent, but non-native linguists
77. Quechua
[Low-resource scenario diagram, annotated for Quechua: the parallel corpus was obtained by OCR with manual correction.]
78. Grammar Rules

takishani -> estoy cantando (I am singing)

{VBar,3}
VBar::VBar [V VSuff VSuff] -> [V V]
(
  (X1::Y2)
  ((x0 person) = (x3 person))
  ((x0 number) = (x3 number))
  ((x2 mood) =c ger)
  ((y2 mood) = (x2 mood))
  ((y1 form) =c estar)
  ((y1 person) = (x3 person))
  ((y1 number) = (x3 number))
  ((y1 tense) = (x3 tense))
  ((x0 tense) = (x3 tense))
  ((y1 mood) = (x3 mood))
  ((x3 inflected) =c +)
  ((x0 inflected) = +)
)

Spanish morphology generation:
- [lex = cantar, mood = ger] -> cantando
- [lex = estar, person = 1, number = sg, tense = pres, mood = ind] -> estoy
79. Hindi Resources
- Large statistical lexicon from the Linguistic Data Consortium (LDC)
- Parallel corpus from LDC
- Morphological analyzer-generator from LDC
- Lots of native speakers
- Computational linguists with little or no knowledge of Hindi
- Experimented with the size of the parallel corpus: miserly and large scenarios
80. Hindi
[Low-resource scenario diagram, annotated for Hindi: a parallel corpus feeding EBMT and SMT; an elicitation corpus of 15,000 noun phrases from the Penn TreeBank. Supported by DARPA TIDES.]
81. Manual Transfer Rules: Example
[Tree diagrams: Hindi NP "jIvana ke eka aXyAya" and English NP "one chapter of life"]

NP1 ke NP2 -> NP2 of NP1
Ex: jIvana ke eka aXyAya
    life of (one) chapter
    -> a chapter of life

{NP,12}
NP::NP [PP NP1] -> [NP1 PP]
(
  (X1::Y2) (X2::Y1)
  ((x2 lexwx) = 'kA')
)

{NP,13}
NP::NP [NP1] -> [NP1]
(
  (X1::Y1)
)

{PP,12}
PP::PP [NP Postp] -> [Prep NP]
(
  (X1::Y2) (X2::Y1)
)
82. Hindi-English
- Very miserly training data
- Seven combinations of components
- Strong decoder allows re-ordering
- Three automatic scoring metrics