1
Reordering Model Using Syntactic Information of a
Source Tree for Statistical Machine Translation
  • Kei Hashimoto (1), Hirofumi Yamamoto (2,3),
  • Hideo Okuma (2,4), Eiichiro Sumita (2,4),
  • and Keiichi Tokuda (1,2)

(1) Nagoya Institute of Technology
(2) National Institute of Information and Communications Technology
(3) Kinki University
(4) ATR Spoken Language Communication Research Labs.
2
Background (1/2)
  • Phrase-based statistical machine translation
  • Can model local word reordering
  • Short idioms
  • Insertions and deletions of words
  • Errors in global word reordering
  • Word reordering constraint techniques
  • Linguistically syntax-based approaches
  • Source tree, target tree, or both tree structures
  • Formal constraints on word permutations
  • IBM distortion model, lexical reordering model, ITG

3
Background (2/2)
  • Imposing a source tree on ITG (IST-ITG)
  • Extension of ITG constraints
  • Introduce a source sentence tree structure
  • Cannot evaluate the accuracy of the target word
    orders
  • Reordering model using syntactic information
  • Extension of IST-ITG constraints
  • Rotation of source-side parse-tree
  • Can be easily introduced into a phrase-based
    translation system

4
Outline
  • Background
  • ITG and IST-ITG constraints
  • Proposed reordering model
  • Training of the proposed model
  • Decoding using the proposed model
  • Experiments
  • Conclusions and future work

5
Inversion transduction grammar
  • ITG constraints
  • All possible binary tree structures are generated
    from the source word sequence
  • The target sentence is obtained by rotating any
    node of the generated binary trees
  • Can reduce the number of target word orders
  • Does not consider any particular tree-structure instance

6
Imposing source tree on ITG
  • Directly introduces a source sentence tree
    structure into ITG

[Figure: source sentence tree structure]
The target sentence is obtained by rotating any node
of the source sentence tree structure (sketched in code below).
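To make this concrete, here is a minimal Python sketch (not the authors' code) that enumerates every target order a fixed binary source tree licenses, by independently keeping (monotone) or reversing (swap) the children at each internal node. The tree for "A B C D E" is a hypothetical example.

```python
from itertools import product

def orders(tree):
    """Yield every target word order licensed by IST-ITG for a fixed
    binary source tree: a leaf is a word (str); an internal node is a
    (left, right) pair whose children may be kept or swapped."""
    if isinstance(tree, str):              # leaf: a single source word
        yield [tree]
        return
    left, right = tree
    for l, r in product(orders(left), orders(right)):
        yield l + r                        # monotone: keep child order
        yield r + l                        # swap: reverse child order

# Hypothetical parse of the source sentence "A B C D E"
src_tree = (("A", "B"), ("C", ("D", "E")))
for order in orders(src_tree):
    print(" ".join(order))
# A binary tree with n leaves has n-1 internal nodes, so at most
# 2^(n-1) target orders are licensed (here 2^4 = 16).
```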
7
Non-binary tree
  • The parsing results sometimes produce non-binary
    trees

Any reordering of the child nodes in a non-binary
subtree is allowed.
8
Problem of IST-ITG
  • Cannot evaluate the accuracy of the target word
    reordering
  • → Assigns an equal probability to all rotations

→ Propose a reordering model using syntactic information
9
Outline
  • Background
  • ITG and IST-ITG constraints
  • Proposed reordering model
  • Training of the proposed model
  • Decoding using the proposed model
  • Experiments
  • Conclusions and future work

10
Overview of the proposed method
  • The rotation of each subtree type is modeled
    (a hypothetical model shape is sketched below)

[Figure: source-side parse-tree → subtree types → reordering probabilities (the reordering model using syntactic information)]
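As an illustration only (rule names and values below are hypothetical), the model can be pictured as a table mapping each one-level subtree type to the probability that its children keep the source order; the swap probability is the complement.

```python
# Hypothetical reordering model: P(monotone | subtree type).
# P(swap | subtree type) = 1.0 - P(monotone | subtree type).
reordering_model = {
    "S -> NP VP":     0.90,
    "NP -> DT NN":    0.95,
    "VP -> VB NP PP": 0.70,
}

p_swap = 1.0 - reordering_model["VP -> VB NP PP"]  # 0.30
```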
11
Related work 1
  • Statistical syntax-directed translation with
    extended domain of locality [Liang Huang et al.
    2006]
  • Extracts rules for tree-to-string translation
  • Considers syntactic information
  • Considers multi-level trees on the source side

[Figure: multi-level source-side tree with S, NP, VP, and VB nodes]
12
Related work 2
  • Proposed reordering model
  • Used in phrase-based translation
  • Estimation of the proposed model is conducted
    independently of phrase extraction
  • Child-node reordering within a one-level subtree
  • Cannot represent complex reorderings
  • Reordering using syntactic information can be
    easily introduced into phrase-based translation

13
Training algorithm (1/3)
  • Reordering model training
  • 1. Word alignment
  • 2. Parsing source sentence

[Figure: 1. word alignment between source and target; 2. source-side parse tree with S, VP, NP, AUX, NN, and DT nodes]
14
Training algorithm (2/3)
  • 3. Word alignments and source-side parse-trees
    are combined
  • 4. Rotation position is checked (monotone or
    swap)

15
Training algorithm (3/3)
  • 5. Reordering probability of the subtree is
    estimated by counting each rotation position
  • Non-binary subtrees
  • Any ordering of the child nodes is allowed
  • Rotation positions are categorized into only two
    types
  • → Monotone or other (swap)

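Steps 1-5 amount to relative-frequency estimation over (subtree type, rotation position) events. A minimal sketch, assuming the word-aligned, source-parsed corpus has already been reduced to such pairs (all names below are hypothetical):

```python
from collections import Counter, defaultdict

def estimate_reordering_model(samples):
    """samples: iterable of (subtree_type, rotation) pairs, where
    rotation is 'monotone' or 'swap' (the two categories of step 5).
    Returns P(monotone | subtree_type) by relative frequency."""
    counts = defaultdict(Counter)
    for subtree_type, rotation in samples:
        counts[subtree_type][rotation] += 1
    return {t: c["monotone"] / (c["monotone"] + c["swap"])
            for t, c in counts.items()}

# Toy corpus of extracted events
samples = [("VP -> AUX VP", "monotone"), ("VP -> AUX VP", "swap"),
           ("VP -> AUX VP", "monotone"), ("NP -> DT NN", "monotone")]
model = estimate_reordering_model(samples)  # {'VP -> AUX VP': 0.667, ...}
```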
16
Removing subtree samples
  • Subtree samples whose target word order cannot be
    derived by rotating nodes of the source-side
    parse-tree are removed
  • Linguistic reasons
  • Differences between sentence structures
  • Non-linguistic reasons
  • Errors in word alignment and syntactic analysis

17
Clustering of subtree types
  • The number of possible subtree types is large
  • Unseen subtree types
  • Subtree types observed only a few times
  • → Cannot be modeled accurately
  • Clustering of subtree types (sketched below)
  • Subtree types whose number of training samples is
    below a heuristic threshold are clustered
  • The clustered model is estimated from the counts
    of the clustered subtree types
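A sketch of the thresholding heuristic, assuming per-type rotation counts are available. Pooling every rare type into a single clustered entry is one plausible reading of the slide, not necessarily the authors' exact procedure.

```python
from collections import Counter

def cluster_rare_types(counts, threshold=10):
    """counts: dict mapping subtree type -> Counter of rotation
    positions. Types observed fewer than `threshold` times are
    pooled into one clustered entry used as a backoff model."""
    clustered, kept = Counter(), {}
    for subtree_type, c in counts.items():
        if sum(c.values()) < threshold:
            clustered.update(c)          # pool the rare type's counts
        else:
            kept[subtree_type] = c
    kept["<CLUSTERED>"] = clustered      # hypothetical backoff key
    return kept
```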

18
Decoding using the proposed model
  • Phrase-based decoder
  • Constrained by the IST-ITG constraints
  • The target sentence is generated by rotating nodes
    of the source-side parse-tree
  • Target word orderings that break a source phrase
    are not allowed
  • Check the rotation positions of the subtrees
  • Calculate the reordering probabilities

19
Decoding using the proposed model
  • Calculate the reordering probability

Example: source "A B C D E" is translated as "b a c d e"; the subtrees take the rotation positions monotone, swap, and monotone.
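The reordering score of a derivation accumulates over the subtrees' chosen rotation positions. A minimal log-space scoring sketch (the model shape, subtree-type names, and default value are assumptions):

```python
import math

def reordering_logprob(rotations, model, default_monotone=0.5):
    """rotations: (subtree_type, rotation) pairs chosen by the decoder;
    model maps subtree type -> P(monotone | subtree type)."""
    logprob = 0.0
    for subtree_type, rotation in rotations:
        p_mono = model.get(subtree_type, default_monotone)
        p = p_mono if rotation == "monotone" else 1.0 - p_mono
        logprob += math.log(p)
    return logprob

# The slide's example derivation: monotone, swap, monotone
rotations = [("T1", "monotone"), ("T2", "swap"), ("T3", "monotone")]
score = reordering_logprob(rotations, {"T1": 0.8, "T2": 0.3, "T3": 0.9})
```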
20
Decoding using the proposed model
  • Calculate the reordering probability

Example: source "A B C D E" is translated as "c d e a b"; the subtrees take the rotation positions swap, monotone, and monotone.
21
Rotation position included in a phrase
  • The rotation position cannot be determined
  • Word alignments inside a phrase are not observable
  • → Assign the higher of the two probabilities
    (monotone or swap)

Example: two subtrees are covered entirely by phrases and are each scored with their higher probability; the remaining subtree's rotation position is swap.
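A one-function sketch of this rule (names are assumptions): when a subtree falls entirely inside a phrase, score it with the larger of its two probabilities.

```python
def in_phrase_prob(subtree_type, model, default_monotone=0.5):
    """The rotation position is unobservable inside a phrase, so take
    the higher of P(monotone) and P(swap) for this subtree type."""
    p_mono = model.get(subtree_type, default_monotone)
    return max(p_mono, 1.0 - p_mono)
```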
22
Outline
  • Background
  • ITG and IST-ITG constraints
  • Proposed reordering model
  • Training of the proposed model
  • Decoding using the proposed model
  • Experiments
  • Conclusions and future work

23
Experimental conditions
  • Compared methods
  • Baseline: IBM distortion + lexical reordering
    models
  • IST-ITG: Baseline + IST-ITG constraints
  • Proposed: Baseline + proposed reordering model
  • Training
  • GIZA++ toolkit (word alignment)
  • SRI language model toolkit
  • Minimum error rate training (BLEU-4)
  • Charniak parser (source-side parsing)

24
Experimental conditions (E-J)
  • English-to-Japanese translation experiment
  • JST Japanese-English paper abstract corpus

                                English    Japanese
  Training data (sentences)     1.0M       1.0M
  Training data (words)         24.6M      28.8M
  Development data (sentences)  2.0K       2.0K
  Development data (words)      50.1K      58.7K
  Test data (sentences)         2.0K       2.0K
  Test data (words)             49.5K      58.0K
  Dev. and test data: single reference
25
Experimental results (E-J)
  • Proposed reordering model
  • Results on the test set

  Subtree samples     13M
  Removed samples     3M (25.38%)
  Subtree types       54K
  Threshold           10
  Number of models    6K (clustered)
  Coverage            99.29%

           Baseline   IST-ITG   Proposed
  BLEU-4   27.87      29.31     29.80

  Improved 0.49 BLEU points over IST-ITG
26
Experimental conditions (E-C)
  • English-to-Chinese translation experiment
  • NIST MT08 English-to-Chinese translation track

                                English    Chinese
  Training data (sentences)     4.6M       4.6M
  Training data (words)         79.6M      73.4M
  Development data (sentences)  1.6K       1.6K
  Development data (words)      46.4K      39.0K
  Test data (sentences)         1.9K       1.9K
  Test data (words)             45.7K      47.0K (avg.)
  Test data: 4 references; dev. data: single reference
27
Experimental results (E-C)
  • Proposed reordering model
  • Results on the test set

  Subtree samples     50M
  Removed samples     10M (20.36%)
  Subtree types       2M
  Threshold           10
  Number of models    19K (clustered)
  Coverage            99.45%

           Baseline   IST-ITG   Proposed
  BLEU-4   17.54      18.60     18.93

  Improved 0.33 BLEU points over IST-ITG
28
Conclusions and future work
  • Conclusions
  • Extension of the IST-ITG constraints
  • Reordering using syntactic information can be
    easily introduced into phrase-based translation
  • Improved BLEU by 0.49 points (E-J) over IST-ITG
  • Future work
  • Simultaneous training of the translation and
    reordering models
  • Handling complex reorderings caused by differences
    between sentence tree structures

29
Thank you very much!
30
Number of target word orders
  • Number of target word orders for a target word
    sequence (binary tree)

  # of words   IST-ITG   ITG            No constraint
  1            1         1              1
  2            2         2              2
  4            8         22             24
  8            128       8,558          40,320
  10           512       206,098        3,628,800
  15           16,384    745,387,038    1,307,674,368,000
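The table can be reproduced in a few lines: IST-ITG over a fixed binary tree licenses 2^(n-1) orders, the ITG counts are the large Schröder numbers, and the unconstrained count is n!. A verification sketch:

```python
import math

def ist_itg_orders(n):
    # One monotone/swap choice per internal node of a fixed binary tree
    return 2 ** (n - 1)

def itg_orders(n):
    # ITG-reachable permutations of n words: large Schroeder numbers,
    # r_0 = 1, r_1 = 2, (k+1)*r_k = 3*(2k-1)*r_{k-1} - (k-2)*r_{k-2}
    r = [1, 2]
    for k in range(2, n):
        r.append((3 * (2 * k - 1) * r[k - 1] - (k - 2) * r[k - 2]) // (k + 1))
    return r[n - 1]

for n in (1, 2, 4, 8, 10, 15):
    print(n, ist_itg_orders(n), itg_orders(n), math.factorial(n))
# e.g. 15 16384 745387038 1307674368000 (matches the last table row)
```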
31
Example of the subtree model
  • Monotone probabilities
    (swap probability = 1.0 − monotone probability)

  Subtree type s        P(monotone)
  S → PP , NP VP .      0.764
  NP → DT NN NN         0.816
  VP → AUX VP           0.664
  VP → VB NP PP         0.864
  NP → NP PP            0.837
  NP → DT JJ NN         0.805