Compact WFSA-based Language Model and Its Application in Statistical Machine Translation



1
Compact WFSA-based Language Model and Its
Application in Statistical Machine Translation
  • Xiaoyin Fu, Wei Wei, Shixiang Lu,
  • Dengfeng Ke, Bo Xu
  • Interactive Digital Media Technology Research
    Center, CASIA

2
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

3
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

4
Task
  • N-gram Language Model
  • assigns probabilities to strings of words or tokens
  • Let w_1^L denote a string of L tokens over a fixed
    vocabulary
  • smoothing techniques
  • back-off
  • Define P(w_1^L) ≈ ∏_{i=1}^{L} P(w_i | w_{i-n+1}^{i-1})
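The back-off scheme named above can be illustrated with a minimal Python sketch; the probability tables and values are made up for illustration, not taken from the paper:

```python
# Minimal back-off n-gram query (toy log10 tables, not the paper's data).
# probs maps an n-gram tuple to its log probability; backoffs maps a context
# tuple to its back-off weight.
probs = {
    ("the",): -1.0,
    ("cat",): -2.0,
    ("the", "cat"): -0.5,
}
backoffs = {
    ("the",): -0.3,
}

def backoff_logprob(ngram):
    """P(w | context): use the longest matching n-gram, otherwise back off."""
    if ngram in probs:
        return probs[ngram]
    if len(ngram) == 1:
        return -99.0  # floor for unseen unigrams
    # drop the oldest context word and add the context's back-off weight
    return backoffs.get(ngram[:-1], 0.0) + backoff_logprob(ngram[1:])

print(backoff_logprob(("the", "cat")))   # -0.5: bigram found directly
print(backoff_logprob(("cat", "the")))   # -1.0: backs off to unigram "the"
```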

5
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

6
Problems
  • Queries in the trie structure
  • Useless (redundant) queries
  • Problems in Forward Query
  • Problems in Back-off Query

7
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

8
Solution
  • Another point of view
  • a random procedure
  • a continuous process
  • Benefit
  • Speed up Forward Query
  • Speed up Back-off Query
  • Goal
  • Fast
  • Compact

9
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

10
Our Approaches
  • FAST
  • WFSA
  • 5-tuple M = (Q, Σ, I, F, δ)
  • Definition

Q  a set of states
I  a set of initial states
F  a set of final states
Σ  an alphabet, representing the input and output labels
δ  δ ⊆ Q × (Σ ∪ {ε}) × Q, the transition relation
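The 5-tuple can be written down directly as a small data structure; the states, labels, and weights below are made-up examples, not the paper's automaton:

```python
# A minimal WFSA as the 5-tuple M = (Q, Σ, I, F, δ) from the slide
# (illustrative only; states and weights are invented).
from dataclasses import dataclass, field

EPS = ""  # ε: a transition that fires without consuming input

@dataclass
class WFSA:
    Q: set                  # states
    sigma: set              # alphabet of input/output labels
    I: set                  # initial states
    F: set                  # final states
    delta: dict = field(default_factory=dict)  # (state, label) -> (next state, weight)

m = WFSA(
    Q={0, 1, 2},
    sigma={"a", "b"},
    I={0},
    F={2},
    delta={(0, "a"): (1, 0.5), (1, "b"): (2, 0.5), (1, EPS): (0, 0.1)},
)
print((0, "a") in m.delta)  # True: a labeled transition out of the initial state
```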
11
Our Approaches
  • FAST
  • WFSA
  • 5-tuple M = (Q, Σ, I, F, δ)
  • Example

Q  a set of states
I  a set of initial states
F  a set of final states
Σ  an alphabet, representing the input and output labels
δ  δ ⊆ Q × (Σ ∪ {ε}) × Q, the transition relation
12
Our Approaches
  • Compact
  • Trie
  • Sorted Array

13
Our Approaches
  • Compact
  • Trie
  • Sorted Array
  • Link index
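One way to realize a sorted-array trie with link indices, assumed from the slide's keywords (the word ids and ranges below are invented; the paper's exact record layout is not shown in the transcript):

```python
# Sketch of a trie level stored as a sorted array, with link indices giving
# each node's child range in the next level (layout assumed, toy data).
import bisect

# level 1: unigram word ids, sorted, plus [begin, end) ranges into level 2
uni_words = [3, 7, 9]
uni_links = [(0, 2), (2, 3), (3, 3)]   # children of 3: bi_words[0:2]; of 7: [2:3]
bi_words  = [5, 8, 4]                  # each child range is itself sorted

def find_child(word, lo, hi, words):
    """Binary search for `word` in words[lo:hi]; return its index or None."""
    i = bisect.bisect_left(words, word, lo, hi)
    return i if i < hi and words[i] == word else None

i = find_child(7, 0, len(uni_words), uni_words)  # locate unigram 7
lo, hi = uni_links[i]                            # follow its link index
print(find_child(4, lo, hi, bi_words))           # 2: bigram (7, 4) found
```

Binary search within each child range keeps the structure compact: no per-node pointers, only one (begin, end) pair per node.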

14
Our Approaches
  • WFSA-based LM
  • Trie structure
  • Note
  • Tf: triggers on an input word (forward query)
  • Tb: triggers spontaneously, without any input
  • reaches the leaves
  • carries out back-off queries

Q  the nodes of the trie
I  the root of the trie
F  each node of the trie except the root
Σ  the alphabet of input sentences
δ  forward transitions Tf and roll-back transitions Tb
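A minimal sketch of the two transition types, assuming each state stores its forward arcs, a roll-back link, and a back-off weight; the states and log probabilities are toy values, not the paper's layout:

```python
# Tf (forward) vs Tb (roll-back) transitions in a trie-shaped LM (toy data).
class State:
    def __init__(self, backoff=0.0):
        self.arcs = {}            # word -> (next State, logprob)  : Tf
        self.back_state = None    # roll-back target               : Tb
        self.backoff = backoff    # weight added when Tb fires

root = State()
the = State(backoff=-0.3); the.back_state = root
root.arcs["the"] = (the, -1.0)
root.arcs["cat"] = (State(), -2.0)
the.arcs["cat"] = (root.arcs["cat"][0], -0.5)

def step(state, word):
    """Consume one word: fire Tb until a Tf arc for `word` exists, then take it."""
    penalty = 0.0
    while word not in state.arcs and state.back_state is not None:
        penalty += state.backoff          # Tb: spontaneous, consumes no input
        state = state.back_state
    nxt, logprob = state.arcs[word]       # Tf: labeled with the input word
    return nxt, penalty + logprob

s, lp = step(the, "cat")    # Tf directly: bigram "the cat"
s2, lp2 = step(the, "the")  # Tb to root, then Tf: back-off weight + unigram
print(lp, lp2)  # -0.5 -1.3
```

Because the automaton keeps its state between queries, the back-off weights are charged as the roll-back transitions fire, instead of being recomputed from the root each time.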
15
Our Approaches
  • WFSA-based LM

16
Our Approaches
  • WFSA-based LM

Node fields: Probability | Back-off | Index
Node fields with roll-back: Probability | Back-off | Index | Roll-back index
17
Our Approaches
  • WFSA-based LM

Node fields: Probability | Back-off | Index
Node fields with roll-back: Probability | Back-off | Index | Roll-back index
Cross Layer
18
Our Approaches
  • Query Method

27
Our Approaches
  • State Transitions

28
Our Approaches
  • Query LM


29
Our Approaches
  • For HPB SMT
  • For a source sentence
  • A huge number of LM queries
  • Tens of millions
  • Most of these are repetitive
  • Hash cache

30
Our Approaches
  • For HPB SMT
  • Hash cache
  • Small and fast
  • Hash size: 24 bits (16M entries)
  • Simple operations
  • Additive operations
  • Bitwise operations
  • Hash cleared for each sentence
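A sketch of such a per-sentence cache; the concrete hash function below is an assumption, since the slide only specifies additive and bitwise operations with 24-bit keys:

```python
# Per-sentence hash cache for LM queries (hash mixing is assumed; the slide
# gives only "additive + bitwise operations, 24-bit keys").
CACHE_BITS = 24
CACHE_SIZE = 1 << CACHE_BITS          # 16M slots

def ngram_hash(word_ids):
    h = 0
    for w in word_ids:
        h = (h + w) & 0xFFFFFFFF      # additive step
        h ^= (h << 13) & 0xFFFFFFFF   # bitwise mixing (assumed formula)
        h ^= h >> 7
    return h & (CACHE_SIZE - 1)       # fold to 24 bits

cache = {}   # slot -> (key, logprob); a real cache would be a flat array

def cached_query(word_ids, slow_query):
    """Serve repetitive queries from the cache; fall back to the full LM query."""
    slot = ngram_hash(word_ids)
    hit = cache.get(slot)
    if hit is not None and hit[0] == tuple(word_ids):
        return hit[1]
    lp = slow_query(word_ids)
    cache[slot] = (tuple(word_ids), lp)
    return lp

def clear_cache():
    """Called before each new source sentence."""
    cache.clear()

print(cached_query([3, 7], lambda ws: -1.5))  # miss: computes -1.5
print(cached_query([3, 7], lambda ws: 0.0))   # hit: still -1.5, LM not queried
```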

31
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

32
Results
  • Setup
  • LM toolkit: SRILM
  • Decoder: hierarchical phrase-based translation system
  • Test data: IWSLT-07 (489), NIST-06 (1664)
  • Training data

Tasks     Model  Parallel sentences  Chinese words  English words
IWSLT-07  TM(1)  0.38M               3.0M           3.1M
IWSLT-07  LM(2)  1.3M                -              15.2M
NIST-06   TM(3)  3.4M                64M            70M
NIST-06   LM(4)  14.3M               -              377M


(1) The parallel corpora of BTEC (Basic Traveling Expression Corpus) and CJK (China-Japan-Korea corpus)
(2) The English side of BTEC, CJK, and CWMT2008
(3) LDC2002E18, LDC2002T01, LDC2003E07, LDC2003E14, LDC2003T17, LDC2004T07, LDC2004T08, LDC2005T06, LDC2005T10, LDC2005T34, LDC2006T04, LDC2007T09
(4) LDC2007T07
33
Results
  • Storage Space
  • Storage sizes increase by about 35%
  • Grow linearly with the number of trie nodes
  • Acceptable

The comparison of LM size between SRILM and WFSA
Tasks     n-grams  SRILM (MB)  WFSA (MB)  Δ (%)
IWSLT-07  4        65.7        89.1       35.6
IWSLT-07  5        89.8        119.5      33.1
NIST-06   4        860.3       1190.4     38.4
NIST-06   5        998.5       1339.7     34.2
34
Results
  • Query Speed
  • WFSA speeds up queries by about
  • 60% for 4-grams
  • 70% for 5-grams
  • WFSA + cache
  • Speeds up queries by about 75%

n-grams  Method        IWSLT-07 (s)  NIST-06 (s)
4        SRILM         163           15433
4        WFSA          70            6251
4        WFSA + cache  42            3907
5        SRILM         261           25172
5        WFSA          85            7944
5        WFSA + cache  59            6128
35
Results
  • Analysis
  • Repetitive queries and back-off queries in SMT
  • 4-gram
  • Back-off queries are widespread
  • Most of these queries are repetitive
  • The WFSA-based LM can therefore speed up queries effectively

Tasks     Back-off (%)  Repetitive (%)
IWSLT-07  60.5          95.5
NIST-06   60.3          96.4
36
Outline
  • Task
  • Problems
  • Solution
  • Our Approach
  • Results
  • Conclusion

37
Conclusion
  • A faster WFSA-based LM
  • Faster forward query
  • Faster back-off query
  • A compact WFSA-based LM
  • Trie structure
  • A simple caching technique
  • For SMT systems
  • Other fields
  • Speech recognition
  • Information retrieval

38
Thanks!