Title: Searching Indexed Bilingual Knowledge Banks
1Searching Indexed Bilingual Knowledge Banks
2Outline
- Introduction
- Proposed Method
- Summary
3Introduction
- Types of machine translation
- - Rule-based machine translation
- - Knowledge-based machine translation
- - Statistical machine translation
- - Example-based machine translation (EBMT)
4EBMT Architecture
- Find the best match in the bilingual corpus
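The matching step can be illustrated with a minimal sketch. It assumes a toy corpus made of the two sentence pairs used later in these slides, and it scores similarity with Python's standard `difflib.SequenceMatcher` over word sequences. The slides do not say which similarity measure the system uses, so `best_match` and its scoring are illustrative assumptions only.

```python
from difflib import SequenceMatcher

# Toy bilingual corpus: the two example pairs that appear later in the slides.
EXAMPLES = [
    ("I see the big bird", "saya lihat burung besar itu"),
    ("the old man walked", "orang tua ini berjalan"),
]


def best_match(query, examples):
    """Return the (source, target) pair whose source is most similar to query.

    Similarity is the SequenceMatcher ratio over lowercased word lists.
    This measure is an assumption, not the one used by any specific EBMT system.
    """
    query_words = query.lower().split()

    def score(pair):
        source_words = pair[0].lower().split()
        return SequenceMatcher(None, query_words, source_words).ratio()

    return max(examples, key=score)


source, target = best_match("I see the old bird", EXAMPLES)
print(source, "->", target)
```

A linear scan like this compares the query with every stored example, which is the cost the indexing proposed later in the deck is meant to avoid.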
5EBMT Architecture (cont.)
6EBMT Architecture (cont.)
7EBMT Architecture (cont.)
Recombination
8Examples of EBMT
- Gaijin System: uses a bilingual lexicon and transfer rules
- MSR-MT: uses MindNet, logical form
9Weaknesses
10Weaknesses
- a finite means for generating the potential
infinity of linguistic forms a speaker-hearer can
produce or recognize (Chomsky, 1928)
11Weaknesses
[Figure: a BKB holding 30k S-SSTCs]
- The BKB is large, so searching it takes time. The graph complexity is
exponential, O(e^n)
12Weaknesses
- Poor classification of S-SSTCs in the Bilingual
Knowledge Bank (BKB)
13Goal and Objective
- Classify the STREE and SNODE correspondences
in the BKB for effective retrieval and translation
14Proposed System Design
15Index Construction
16Indexed BKB
[Figure: an indexed BKB made up of indexed S-SSTCs]
17Pivot BKB
- Pivot BKB
- - Cluster the S-SSTCs
- - Categorize the SNODE and STREE correspondences by POS pattern
- - Use a modified form of inverted-file indexing
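One way to picture a modified inverted file is as a two-level map: POS pattern first, then phrase, then a postings list of BKB locations. The sketch below assumes that layout; the class name `PivotIndex`, its methods, and the posting shape `(bkb_name, entry_id)` are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


class PivotIndex:
    """Two-level inverted file: POS pattern -> phrase -> list of postings.

    Hypothetical structure for illustration. A posting is a
    (bkb_name, entry_id) tuple pointing back into an indexed BKB file.
    """

    def __init__(self):
        self._index = defaultdict(lambda: defaultdict(list))

    def add(self, pos_pattern, phrase, bkb_name, entry_id):
        """Record that phrase, tagged with pos_pattern, occurs at entry_id."""
        self._index[pos_pattern][phrase].append((bkb_name, entry_id))

    def lookup(self, pos_pattern, phrase=None):
        """Return postings for one phrase, or every phrase under a pattern."""
        bucket = self._index.get(pos_pattern, {})
        if phrase is None:
            return {p: list(postings) for p, postings in bucket.items()}
        return list(bucket.get(phrase, []))


index = PivotIndex()
index.add("Det Adj N", "the big bird", "a.xml", "1.6.3")
index.add("Det Adj N", "the old man", "a.xml", "2.6.2")
print(index.lookup("Det Adj N", "the old man"))
```

Keying on the POS pattern first means a query only touches the cluster of entries that share its syntactic shape, rather than the whole BKB.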
18Pivot BKB (cont.)
I/P see/V the/Det big/Adj bird/N ↔ saya/P lihat/V burung/N besar/Adj itu/Det
SNODE correspondence
- 1.5.1 I/P ↔ saya/P
- 1.5.2 see/V ↔ lihat/V
- 1.5.3 the/Det ↔ itu/Det
- 1.5.4 big/Adj ↔ besar/Adj
- 1.5.5 bird/N ↔ burung/N
STREE correspondence
- 1.6.1 P V Det Adj N ↔ P V N Adj Det (I see the big bird ↔ saya lihat burung besar itu)
- 1.6.2 V Det Adj N ↔ V N Adj Det (see the big bird ↔ lihat burung besar itu)
- 1.6.3 Det Adj N ↔ N Adj Det (the big bird ↔ burung besar itu)
Name of indexed BKB: a.xml
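The POS patterns in the STREE correspondences can be derived mechanically from a tagged sentence. The sketch below assumes the sentence is written as `word/POS` tokens (a notation chosen here for convenience) and that the span of each STREE entry is already known. The helper names `parse_tagged` and `pos_pattern` are hypothetical.

```python
def parse_tagged(sentence):
    """Split 'word/POS' tokens into (word, pos) pairs."""
    pairs = []
    for token in sentence.split():
        # rpartition splits on the last '/', so the tag is always the final part.
        word, _, pos = token.rpartition("/")
        pairs.append((word, pos))
    return pairs


def pos_pattern(pairs, start=0, end=None):
    """Join the POS tags of pairs[start:end] into a pattern string."""
    return " ".join(pos for _, pos in pairs[start:end])


english = parse_tagged("I/P see/V the/Det big/Adj bird/N")
malay = parse_tagged("saya/P lihat/V burung/N besar/Adj itu/Det")

# Word spans of the three STREE entries for the English side of the example.
spans = {"1.6.1": (0, 5), "1.6.2": (1, 5), "1.6.3": (2, 5)}
for entry_id, (start, end) in spans.items():
    words = " ".join(word for word, _ in english[start:end])
    print(entry_id, pos_pattern(english, start, end), "|", words)
```

The same helper applied to the Malay side gives the target patterns, so each STREE correspondence can be stored as a pair of pattern strings.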
19Pivot BKB (cont.)
Index of SNODE
<P>
- I, <a.xml, 1.5.1, 2>
<V>
- see, <a.xml, 1.5.2, 2>
<Det>
- the, <a.xml, 1.5.3, 1>
<Adj>
- big, <a.xml, 1.5.4, 4>
<N>
- bird, <a.xml, 1.5.5, 3>
Index of STREE
<P V Det Adj N>
- I see the big bird, <a.xml, 1.6.1>
<V Det Adj N>
- see the big bird, <a.xml, 1.6.2>
<Det Adj N>
- the big bird, <a.xml, 1.6.3>
Callouts on the slide label the parts of each entry: name of indexed BKB, index of SNODE, sense of SNODE, index of STREE
20Pivot BKB (cont.)
the/Det old/Adj man/N walked/V ↔ orang/N tua/Adj ini/Det berjalan/V
SNODE correspondence
- 2.5.1 the/Det ↔ ini/Det
- 2.5.2 old/Adj ↔ tua/Adj
- 2.5.3 man/N ↔ orang/N
- 2.5.4 walked/V ↔ berjalan/V
STREE correspondence
- 2.6.1 Det Adj N V ↔ N Adj Det V (the old man walked ↔ orang tua ini berjalan)
- 2.6.2 Det Adj N ↔ N Adj Det (the old man ↔ orang tua ini)
Name of indexed BKB: a.xml
21Pivot BKB (cont.)
<P>
- I, <a.xml, 1.5.1, 2>
<V>
- see, <a.xml, 1.5.2, 2>
- walked, <a.xml, 2.5.4, 1>
<Det>
- the, <a.xml, 1.5.3, 1>
- the, <a.xml, 2.5.1, 2>
<Adj>
- big, <a.xml, 1.5.4, 4>
- old, <a.xml, 2.5.2, 1>
<N>
- bird, <a.xml, 1.5.5, 3>
- man, <a.xml, 2.5.3, 2>
<P V Det Adj N>
- I see the big bird, <a.xml, 1.6.1>
<V Det Adj N>
- see the big bird, <a.xml, 1.6.2>
<Det Adj N>
- the big bird, <a.xml, 1.6.3>
- the old man, <a.xml, 2.6.2>
<Det Adj N V>
- the old man walked, <a.xml, 2.6.1>
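A retrieval step over the merged STREE index might look like the following sketch. The index contents are the entries listed on this slide. The `retrieve` function and its two-stage policy (exact phrase first, then any entry sharing the POS pattern) are assumptions made for illustration, since the slides do not spell out the search procedure.

```python
# Merged STREE index from the slide: POS pattern -> phrase -> (BKB name, STREE index).
STREE_INDEX = {
    "P V Det Adj N": {"I see the big bird": ("a.xml", "1.6.1")},
    "V Det Adj N": {"see the big bird": ("a.xml", "1.6.2")},
    "Det Adj N": {
        "the big bird": ("a.xml", "1.6.3"),
        "the old man": ("a.xml", "2.6.2"),
    },
    "Det Adj N V": {"the old man walked": ("a.xml", "2.6.1")},
}


def retrieve(tagged):
    """Look up a tagged phrase, given as a list of (word, pos) pairs.

    Returns ("exact", [posting]) when the phrase itself is indexed,
    ("pattern", postings) when only its POS pattern is indexed,
    and ("none", []) otherwise. This policy is a hypothetical sketch.
    """
    pattern = " ".join(pos for _, pos in tagged)
    phrase = " ".join(word for word, _ in tagged)
    bucket = STREE_INDEX.get(pattern)
    if bucket is None:
        return ("none", [])
    if phrase in bucket:
        return ("exact", [bucket[phrase]])
    return ("pattern", sorted(bucket.values()))


print(retrieve([("the", "Det"), ("old", "Adj"), ("man", "N")]))
print(retrieve([("the", "Det"), ("big", "Adj"), ("man", "N")]))
```

The second query has no stored phrase, but its pattern still narrows the search to the two `Det Adj N` entries, which could then serve as templates with the words filled in from the SNODE index.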
22Summary
- size(n) ∝ 1/speed(n): the larger the BKB, the slower the search
- Classification and indexing to perform effective
retrieval and translation