Title: Presentation Outline
2Presentation Outline
- Structured String-Tree Correspondence (SSTC)
- Synchronous Structured String-Tree Correspondence
(synchronous SSTC)
- EBMT based on synchronous SSTC
- The Construction of a BKB Based on the
Synchronous SSTC
- Bitext Word-level Mapping (Word Alignment)
- Bitext Synchronous Parsing Technique
3The Structured String-Tree Correspondence (SSTC)
SSTC = string + arbitrary tree structure +
correspondence
Correspondence: node (X/Y)
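As a minimal sketch of the three components named above, the following Python keeps a string, a tree, and a per-node correspondence together. It assumes that the X/Y annotation is a pair of word-offset intervals, X for the substring tied to the node itself and Y for the substring tied to the subtree rooted at that node. The class names, field names, and the hand-built tree are illustrative and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[int, int]  # word offsets (start, end), end exclusive


@dataclass
class Node:
    label: str
    x: Interval  # assumed: substring corresponding to the node itself
    y: Interval  # assumed: substring corresponding to the whole subtree
    children: List["Node"] = field(default_factory=list)


@dataclass
class SSTC:
    words: List[str]  # the string, split into words
    root: Node        # the tree, carrying the correspondence on each node

    def text(self, interval: Interval) -> str:
        """Return the substring covered by a word-offset interval."""
        start, end = interval
        return " ".join(self.words[start:end])


# Hand-built, dependency-style tree used only for illustration.
words = "The lamp is off".split()
the = Node("the", (0, 1), (0, 1))
lamp = Node("lamp", (1, 2), (0, 2), [the])
off = Node("off", (3, 4), (3, 4))
root = Node("is", (2, 3), (0, 4), [lamp, off])
sstc = SSTC(words, root)

print(sstc.text(lamp.x))  # substring for the node "lamp"
print(sstc.text(lamp.y))  # substring for the subtree rooted at "lamp"
```

Because the correspondence is stored as intervals rather than implied by tree shape, the tree does not have to mirror the word order of the string.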
8Example-Based Machine Translation (EBMT)
EBMT is a case-based reasoning approach to MT.
EBMT uses translated examples of similar
sentences to translate a given source sentence
into the target language.
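The retrieval of a similar translated example can be sketched as follows. The word-overlap (Jaccard) score is a stand-in similarity measure chosen for brevity, not the measure used in this presentation, and the second example pair is made up for illustration.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (0.0 to 1.0)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def closest_example(source, example_base):
    """Return the (source, target) pair whose source side is most similar."""
    return max(example_base, key=lambda pair: jaccard(source, pair[0]))


example_base = [
    ("The lamp is off", "Lampu itu padam"),
    ("The old man sleeps", "Lelaki tua itu tidur"),  # made-up pair
]

best = closest_example("The old man picks the green lamp up", example_base)
print(best)
```

A single closest example rarely covers the whole input, which is why fragments from several examples usually have to be combined.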
9The general Architecture for EBMT
10EBMT based on synchronous SSTC.
Different senses for the word "bank":
bank 1: land beside the river.
bank 2: a place to keep money.
E.g. The1 man2 keeps1 his1 money1 in1 the1
bank2.
Replacement Combination
11Source sentence: The old man picks the green lamp
up
12A set of synchronous SSTCs represents the example base.
English sentence: The lamp is off.
Malay translation: Lampu itu padam.
14Source: the old man picks the green lamp up
15Sub-synchronous SSTCs for the source sentence
16Selected closest example
Sub-synchronous SSTCs derived from the example
20lelaki tua itu kutip lampu hijau itu
Generation
The translation for the source sentence is
generated from the Malay part of the synchronous
SSTC, which is the string in the SSTC.
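A minimal sketch of this generation step, assuming each target-side node carries a word-offset interval (start, end): the output string is read off by ordering the nodes by interval. The flat list of (word, interval) pairs is a simplification of the tree, and the function name is hypothetical.

```python
def generate(target_nodes):
    """Linearize target-side nodes by the start offset of their interval."""
    ordered = sorted(target_nodes, key=lambda node: node[1][0])
    return " ".join(word for word, _ in ordered)


# Nodes listed in an arbitrary order, as a tree traversal might yield them.
target_nodes = [
    ("kutip", (3, 4)),
    ("lelaki", (0, 1)),
    ("lampu", (4, 5)),
    ("tua", (1, 2)),
    ("itu", (2, 3)),
    ("hijau", (5, 6)),
    ("itu", (6, 7)),
]

sentence = generate(target_nodes)
print(sentence)
```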
21EBMT General Problems
- How to utilize more than one example to translate
one source sentence:
the construction of well-formed target-language
sentences from fragments extracted from a BKB.
- Lack of flexibility in representing translation
relations between source and target substrings:
the treatment of wild linguistic phenomena, which
are non-standard, e.g. crossed dependencies.
25- The Construction of a BKB Based on the
Synchronous SSTC
Based on Bitext Synchronous Parsing Technique
- Bitext: text that is available in two languages.
26Parsing and POS tagging for the English source text
Compile the APP output into an SSTC for the English
source text
Build the SSTC for the Malay target text based on the
SSTC for the English source text using the word
alignment
28Bitext Word-level Mapping (Word Alignment)
Real texts are noisy:
- Fertility: a single word in the source sentence
may correspond to zero, one, two or more words in
the target sentence, and vice versa.
- Crossed dependencies (distortion): human
translators change and rearrange material, so the
target text does not follow the order of the
source text.
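Both phenomena can be read off a set of alignment links. The sketch below assumes an alignment is a list of (source index, target index) pairs; the links given for "The lamp is off" and "Lampu itu padam" are a hand-made illustration, not output from the system described here.

```python
from collections import Counter
from itertools import combinations


def fertility(links, n_source):
    """Number of target words linked to each source position."""
    counts = Counter(s for s, _ in links)
    return [counts.get(i, 0) for i in range(n_source)]


def crossings(links):
    """Pairs of links whose source order and target order disagree."""
    return [
        (a, b)
        for a, b in combinations(sorted(links), 2)
        if a[0] < b[0] and a[1] > b[1]
    ]


# source: The(0) lamp(1) is(2) off(3); target: Lampu(0) itu(1) padam(2)
links = [(0, 1), (1, 0), (3, 2)]  # "is" has no target word

print(fertility(links, 4))
print(crossings(links))
```

Here "is" has fertility zero, and the links for "The" and "lamp" cross because Malay places "itu" after the noun.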
31n Context Window Word Alignment
The correspondence between the source and the
target is denoted by an interval attached to each
subtext according to its offset in the text.
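The interval notation can be sketched as below, assuming offsets are counted in words and each interval is (start, end) with the end exclusive. The helper names are hypothetical.

```python
def word_intervals(sentence):
    """Attach a (start, end) word-offset interval to every word."""
    return [(word, (i, i + 1)) for i, word in enumerate(sentence.split())]


def covering_interval(intervals):
    """Smallest interval that covers all the given intervals."""
    return (min(s for s, _ in intervals), max(e for _, e in intervals))


source = word_intervals("The lamp is off")
target = word_intervals("Lampu itu padam")

print(source)
print(target)
# Interval of the subtext "The lamp" in the source
print(covering_interval([source[0][1], source[1][1]]))
```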
32n Context Window Word Alignment
Find the TPCs between the source and the target
using a bilingual dictionary.
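The slides do not spell out the lookup procedure, so the following is only a sketch under stated assumptions: a candidate correspondence is a target word that the bilingual dictionary lists for the source word and that lies within n positions of it. The toy dictionary, the function name, and the window test are all illustrative.

```python
def candidates(source, target, dictionary, n):
    """Map each source position to the target positions that hold a
    dictionary translation within a window of n positions."""
    result = {}
    for i, s_word in enumerate(source):
        translations = dictionary.get(s_word.lower(), set())
        result[i] = [
            j
            for j, t_word in enumerate(target)
            if t_word.lower() in translations and abs(i - j) <= n
        ]
    return result


# Toy dictionary for illustration only.
dictionary = {"the": {"itu"}, "lamp": {"lampu"}, "off": {"padam"}}
source = "The lamp is off".split()
target = "Lampu itu padam".split()

found = candidates(source, target, dictionary, n=1)
print(found)
```

A source word with several candidates would still need disambiguation, for instance by preferring candidates that agree with the links of neighbouring words.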
33n Context Window Word Alignment
Find out the chains for all possible TPCs for a
source word.
34n Context Window Word Alignment
35- Bitext Synchronous Parsing Technique
37Apple Pie Parser (APP)
- It is a bottom-up probabilistic chart parser that
finds the parse tree for an input (English) text.
- It was developed at New York University.
- The parser generates a syntactic tree in
Penn Treebank bracketing.
- It is free and available to download with the
source code.
- http://cs.nyu.edu/cs/projects/proteus/sekine
38Apple Pie Parser (APP)
Input: The basic idea of example-based parsing is very
simple
The representation structure and the POS for the
English source are obtained.
40Compile the APP output to SSTC structure
(S (NP (NPL The basic idea) (PP of (NPL example-based parsing))) (VP is (ADJP very simple)))
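One way to start compiling such a bracketing into an interval-annotated tree is sketched below. It reads the bracketed string into nested nodes and attaches to each node the word-offset interval of the substring it covers. This is a simplified illustration that covers only the subtree intervals of a phrase-structure tree, not the actual compiler used in this work, and all function names are hypothetical.

```python
import re


def parse_brackets(text):
    """Parse a bracketed tree into (label, children) tuples; leaves are str."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read()


def annotate(tree, start=0):
    """Return (label, (start, end), children) with word-offset intervals."""
    label, children = tree
    out, offset = [], start
    for child in children:
        if isinstance(child, str):  # a word occupies one position
            out.append((child, (offset, offset + 1), []))
            offset += 1
        else:
            node = annotate(child, offset)
            out.append(node)
            offset = node[1][1]
    return (label, (start, offset), out)


bracketing = ("(S (NP (NPL The basic idea) (PP of (NPL example-based parsing)))"
              " (VP is (ADJP very simple)))")
tree = annotate(parse_brackets(bracketing))
for child in tree[2]:
    print(child[0], child[1])
```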
41Lexical Transfer
43The synchronous SSTC editor.
44Discussion
Thank you.