Title: Example Based Machine Translation
1 Example Based Machine Translation
Indian Institute of Science
2 Example Based Machine Translation
- A bilingual dictionary was used for this purpose.
- The bilingual dictionary is a set of 3 dictionaries, comprising:
- Sentence Dictionary
- Phrases Dictionary
- Words Dictionary
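One way the three dictionaries could work together is sketched below. This is only an illustration: the slides do not state the lookup order, so the fallback from whole sentence to longest phrase to single word is an assumption, and the names `SENTENCE_DICT`, `PHRASE_DICT`, `WORD_DICT` and `translate`, as well as the placeholder entries, are hypothetical.

```python
from typing import Dict, List

# Hypothetical toy entries with placeholder targets; the real dictionaries
# hold thousands of entries per language.
SENTENCE_DICT: Dict[str, str] = {"good morning": "<S:good morning>"}
PHRASE_DICT: Dict[str, str] = {"thank you": "<P:thank you>"}
WORD_DICT: Dict[str, str] = {"water": "<W:water>"}


def translate(text: str) -> str:
    """Assumed lookup order: whole sentence, then longest phrase, then word."""
    key = text.strip().lower()
    if key in SENTENCE_DICT:
        return SENTENCE_DICT[key]

    tokens = key.split()
    output: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest multi-word phrase starting at position i.
        for j in range(len(tokens), i + 1, -1):
            chunk = " ".join(tokens[i:j])
            if chunk in PHRASE_DICT:
                output.append(PHRASE_DICT[chunk])
                i = j
                break
        else:
            # No phrase matched: fall back to the word dictionary and
            # pass unknown words through unchanged.
            output.append(WORD_DICT.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(output)
```

For example, `translate("thank you water")` finds no sentence entry, matches the phrase "thank you", and then looks up "water" as a single word.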
3 Example Based Machine Translation
EBMT Flow Diagram
1. Request: Client Browser to Java Servlet (Apache Tomcat)
2. Socket Connections: Java Servlet to the Machine Translation programs running as servers (one server per translation system)
3. Result of Processing: Machine Translation server back to the Java Servlet
4. Response: Java Servlet to the Client Browser
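The request flow can be illustrated with a small socket sketch. It is not the system's actual code (the slides name a Java Servlet on Apache Tomcat as the front end); it only shows the pattern of one translation program listening as a server and a front end connecting to it over a socket. The line-based protocol, the class names and `request_translation` are assumptions made for the example.

```python
import socket
import socketserver
import threading
from typing import Callable, Tuple


class TranslationHandler(socketserver.StreamRequestHandler):
    """Reads one line of source text and writes back one line of output."""

    def handle(self) -> None:
        line = self.rfile.readline().decode("utf-8").strip()
        reply = self.server.translate(line)  # set on the server object below
        self.wfile.write((reply + "\n").encode("utf-8"))


class TranslationServer(socketserver.ThreadingTCPServer):
    """One instance per translation system (assumed line-based protocol)."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address: Tuple[str, int],
                 translate: Callable[[str], str]) -> None:
        super().__init__(address, TranslationHandler)
        self.translate = translate


def start_server(translate: Callable[[str], str]) -> TranslationServer:
    # Port 0 lets the operating system pick a free port.
    server = TranslationServer(("127.0.0.1", 0), translate)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_translation(host: str, port: int, text: str) -> str:
    """Plays the role of the front end: send text, read the result."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((text + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()
```

Starting one `TranslationServer` per language, each on its own port, mirrors the "one server for one translation system" arrangement in the diagram.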
4 Example Based Machine Translation
Hindi Corpora
- Sentence Dictionary - 1900 Sentences
- Phrases Dictionary - 35000 Phrases
- Word Dictionary - 34000 Words
5 Example Based Machine Translation
Tamil Corpora
- Sentence Dictionary - 40000 Sentences
- Phrases Dictionary - 28490 Phrases
- Word Dictionary - 52400 Words
6 Example Based Machine Translation
Kannada Corpora
- Sentence Dictionary - 40000 Sentences
- Phrases Dictionary - 12480 Phrases
- Word Dictionary - 25000 Words
7 Example Based Machine Translation
- More than 400 rules added in Hindi, Tamil and Kannada
- Performance evaluated for good-enough translation
8 Example Based Machine Translation
Machine Evaluation
Aim: To estimate the improvement in performance of the EBMT system due to the addition of rules and the increase in the size of the corpora
The Methodology: BLEU (BiLingual Evaluation Understudy), from IBM Research Lab
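BLEU combines clipped n-gram precisions with a brevity penalty. The sketch below is a simplified, unsmoothed, sentence-level version written for illustration; it is not the evaluation script used for the results on the following slides, and the function names `ngrams` and `bleu` are chosen for this example only.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Counts every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: List[str], references: List[List[str]],
         max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    if not candidate:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())

        # Highest count of each n-gram seen in any single reference.
        max_ref: Counter = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)

        # Clip candidate counts so repeated words cannot inflate precision.
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any zero precision gives a zero score
        log_precision_sum += math.log(clipped / total) / max_n

    # Brevity penalty uses the reference length closest to the candidate's.
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum)
```

Scores range from 0 to 1; comparing the score of the system's output with and without rules against the same reference translations gives the kind of before/after comparison the aim describes. Published BLEU figures are normally computed over a whole test corpus rather than sentence by sentence.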
9 BLEU Score Results - Tamil
Where:
1: Performance of EBMT system without rules
2: Performance of EBMT system with 500 rules
10 BLEU Score Results - Tamil (Contd)
Where:
1: Performance of EBMT system without rules
2: Performance of EBMT system with 500 rules
11 BLEU Score Results - Kannada
Where:
1: Performance of EBMT system without rules
2: Performance of EBMT system with 452 rules
12 BLEU Score Results - Kannada (Contd)
Where:
1: Performance of EBMT system without rules
2: Performance of EBMT system with 452 rules
13 Manual Evaluation
Aim: To demonstrate human-assisted, good-enough translation
The Methodology:
- A test bed of sentences, along with their machine translations, is given to the user. For each sentence, the user gives a rating on the following scale:
- Useless
- Poor
- Mediocre
- Acceptable
- Human Quality
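Ratings collected on this five-point scale could be summarised as in the sketch below. The mapping of labels to the numbers 1 to 5 and the use of a mean score are assumptions for illustration; the slides do not say how the ratings were aggregated, and `SCALE` and `summarize` are hypothetical names.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Optional, Tuple

# Labels from the slide, ordered worst to best; numeric values 1..5 are assumed.
SCALE = ["Useless", "Poor", "Mediocre", "Acceptable", "Human Quality"]


def summarize(ratings: List[str]) -> Tuple[Dict[str, int], Optional[float]]:
    """Returns the count per label and the mean score on the assumed 1..5 scale."""
    unknown = [r for r in ratings if r not in SCALE]
    if unknown:
        raise ValueError(f"ratings outside the scale: {unknown}")

    counts = Counter(ratings)
    distribution = {label: counts[label] for label in SCALE}
    scores = [SCALE.index(r) + 1 for r in ratings]
    return distribution, (mean(scores) if scores else None)
```

Calling `summarize` on the ratings from one evaluator gives a per-label distribution that can be charted, plus a single average for comparing systems.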
14 Example Based Machine Translation
Manual Evaluation (Contd)
The Need: Although machines are used to a large extent to evaluate machine translation, they give credit only on the basis of rules, and their evaluation is not up to the mark because:
- Machines can't understand context
- Machines can't tolerate typographical errors
15 Manual Evaluation - Screen Shot I
16 Manual Evaluation - Screen Shot II
17 Manual Evaluation Results
18 EBMT - Result 1
19 EBMT - Result 2
20 EBMT - Result 3
21 EBMT - Our Proposal
- It is planned to expand the words, phrases and sentences database from its present level by adding 25,000 more words, phrases and sentences in Tamil and Kannada.
- To write programmes to mine the web and to deduce rules from the parallel corpora.
- To design an interface for users to test the machine translation system and augment it through suggestions, using the Wikipedia approach.