Title: METIS
1. METIS: STATISTICAL MACHINE TRANSLATION USING MONOLINGUAL CORPORA
2. METIS AUTHORS
- IOANNIS DOLOGLOU
- STELLA MARKANTONATOU
- GEORGE TAMBOURATZIS
- OLGA YANNOUTSOU
- ATHANASSIA FOURLA
- NIKOS IOANNOU
3. Structure of the presentation
- The idea
- Translation Equivalence Information
- Description of the system
- Assessment
4. GOALS
- METIS aimed to assess the possibility of obtaining free-text translations of reasonably high linguistic quality from large annotated monolingual corpora
- using pattern-matching techniques
- by incorporating translation equivalence information at lemma and structure level (the latter by employing tag-mapping rules)
5. TRANSLATION EQUIVALENCE INFORMATION
- Tag-equivalence tables
- Tagsets used
- English: CLAWS5 (on the BNC)
- Greek: ILSP-PAROLE (on the HNC)
- Dutch: CGN tags (on the Corpus of Spoken Dutch (CGN) and the Eindhoven corpus)
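As an illustration of the tag-equivalence tables above, a minimal sketch is a mapping from source-language tags to candidate target-language tags. The ILSP-PAROLE and CLAWS5 tag names below are illustrative assumptions drawn from the tagsets, not the actual METIS tables:

```python
# Sketch of a tag-equivalence table: Greek ILSP-PAROLE tags mapped to
# candidate English CLAWS5 tags. Entries are illustrative assumptions;
# the actual METIS tables cover the complete tagsets.
TAG_EQUIVALENCES = {
    "At": ["AT0"],                  # article -> CLAWS5 article
    "NoCm": ["NN1", "NN2"],         # common noun -> singular/plural noun
    "VbMn": ["VVB", "VVZ", "VVG"],  # main verb -> base / -s / -ing forms
}

def equivalent_tags(source_tag):
    """Return the candidate target-language tags for a source-language tag."""
    return TAG_EQUIVALENCES.get(source_tag, [])
```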
8. TRANSLATION EQUIVALENCE INFORMATION
- Tag-mapping rules
- These are rules that map the input structure onto a more abstract one, closer to the structure of the translation sought
10. TRANSLATION EQUIVALENCE INFORMATION
- Appropriate bilingual lexica
- These are lexica that provide multiple translations of lemmata and expressions, together with PoS information for both languages
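A minimal sketch of such a lexicon, assuming a simple (lemma, PoS) key and a list of (translation, PoS) candidates (the entries and layout are assumptions for illustration; the real METIS lexica also cover multi-word expressions):

```python
# Sketch of a bilingual lexicon providing multiple translations per lemma
# plus PoS information for both languages. Entries and the (lemma, PoS)
# layout are illustrative assumptions.
LEXICON = {
    ("γυναίκα", "No"): [("woman", "NN1")],
    ("καθαρίζω", "Vb"): [("peel", "VVB"), ("clean", "VVB")],
    ("μήλο", "No"): [("apple", "NN1")],
}

def lookup(lemma, pos):
    """Return all candidate (translation, PoS) pairs for a source lemma."""
    return LEXICON.get((lemma, pos), [])
```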
12. The System
- System resource requirements
- Bilingual lexicon file with PoS information
- Tag-mapping Rules file
- Tagged and Lemmatized source language sentence
- Tagged and Lemmatized target language corpus
- Weights file
13. System Operation (1)
- Step 1. Lemma-to-lemma translation: the bilingual lexicon turns the source sentence SS into TS1.
- Step 2. Rule application: the tag-mapping rules turn TS1 into TS2, keeping the correspondence between the two.
14. System Operation (2)
- Step 3. Corpus search: TS2 is matched against the target-language corpus using the weights file, yielding the corpus translation TS.
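The three steps can be sketched end to end as follows. The toy lexicon, rules, corpus and overlap measure are illustrative assumptions standing in for the real METIS resources:

```python
# Hypothetical sketch of the three-step METIS pipeline. The function
# bodies and resources are toy stand-ins for the real lexicon, the
# tag-mapping rules and the weighted corpus search.

def lemma_to_lemma(source_lemmas, lexicon):
    """Step 1: replace each source lemma with its first lexicon translation (TS1)."""
    return [lexicon.get(lemma, [lemma])[0] for lemma in source_lemmas]

def apply_rules(ts1, rules):
    """Step 2: rewrite TS1 with tag-mapping rules, yielding TS2."""
    ts2 = []
    for token in ts1:
        ts2.extend(rules.get(token, [token]))
    return ts2

def corpus_search(ts2, corpus):
    """Step 3: return the corpus sentence sharing the most tokens with TS2."""
    return max(corpus, key=lambda sent: len(set(ts2) & set(sent)))

# Toy resources for the example sentence
lexicon = {"ο": ["the"], "γυναίκα": ["woman"], "καθαρίζω": ["peel"], "μήλο": ["apple"]}
rules = {"peel": ["be", "peeling"]}  # one present-tense alternative: be + -ing
corpus = [["the", "woman", "is", "peeling", "the", "apple"],
          ["the", "man", "eats", "the", "pear"]]

ts1 = lemma_to_lemma(["ο", "γυναίκα", "καθαρίζω", "ο", "μήλο"], lexicon)
ts2 = apply_rules(ts1, rules)
best = corpus_search(ts2, corpus)
```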
15. Example Sentence
Source Sentence SS
- Actual translation
- The woman peels the apple
- OR
- The woman is peeling the apple
16. Step 1: Lemma-to-lemma translation
Target-language sentence TS1 (lemma-to-lemma translated) and term correspondence, e.g. ο/At corresponds to the/AT0
17. Step 2: Rule application (1)
- Example rule: 1\\VbMnIdPr______IpAv__ → 1\\VVB-VVZ | \be\VBB-VBZ 1\\VVG
- (the Greek main-verb present tense maps either to the English simple present VVB-VVZ, or to be/VBB-VBZ followed by the -ing form VVG)
182. Rules application (2)
1\\VbMnIdPr______IpAv__
correspondence
1\\VVB-VVZ \be\VBB-VBZ1\\VVG
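One way to read the rule above is as a rewrite of each Greek present-tense verb tag into either of its two English realisations. The sketch below assumes a toy representation of sentences as (lemma, tag) pairs; only the tag strings come from the slide:

```python
# Toy sketch of applying the tag-mapping rule from the slide: a Greek
# main-verb present-tense tag is rewritten either as an English simple
# present tag or as "be" plus an -ing form. The (lemma, tag) pair
# representation is an assumption for illustration.

GREEK_PRESENT = "VbMnIdPr______IpAv__"

def apply_present_rule(tokens):
    """Return both English tag realisations for each Greek present-tense verb."""
    simple, continuous = [], []
    for lemma, tag in tokens:
        if tag == GREEK_PRESENT:
            simple.append((lemma, "VVB-VVZ"))
            continuous.append(("be", "VBB-VBZ"))
            continuous.append((lemma, "VVG"))
        else:
            simple.append((lemma, tag))
            continuous.append((lemma, tag))
    return simple, continuous
```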
19. Step 2: Rule application (4)
Target-language sentence after rule application: TS2
20. Step 3: Sentence Comparison
The system examines all the sentences in the
corpus and finds the one with the highest
similarity percentage.
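A hedged sketch of such a comparison, assuming a weighted-overlap similarity (the formula and the default weight of 1.0 are assumptions; METIS reads its actual weights from the weights file):

```python
# Illustrative similarity score between the transformed sentence TS2 and a
# corpus sentence. The weighted-overlap formula is an assumption; the real
# system takes its weights from the weights file.

def similarity_percentage(ts2, corpus_sentence, weights):
    """Percentage of (weighted) TS2 items found in the corpus sentence."""
    total = sum(weights.get(item, 1.0) for item in ts2)
    matched = sum(weights.get(item, 1.0) for item in ts2 if item in corpus_sentence)
    return 100.0 * matched / total if total else 0.0

def best_match(ts2, corpus, weights):
    """Return the corpus sentence with the highest similarity percentage."""
    return max(corpus, key=lambda sent: similarity_percentage(ts2, sent, weights))
```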
21. ASSESSMENT
22. Main idea: facilitate post-editing according to users' preferences
- There has been much discussion among professional translators as to which types of error are easier to correct; e.g. grammatical errors are easier to correct than semantic ones. Since this is to a large extent subjective, no fixed criteria are set.
23. RANK-BASED EVALUATION
- Solution: Experimental set-up
- Select a given corpus of sentences (in the source language) and, for each of them, provide a corpus of translations (in the target language).
- Use a (group of) human translator(s) to rank the target corpus in terms of the suitability of each sentence as the translation of a source sentence.
- Rank all target-corpus sentences according to their suitability as translations.
24. RANK-BASED EVALUATION
- Solution: Experimental set-up (cont.)
- Provide all target sentences as input to the METIS system and allow METIS to rank them as potential translations.
- Compare the rankings of the target-corpus sentences according to (i) METIS and (ii) the group of translators, generating a measure of the correspondence between the two rankings.
- Vary the values of system parameters to fine-tune the system response.
25. Requirements
- a typology of errors
- error penalties weighted according to the difficulty of correction
- a target corpus with sentences ranked according to their degree of similarity to the source sentence, i.e. number of errors and final score
26. EXAMPLE OF TARGET CORPUS
- Source sentence
- Η γυναίκα καθαρίζει το μήλο.
- (The woman is peeling the apple)
- Target corpus
- The woman is peeling the apple. (class A)
- The woman cleans the apple. (class A)
- The woman has been peeling the apple. (class A)
- The woman peeled the apple. (class B)
- The woman has been washing the apple. (class C)
27. RANK-BASED EVALUATION
28. The benchmarking corpus for assessment
- Hellenic National Corpus (HNC) as a source corpus
- British National Corpus (BNC) as a target corpus
- Construction of a toy corpus (S/T) for dealing
with specific phenomena
29. The benchmarking corpus for assessment (cont.)
- Phenomena studied: valency; impersonal, copular and ergative verb phrases; agreement (3rd singular/plural, as most common); word order; tense and aspect; subordinate clauses; sentence types; sentence construction; determiner phrases; modifiers in different positions; agreement between adjectives and nouns; degrees of adjectives and adverbs and sentential order of adverbs; definite/indefinite article; the structure I like (μου αρέσει); clitics; and possessives.
- Phenomena treated: tenses, the structure I like, clitics, possessives, definite/indefinite articles, passive voice.
30. WHERE TO GO
- Integrate generation at the morphological level
- Break the sentence barrier (perhaps with additional generation capacity)
- Integrate lexical semantic information (wordnets, semantic distances)
31. CALCULATING THE DISTANCE BETWEEN METIS AND HUMAN TRANSLATORS
- If the ranking of a given sentence is the same, no penalty is imposed and the score is 0.
- A sentence scores 0 when all sentences ranked by the translators above it remain above it in the METIS ranking, and all sentences ranked by the translators below it remain below it in the METIS ranking.
- If a sentence belongs to class X and is ranked higher than certain sentences of a higher class, it is penalised. The penalty score equals the number of higher-ranked sentences it has overtaken.
32. CALCULATING THE DISTANCE BETWEEN METIS AND HUMAN TRANSLATORS (contd.)
- A sentence belonging to class X is penalised if a sentence belonging to a lower class has overtaken it. The score equals the number of sentences that have overtaken it.
- If a sentence belonging to class X has both (i) been overtaken by sentences of a lower class and (ii) overtaken sentences of a higher class, the total penalty equals the sum of the penalties for cases (i) and (ii), as defined above.
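The penalty scheme can be sketched as follows. This is one literal reading of the two slides, under which each inversion contributes to both the overtaking sentence and the overtaken one; the class labels and data layout are assumptions for illustration:

```python
# Sketch of the METIS-vs-translators ranking penalty, read literally from
# the two penalty clauses: a sentence is penalised for each higher-class
# sentence it has overtaken, and for each lower-class sentence that has
# overtaken it. Class labels compare alphabetically: "A" is better than "B".

def ranking_penalty(metis_ranking, classes):
    """Total penalty of a METIS ranking against human class labels.

    metis_ranking: sentence ids ordered best-first by METIS.
    classes: maps each id to its human-assigned class ("A", "B", "C", ...).
    """
    total = 0
    for i, sent in enumerate(metis_ranking):
        # higher-class sentences ranked below this one (it has overtaken them)
        overtaken = sum(1 for other in metis_ranking[i + 1:]
                        if classes[other] < classes[sent])
        # lower-class sentences ranked above this one (they have overtaken it)
        overtaking = sum(1 for other in metis_ranking[:i]
                         if classes[other] > classes[sent])
        total += overtaken + overtaking
    return total
```

With a perfect ranking the penalty is 0; each class inversion adds 2 under this reading (once for each sentence involved).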
33. Present continuous
34. Results of the experiments with METIS
- Present continuous: (The woman is peeling the apple)
- The Greek present tense corresponds to the English simple present, present continuous and present perfect.
- The lowest penalty (8) was achieved with the use of tag-mapping rules for the present-tense correspondence.