AMTEXT: Extraction-based MT for Arabic - PowerPoint PPT Presentation

Slides: 10
Provided by: AlonL
Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes


1
AMTEXT: Extraction-based MT for Arabic
  • Alon Lavie, Jaime Carbonell
  • Language Technologies Institute
  • Carnegie Mellon University
  • Email: {alavie,jgc}@cs.cmu.edu
  • Project Members: Laura Kieras, Peter Jansen
  • Informant: Loubna El Abadi

2
Objective
  • Develop a framework for high-accuracy MT of
    extracted entities, objects and their
    relationships, which is:
  • Rapidly portable and adaptable to new source
    languages
  • Easily expandable to new types of entities and
    relationships

3
AMTEXT Approach
  • Develop an elicitation corpus specifically
    designed for targeted extraction patterns
  • Learn generalized transfer rules for targeted
    extraction patterns from elicitation corpus
  • Acquire a high-accuracy Named-Entity translation
    lexicon and a limited translation lexicon for
    targeted vocabulary
  • Runtime: use a partial parser and transfer rules to
    translate only the matched portions of the
    source-language (SL) text

4
Elicitation Example

5
Learning Transfer Rules
  • Different notion of rule generalization than in
    our full XFER approach
  • Generalize from examples to NEs that play
    specific roles in target extraction pattern
  • Verbs and function words may not be generalized
  • Example:

English: Peres will meet with Bush today
Source: peres yipagesh im bush hayom
Goal Rule:
S::S : NE-P yipagesh im NE-P TE -> NE-P will meet with NE-P TE
((X1::Y1) (X4::Y5) (X5::Y6))
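
The goal rule above can be read as a source pattern, a target pattern, and a list of 1-based slot alignments. The following is a minimal sketch of that reading, not the project's implementation: the `TransferRule` class, the `apply_rule` function, the pre-tagged input, and the three-entry toy lexicon are all assumptions made for illustration.

```python
from dataclasses import dataclass

SLOTS = {"NE-P", "TE"}  # slot labels that appear in the slide's rule


@dataclass(frozen=True)
class TransferRule:  # hypothetical container, not the XFER rule format itself
    source: tuple      # source-side pattern: slot labels or literal words
    target: tuple      # target-side pattern: slot labels or literal words
    alignments: tuple  # (x, y) pairs, 1-based, mirroring (X1::Y1) etc.


def apply_rule(rule, tagged, lexicon):
    """Translate `tagged`, a list of (word, label) pairs, if it matches the rule.

    Slots must match on label; all other pattern elements must match the
    literal word (verbs and function words are not generalized).
    Returns the target string, or None when the input does not match.
    """
    if len(tagged) != len(rule.source):
        return None
    for (word, label), pat in zip(tagged, rule.source):
        if pat in SLOTS:
            if label != pat:
                return None
        elif word != pat:
            return None
    out = list(rule.target)
    for x, y in rule.alignments:
        # Fill target slot y with the translation of source element x.
        out[y - 1] = lexicon[tagged[x - 1][0]]
    return " ".join(out)


rule = TransferRule(
    source=("NE-P", "yipagesh", "im", "NE-P", "TE"),
    target=("NE-P", "will", "meet", "with", "NE-P", "TE"),
    alignments=((1, 1), (4, 5), (5, 6)),
)
# Toy lexicon and tagging, assumed for this example only.
lexicon = {"peres": "Peres", "bush": "Bush", "hayom": "today"}
tagged = [("peres", "NE-P"), ("yipagesh", None), ("im", None),
          ("bush", "NE-P"), ("hayom", "TE")]

result = apply_rule(rule, tagged, lexicon)
print(result)  # Peres will meet with Bush today
```

Only the aligned slots are looked up in the lexicon; the words "will meet with" come from the rule's target side.
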
6
Partial Parsing
  • Input: Full text in the foreign language
  • Output: Translation of extracted/matched text
  • Goal: Extract by effectively matching transfer
    rules with the full text
  • Identify/parse NEs and words in restricted
    vocabulary
  • Identify transfer-rule (source-side) patterns
  • Handle expected high levels of ambiguity

Source: Peres, meluve b-sar ha-xuc shalom, yipagesh im bush hayom
Labels in diagram: NE-P, NE-P, NE-P, TE
Output: Peres will meet with Bush today
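
One way to picture matching a source-side pattern against full text is in-order matching that skips unmatched tokens. The sketch below assumes that strategy; the slides do not say how the partial parser actually works. The function `find_matches`, the tokenization (punctuation dropped), and the NE-P/TE tags on the tokens are an assumed reading of the example, not output of the AMTEXT system.

```python
SLOTS = {"NE-P", "TE"}  # slot labels from the slide's rule


def find_matches(pattern, tagged):
    """Yield every in-order assignment of pattern elements to token indices.

    Tokens between matched elements are skipped, so extra material in the
    sentence does not block a match. All alternatives are produced, which
    makes the ambiguity visible rather than resolving it.
    """
    def fits(pat, token):
        word, label = token
        return label == pat if pat in SLOTS else word == pat

    def extend(p, start):
        if p == len(pattern):
            yield []
            return
        for i in range(start, len(tagged)):
            if fits(pattern[p], tagged[i]):
                for rest in extend(p + 1, i + 1):
                    yield [i] + rest

    yield from extend(0, 0)


pattern = ("NE-P", "yipagesh", "im", "NE-P", "TE")
# Assumed tagging of the example sentence, for illustration only.
tagged = [("peres", "NE-P"), ("meluve", None), ("b-sar", None),
          ("ha-xuc", None), ("shalom", "NE-P"), ("yipagesh", None),
          ("im", None), ("bush", "NE-P"), ("hayom", "TE")]

all_matches = list(find_matches(pattern, tagged))
for indices in all_matches:
    print([tagged[i][0] for i in indices])
```

Under these assumptions the pattern matches twice, once with "peres" and once with "shalom" in the first slot. The first reading corresponds to the output shown above; choosing between competing matches is left out of the sketch.
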
7
Input/Output
  • Input: Full text in source language (Arabic)
  • Output: English translation of extracted entities and
    relationships (possibly also a structured
    representation)

Input: Arabic newswire passage (characters not preserved in this transcript)
Processed by: AMTEXT System
Output:
The Abu Hafz al-Masri Brigades - al-Qaida warned
car bombs killed 23 people injured 300 others

8
Scope of Pilot System
  • Arabic-to-English
  • Newswire text (available from TIDES)
  • Limited set of actions: (X meet Y) (X attend Y)
    (X hold Y) (X kill Y) (X announce Y)
  • Limited translation patterns:
  • <subj-NE> <verb> <obj> <LOC> <TE>
  • Limited vocabulary

9
Evaluation Plan
  • Compare AMTEXT approach to full-text
    Arabic-to-English SMT, on a limited task of
    translation of relations within the scope of
    coverage
  • Establish a test set for evaluation
  • Define an appropriate metric: Precision/Recall/F1
    of relations and entities
  • Compare performance
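
Precision, recall and F1 over extracted relations can be computed by comparing sets of relation tuples. The sketch below assumes exact-match scoring of (subject, action, object) tuples; the slides leave the precise metric to be defined, and the function name `prf1` and the toy relations are illustrative only.

```python
def prf1(predicted, gold):
    """Return (precision, recall, F1) for two collections of relation tuples.

    A predicted relation counts as correct only if an identical tuple is in
    the gold set (exact match is an assumption of this sketch).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data for illustration; not results from any system.
gold = [("Peres", "meet", "Bush"), ("A", "hold", "B")]
predicted = [("Peres", "meet", "Bush"), ("A", "announce", "C")]

scores = prf1(predicted, gold)
print(scores)  # (0.5, 0.5, 0.5)
```

The same function can score entities instead of relations by passing sets of entity strings.
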