N-Grams Conflation Approach for Arabic Text

... Workshop on Improving Web retrieval for non-English Queries ...

1
N-Grams Conflation Approach for Arabic Text
  • Farag Ahmed and Andreas Nürnberger, Otto-von-Guericke-University of Magdeburg
  • http://irgroup.cs.uni-magdeburg.de

fahmed@iws.cs.uni-magdeburg.de
2
N-gram conflation technique
  • Goal: Study a conflation method based on the n-gram
    approach, with some enhancements, and evaluate its
    performance on Arabic text.
  • What is a conflation method? Matching non-identical
    words that refer to the same principal concept.
  • Why is it important? It avoids a strong dependence
    on the exact wording of the user's query.
  • The problem
  • Document Information need


  • Query Problem

Word form variations whose English translations
contain the word "student" or "students"
[Arabic word forms, grouped on the slide into Feminine and Masculine; the Arabic text is not recoverable from this transcript]
  • [Arabic sentence not recoverable from this transcript]
    (Samsung released its new mobile xx-xx, which
    supports office and multimedia applications)
  • [Arabic question not recoverable from this transcript]
    (What is the new mobile that Samsung released?)
  • [Arabic terms not recoverable from this transcript]
    (Phone, Mobile, Samsung)

The query and the document have no term in common;
therefore, a traditional IR system will not
consider this document relevant.
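
The mismatch described above can be illustrated with a minimal Python sketch. The original Arabic terms are not available here, so the English tokens below are hypothetical stand-ins, and exact_overlap is an illustrative helper name, not something taken from the presentation. The sketch assumes a retrieval system that treats two terms as shared only when their strings are identical.

```python
def exact_overlap(query_terms, document_terms):
    """Return the terms that appear, character for character, in both lists."""
    return set(query_terms) & set(document_terms)


# Hypothetical stand-in tokens: each pair refers to the same concept,
# but the surface forms differ (as with affixed Arabic word forms).
query_terms = ["student", "mobile"]
document_terms = ["students", "mobiles"]

shared = exact_overlap(query_terms, document_terms)
# 'shared' is empty, so an exact-match system sees no evidence of relevance.
```

A conflation method aims to close this gap by treating such non-identical forms as matches.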
3
Revised n-gram
Computing Similarity Scores Based on N-Grams: equation (1) [not preserved in this transcript]
Computing Similarity Scores Based on Revised N-Grams: equation (2) [not preserved in this transcript]
The revised approach enhances the similarity score
measured for [Arabic word not recoverable] (the Alliances)
and [Arabic word not recoverable] (the Conqueror).
Figure 1. The similarity score using the pure
bigram is 85.72.
Figure 2. The similarity score using the revised
bigram is 28.57.
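
A pure bigram similarity score can be sketched as follows. The presentation's own equations are not preserved in this transcript, so this sketch assumes the widely used Dice coefficient over character bigrams, scaled to 0-100. The function names bigrams and dice_similarity are illustrative. The revised measure is not reproduced here because its definition is not available.

```python
from collections import Counter


def bigrams(word):
    """Split a word into overlapping character bigrams."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def dice_similarity(a, b):
    """Dice coefficient over bigram multisets, scaled to 0-100.

    Assumption: this is a common formulation of pure bigram
    similarity, not necessarily the exact formula on the slide.
    """
    grams_a = Counter(bigrams(a))
    grams_b = Counter(bigrams(b))
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Words shorter than two characters have no bigrams.
        return 0.0
    # Multiset intersection counts each shared bigram once per occurrence.
    common = sum((grams_a & grams_b).values())
    return 100.0 * 2 * common / total


score = dice_similarity("student", "students")
```

Because the function works on any Python string, it applies to Arabic words as well as to the English stand-ins used here. A score like this is high whenever two words share many bigrams, whether or not the words are actually related, which is the kind of case the revised measure is meant to handle.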