Automatic Idiom Recognition

1
Automatic Idiom Recognition
  • Ajit Paul Singh
  • Dept. of Computing Science
  • University of Alberta

2
Motivation
  • Goal: Automatically tag idioms in English-language
    text
  • Why?
  • e.g. "Break the ice"
  • MT: Literal translation loses meaning
  • Info. Retrieval: Has nothing to do with ice
  • Malapropisms: Idioms look like them but are
    valid phrases (Hirst)
  • Word-sense disambiguation

Reference: [1]
3
Approaches
  • Statistical
  • e.g. mutual information
  • Grammatical (rules to detect idioms)
  • HPSG encodings (Erbach 1992, Riehemann 1997,
    2001)
  • Link grammars (Sleator & Temperley 1991)
  • Probabilistic CFGs

References: [2], [3], [4], [5]
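
As a rough illustration of the statistical approach, the sketch below scores adjacent word pairs by pointwise mutual information (PMI). The toy sentence and the function name `pmi_scores` are assumptions made for this example, not material from the talk, and the cited work may use a different formulation of mutual information.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return PMI (base 2) for every adjacent word pair in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus, invented for illustration only.
toy = "he tried to break the ice at the party and then break the ice again".split()
scores = pmi_scores(toy)

# List the pairs from highest to lowest PMI.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

Raw PMI tends to favour pairs of rare words, so it is usually combined with a minimum frequency threshold before candidate phrases are inspected.
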
4
Proposal
  • Examine supervised learning of grammatical models
    from tagged corpora
  • North American News Text Corpora (approx. 415m
    words, newspaper articles)
  • Penn Treebank
  • Evaluate the different grammatical models in
    idiom detection
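
One common way to learn a probabilistic CFG from a treebank is relative-frequency estimation, P(A → β) = count(A → β) / count(A). The following is a minimal sketch of that idea only. It assumes trees are encoded as nested tuples (a hypothetical encoding, not the Penn Treebank file format) and uses two invented toy trees; it is not the learning procedure proposed in the talk.

```python
from collections import Counter


def count_rules(tree, rule_counts):
    """Count productions in a tree of the form (label, child, ...).

    Leaves are plain strings; internal nodes are tuples.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, rule_counts)


def estimate_pcfg(trees):
    """Relative-frequency estimate of P(lhs -> rhs) over a list of trees."""
    rule_counts = Counter()
    for tree in trees:
        count_rules(tree, rule_counts)
    lhs_totals = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}


# Toy trees, invented for illustration only.
toy_trees = [
    ("VP", ("V", "break"), ("NP", ("DT", "the"), ("NN", "ice"))),
    ("VP", ("V", "break"), ("NP", ("DT", "the"), ("NN", "window"))),
]
probs = estimate_pcfg(toy_trees)
for (lhs, rhs), p in sorted(probs.items()):
    print(lhs, "->", " ".join(rhs), p)
```
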

5
Process & Evaluation
  • How to learn/validate grammatical rules
  • Input: Set of idioms and examples
  • Output: Grammar-based description of idioms
  • Validation
  • For a given corpus, find all instances of idiom I
    and its variants.
  • Parse corpus and mark instances of idiom I and
    its variants.
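
One reading of the validation step is that the instances found directly in the corpus serve as a reference set, and the instances marked after parsing are compared against it. The slides do not name a metric; precision and recall are assumed here as one plausible choice, and the instance positions below are invented for illustration.

```python
def precision_recall(gold, predicted):
    """Compare two collections of idiom instances.

    Each instance is a hashable identifier, here a
    (sentence index, token offset) pair.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical positions of idiom I and its variants in a corpus.
gold = {(0, 3), (2, 1), (5, 0)}       # reference instances
predicted = {(0, 3), (2, 1), (4, 2)}  # instances marked by the grammar
precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```
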

6
References
  • [1] D. Lin. Automatic Identification of
    Non-compositional Phrases. Proceedings of ACL-99,
    pp. 317-324.
  • [2] G. Erbach. Head-Driven Lexical Representation
    of Idioms in HPSG. Proceedings of the Intl.
    Conference on Idioms, Tilburg (NL), 1992.
  • [3] S. Riehemann. Idiomatic Constructions in
    HPSG, 1997.
  • [4] S. Riehemann. A Constructional Approach to
    Idioms and Word Formation. Thesis (Stanford CSLI,
    2001).
  • [5] D.D.K. Sleator and D. Temperley. Parsing
    English with a Link Grammar. Technical Report
    CMU-CS-91-196.