Title: MACHINE BRIDGE FOR SOUTH INDIAN LANGUAGES - PowerPoint PPT Presentation

Description:

... of similarity in their form being cognate languages. Similarities can be found in ... It will be independent of the language. ...

Slides: 17
Provided by: win1313

Transcript and Presenter's Notes


1
  • Title: MACHINE BRIDGE FOR SOUTH INDIAN LANGUAGES
  • Proposers:
  • 1. Dr. S. Rajendran
  • Head, Department of Linguistics
  • Tamil University
  • Thanjavur 613 005
  • 2. Dr. S. Baskaran
  • Head, Department of Computer Science
  • Tamil University, Thanjavur 613 005
  • Language pair: Tamil to Malayalam, Telugu, and
    Kannada

2
  • Components that will be implemented
  • Transliteration
  • POS Tagging Engine
  • Chunking Engine
  • Morphological Analyzers and Generators
  • Bilingual (Multi-lingual) dictionary
  • Transfer Engine
  • Annotated Corpora Preparation and Validation
  • Evaluation of MT system

3
  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Transliteration
  • Techniques that will be used: The languages show a
    certain amount of similarity in form because they are
    cognate languages. Similarities can also be found in
    borrowed items.
  • Preferred domain of the performance: Interstate
    governmental communication and documentation
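
The slide does not say how transliteration will be done. One common starting point, shown here only as a sketch, relies on the fact that the Unicode blocks for Tamil, Telugu, Kannada, and Malayalam follow a parallel layout, so a Tamil character can often be mapped by adding a fixed offset. The function name transliterate and the table BLOCK_START are illustrative assumptions, not part of the proposal. Some Tamil letters have no counterpart at the same offset in Telugu or Kannada, so a real system would need an exception table on top of this.

```python
# Sketch only: offset-based script conversion between parallel Unicode blocks.
# Names below are hypothetical; the proposal does not specify a method.
TAMIL_START, TAMIL_END = 0x0B80, 0x0BFF

# First code point of each target script's Unicode block.
BLOCK_START = {"telugu": 0x0C00, "kannada": 0x0C80, "malayalam": 0x0D00}


def transliterate(text, target):
    """Map each Tamil character to the same offset in the target block."""
    base = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if TAMIL_START <= cp <= TAMIL_END:
            out.append(chr(base + (cp - TAMIL_START)))
        else:
            # Spaces, digits, punctuation, and other scripts pass through.
            out.append(ch)
    return "".join(out)
```

For instance, the Tamil letter KA (U+0B95) maps to Telugu KA (U+0C15), Kannada KA (U+0C95), and Malayalam KA (U+0D15).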

4
  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: POS Tagging Engine
  • Techniques that will be used: The engine will be
    language-independent. Hidden Markov models (HMMs) will
    be used to build the tagging engine.
  • Information given by the morphological analyzer will
    be used for POS tagging.
  • Estimation of the expected performance: 85 going up to 95
  • Preferred domain of the performance: Interstate
    governmental communications and documentation
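
As an illustration of HMM-based tagging, the following is a minimal Viterbi decoder. The tag set, the probabilities, and the informally romanized Tamil words are all invented for the example; a real engine would estimate these tables from the annotated corpus, and the sketch does not model the morphological-analyzer input mentioned above.

```python
import math

# Toy model: all tags, words, and probabilities are made up for illustration.
TAGS = ["PRON", "NOUN", "VERB"]
START_P = {"PRON": 0.5, "NOUN": 0.4, "VERB": 0.1}
TRANS_P = {
    "PRON": {"PRON": 0.1, "NOUN": 0.4, "VERB": 0.5},
    "NOUN": {"PRON": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"PRON": 0.3, "NOUN": 0.4, "VERB": 0.3},
}
EMIT_P = {
    "PRON": {"avan": 0.9},
    "NOUN": {"puththakam": 0.8},
    "VERB": {"padiththaan": 0.7},
}


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for words under the HMM."""
    if not words:
        return []

    def logp(p):
        # Unseen events get a small floor probability instead of zero.
        return math.log(p if p > 0 else floor)

    # Each column maps tag -> (best log score, best previous tag).
    columns = [{
        t: (logp(start_p.get(t, 0)) + logp(emit_p[t].get(words[0], 0)), None)
        for t in tags
    }]
    for word in words[1:]:
        prev_col = columns[-1]
        col = {}
        for t in tags:
            best_prev = max(
                tags, key=lambda p: prev_col[p][0] + logp(trans_p[p].get(t, 0))
            )
            score = (prev_col[best_prev][0]
                     + logp(trans_p[best_prev].get(t, 0))
                     + logp(emit_p[t].get(word, 0)))
            col[t] = (score, best_prev)
        columns.append(col)

    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for col in reversed(columns[1:]):
        tag = col[tag][1]
        path.append(tag)
    return path[::-1]
```

Calling viterbi(["avan", "puththakam", "padiththaan"], TAGS, START_P, TRANS_P, EMIT_P) returns one tag per word. Because the decoder only consumes probability tables, the same code can serve any of the four languages once tables are trained for it.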

5
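  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Chunking Engine
  • Techniques that will be used: As with the POS tagging
    engine, chunking will be based on HMMs. Shallow parsing
    techniques will be implemented.
  • Estimation of the expected performance: 85 going up to 95
  • Preferred domain of the performance: Interstate
    governmental communications and documentation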
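
Chunking can be treated as a tagging problem by giving each token a label such as B-NP (begins a noun phrase), I-NP (inside one), or O (outside any chunk), which is what lets an HMM tagger be reused for shallow parsing. The proposal does not name an encoding, so the IOB scheme here is an assumption, and the function name and romanized example words are hypothetical. The sketch shows only the final step: turning a tagger's IOB output into chunks.

```python
# Sketch only: group tokens into chunks from IOB labels (assumed encoding).
def iob_to_chunks(tokens, iob_tags):
    """Return a list of (chunk label, chunk text) pairs."""
    chunks = []
    current = None  # (label, list of words) of the chunk being built
    for tok, tag in zip(tokens, iob_tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-")
            and (current is None or current[0] != tag[2:])
        )
        if starts_new:
            current = (tag[2:], [tok])
            chunks.append(current)
        elif tag.startswith("I-"):
            current[1].append(tok)
        else:
            # "O" or any other label closes the open chunk.
            current = None
    return [(label, " ".join(words)) for label, words in chunks]
```

With tokens ["avan", "puthiya", "puththakam", "padiththaan"] and labels ["B-NP", "B-NP", "I-NP", "B-VP"], the two middle tokens are grouped into a single NP chunk.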

6
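  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Morphological Analyzers and Generators
  • Techniques that will be used: Finite state automata
    will be used for parsing and generation.
  • Estimation of the expected performance: 85 going up to 95
  • Preferred domain of the performance: Interstate
    governmental communication and documents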
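
To make the finite-state idea concrete, here is a toy automaton whose arcs pair a surface string with an analysis label, so the same table can be walked in one direction to analyze a word and in the other to generate it. The one-word lexicon, the informal romanization, the feature labels, and the function names are all assumptions made for the example; sandhi changes at morpheme boundaries are ignored.

```python
# Sketch only: a tiny finite-state machine used for both analysis and
# generation. Lexicon, romanization, and labels are hypothetical.
# state -> list of (surface string, analysis label, next state)
ARCS = {
    "START": [("paiyan", "paiyan+N", "STEM")],
    "STEM": [("kaL", "+PL", "NUM"), ("", "+SG", "NUM")],
    "NUM": [("ai", "+ACC", "END"), ("ukku", "+DAT", "END"), ("", "+NOM", "END")],
}
FINAL = {"END"}


def analyze(word, state="START"):
    """Return every analysis string the automaton accepts for word."""
    if state in FINAL:
        return [""] if word == "" else []
    results = []
    for surface, label, nxt in ARCS.get(state, []):
        if word.startswith(surface):
            for rest in analyze(word[len(surface):], nxt):
                results.append(label + rest)
    return results


def generate(analysis, state="START"):
    """Return every surface form the automaton produces for analysis."""
    if state in FINAL:
        return [""] if analysis == "" else []
    results = []
    for surface, label, nxt in ARCS.get(state, []):
        if analysis.startswith(label):
            for rest in generate(analysis[len(label):], nxt):
                results.append(surface + rest)
    return results
```

Here analyze("paiyankaLai") walks stem, plural, and accusative arcs, while generate("paiyan+N+SG+DAT") walks the same arcs by label and emits the surface side.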

7
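  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Bilingual (Multi-lingual) dictionary
  • Techniques that will be used: Tamil-Malayalam,
    Tamil-Kannada, and Tamil-Telugu translation dictionaries
    will be prepared.
  • Estimation of the expected performance: 85 going up to 95
  • Preferred domain of the performance: Interstate
    governmental communications and documentation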

8
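  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Transfer Engine
  • Techniques that will be used: Lexical transfer grammar
    will be used.
  • Estimation of the expected performance: 85 going up to 95
  • Preferred domain of the performance: Interstate
    governmental communications and documentation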
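
The slides do not define the lexical transfer grammar. Under one simple reading, the transfer step replaces each analyzed Tamil lemma with its target-language equivalent from the bilingual dictionary and carries the grammatical features across unchanged for the target-side generator. The sketch below assumes that reading; the two dictionary entries (in informal romanization), the feature strings, and the function name are illustrative only.

```python
# Sketch only: lemma-level lexical transfer via a multilingual dictionary.
# Entries are illustrative, in informal romanization.
LEXICON = {
    "viidu": {"malayalam": "viidu", "kannada": "mane", "telugu": "illu"},    # house
    "maram": {"malayalam": "maram", "kannada": "mara", "telugu": "chettu"},  # tree
}


def transfer(analysed_tokens, target):
    """Swap each (lemma, features) pair's lemma for its target equivalent."""
    out = []
    for lemma, features in analysed_tokens:
        entry = LEXICON.get(lemma)
        if entry is not None and target in entry:
            out.append((entry[target], features))
        else:
            # Unknown lemmas pass through for later handling,
            # for example by transliteration.
            out.append((lemma, features))
    return out
```

For example, transfer([("viidu", "N+PL")], "telugu") yields [("illu", "N+PL")], leaving plural marking to the Telugu generator.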

9
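  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Annotated Corpora Preparation and Validation
  • Techniques that will be used: Automatic annotation
    followed by manual annotation. Training techniques will
    also be implemented.
  • Estimation of the expected performance: 85 going up to 95
  • Preferred domain of the performance: Interstate
    governmental communications and documentation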

10
  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component: Evaluation of MT system
  • Techniques that will be used: Suitable evaluation
    techniques will be developed in the course of time.

11
  • THANK YOU

12
  • Language pair: Tamil to Malayalam, Kannada, and Telugu
  • Name of the component
  • POS Tagging Engine
  • Techniques that will be used
  • Estimation of the expected performance
  • Preferred domain of the performance
  • Other evaluation metrics