Morphology For Marathi POS-Tagger, Veena Dixit, 11/10/2005

1
Morphology For Marathi POS-Tagger
Veena Dixit
11/10/2005
2
  • Contents
  • Word
  • Morphology
  • Marathi Morphology - definition of the task and
    difficulties thereto.
  • Marathi Morphology - solutions to the challenges
  • Different word classes
  • Postpositions
  • Particles
  • Interjections
  • Conjunctions
  • Pronouns
  • Adjectives
  • Adverbs
  • Verbs
  • Nouns

3
  • Words are the orthographic strings separated by
    spaces and some punctuation marks.
  • For syntax, words combine into sentences; for
    morphology, a word has internal structure and
    different inflectional forms.
  • The inflectional forms of a root word form a
    paradigm based on a principle.
  • The root word is the form that is stored in
    lexicons / dictionaries.
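
As a rough illustration of the word definition above, the following Python sketch splits a sentence on whitespace and separates punctuation marks into their own tokens. The function name `tokenize` and the punctuation set are assumptions made for this example; the slides do not say which punctuation marks count as separators.

```python
import re

# Assumed separator set; the slides only say "some punctuation marks".
PUNCTUATION = ".,;:!?()\"'"

# A token is either a run of characters that are neither whitespace nor
# punctuation, or a single punctuation mark.
_TOKEN_RE = re.compile(r"[^\s{p}]+|[{p}]".format(p=re.escape(PUNCTUATION)))


def tokenize(text):
    """Return orthographic words and punctuation marks as separate tokens."""
    return _TOKEN_RE.findall(text)
```

The same pattern works on Devanagari text, because anything that is not whitespace or listed punctuation is kept inside the word.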

4
  • What is Morphology?
  • Morphology is the study of the forms of words in
    a language, especially the different forms used
    in declensions, conjugations, and word building.
    It deals with morphemes.
  • A morpheme is the smallest component of a word
    that (a) contributes some sort of meaning or
    grammatical function to the word to which it
    belongs, and (b) cannot be decomposed into
    smaller morphemes.

5
Marathi Morphology: Definition of the task and
difficulties thereto
  • Morphological analysis of Marathi plays a
    significant role in natural language processing
    because Marathi, a pan-Indian language, is rich
    in morphology.
  • Marathi, being the language of a centrally
    situated area, is influenced by almost all
    language groups of India.
  • This makes Marathi morphology more complicated.

6
Marathi Morphology: Solutions to the challenges
  • Morphological analysis is done category by
    category.
  • The parameters that govern changes in the root
    word are identified.
  • Rules are constructed in tabular form to
    facilitate computation.
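
One way to picture rules held in tabular form is a list of rows that a small function scans. The Python sketch below is only an assumption about what such a table could look like. The column layout, the parameter name `before_postposition`, and the function names are hypothetical, and the single row is inferred from the transliterated adverb example khaali / khaalapaasuna given later in the presentation.

```python
# Each row: (word class, root ending, parameter, replacement for the ending).
# The single row is inferred from the transliterated example
# khaali -> khaalapaasuna; it is illustrative, not the author's actual table.
RULES = [
    ("adverb", "i", "before_postposition", "a"),
]


def stem_for(root, word_class, parameter, rules=RULES):
    """Return the stem chosen by the first matching row, else the root."""
    for cls, ending, param, replacement in rules:
        if cls == word_class and param == parameter and root.endswith(ending):
            return root[: len(root) - len(ending)] + replacement
    return root


def attach_postposition(root, word_class, postposition):
    """Build the surface form: rule-selected stem followed by the postposition."""
    return stem_for(root, word_class, "before_postposition") + postposition
```

With this table, `attach_postposition("khaali", "adverb", "paasuna")` yields `khaalapaasuna`. Keeping the rules as data means new categories or parameters are added as rows, without changing the lookup code.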

7
  • Marathi Word Classes
  • Nouns
  • Pronouns
  • Adjectives
  • Verbs
  • Adverbs
  • Postpositions
  • Conjunctions
  • Interjections
  • Particles
  • Punctuation Marks

8
  • Postpositions
  • A postposition is a morpheme that follows a word
    and shows the relation between that word and
    another word in the sentence.
  • Case markers and shabdayogi avyaya are both
    classified as postpositions in Marathi because
    they show the same behavior.
  • (Ref.: Classification of Words, Veena Dixit,
    Proceedings of the 26th AICL, Shillong, 2004)

9
  • Postpositions (continued)
  • In Marathi, postpositions are attached to all
    classes of words except interjections.
  • When a postposition is attached to a stem, it
    mainly produces an adverb, but it can also
    produce an adjective or a conjunction.
  • Postpositions are handled along with the other
    word classes.
  • 5 subgroups of postpositions are identified on
    the basis of the possible order of their
    attachment and the groups of words to which they
    can be attached.

10
  • Particles
  • Strings like ही (hi, "also"), च (cha, "only"),
    and सुद्धा (suddhaa, "also") are
  • sometimes attached to other words (e.g. खाली
    (khaali, "under") - खालीसुद्धा (khaalisuddhaa,
    "under also") / झाड (jhaaDa, "tree") - झाडसुद्धा
    (jhaaDasuddhaa, "tree also"))
  • and sometimes written separately (e.g. झाडाखाली
    (jhaaDaakhaali, "under the tree") - झाडाखाली
    सुद्धा (jhaaDaakhaali suddhaa, "under the tree
    also")).
  • When such a particle is attached to another
    word, the word to which it is attached does not
    get inflected.
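
Because the host word stays uninflected, an attached particle can in principle be peeled off and the remainder looked up as it stands. The Python sketch below illustrates that idea on the transliterated forms. The function name `split_particle`, the `lexicon` argument, and the spelling `suddhaa` (taken from the attached examples) are assumptions for this example, not part of the presentation.

```python
# Particles named on the slide, in transliteration.
PARTICLES = ("suddhaa", "hi", "cha")


def split_particle(token, lexicon):
    """Split a trailing particle off `token` if the remainder is a known word.

    Because the host word is not inflected, the remainder can be looked up
    unchanged. Returns (host, particle), or (token, None) if nothing matches.
    """
    # Try longer particles first so that a short one cannot shadow them.
    for particle in sorted(PARTICLES, key=len, reverse=True):
        host = token[: -len(particle)]
        if token.endswith(particle) and host in lexicon:
            return host, particle
    return token, None
```

For instance, with a hypothetical lexicon `{"khaali", "jhaaDa"}`, the token `khaalisuddhaa` splits into `khaali` and `suddhaa`, while `khaali` on its own is returned unsplit. The lexicon check guards against cutting a word that merely happens to end in the same letters as a particle.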

11
  • Interjections
  • Interjections are identified from the lexicon and
    stored to produce the tag.
  • Conjunctions
  • Conjunctions are identified from the lexicon and
    stored to produce the tag.
  • Morphology also plays a role in the case of
    conjunctions.

12
  • Conjunctions (continued)
  • When certain Marathi postpositions are attached
    to a pair of demonstrative pronouns, they
    produce a pair of conjunctions in some
    instances.
  • जो ज्यापासून (jo jyaapaasuna, "which from which")
  • तो त्यापासून (to tyaapaasuna, "that from that")
  • ज्यापासून काल सुरुवात केली, त्यापासून आज नक्कीच
    सुरुवात करायला नको. (jyaapaasuna kaala suruvaata
    keli, tyaapaasuna aaja nakkicha suruvaata
    karaayalaa nako: "One should not start from the
    (same point) from which it was started
    yesterday.")

13
  • Pronouns
  • The number of inflected forms of a pronoun is
    almost equal to the number of rules needed to
    describe them.
  • The pronouns and their inflected forms are
    finite and few compared with those of verbs and
    nouns.
  • All inflected forms of the pronouns will
    therefore be stored to produce the pronoun tag.
  • The derivational morphology of pronouns is
    handled with rules.

14
  • Pronouns (continued)
  • Inflectional forms of pronouns act as
    adjectives, e.g. माझा (maajhaa, "my"); as
    adverbs, e.g. मला (malaa, "to me"); or as
    conjunctions, e.g. जो ज्यापासून (jo jyaapaasuna,
    "which from which") and तो त्यापासून (to
    tyaapaasuna, "that from that").
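
Storing every inflected pronoun form, as described for pronouns, amounts to a direct table lookup. The Python sketch below shows the idea with two forms that appear in the presentation. The tag names `PRON-ADJ` and `PRON-ADV` and the function name `tag_pronoun` are hypothetical; the presentation does not specify its tag set.

```python
# Tiny illustrative table in transliteration. A full table would hold every
# stored inflected pronoun form. The tag names are assumptions based on the
# behavior described for maajhaa ("my") and malaa ("to me").
PRONOUN_FORMS = {
    "maajhaa": "PRON-ADJ",  # inflected pronoun acting as an adjective
    "malaa": "PRON-ADV",    # inflected pronoun acting as an adverb
}


def tag_pronoun(token, table=PRONOUN_FORMS):
    """Return the stored tag for an inflected pronoun form, or None."""
    return table.get(token)
```

A lookup like this is practical only because the set of pronoun forms is closed and small; open classes such as nouns and verbs need rules instead.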

15
  • Pronouns (continued)
  • Altogether, 29 pronouns have 526 inflectional
    forms, which are either words or stems.
  • 21 paradigms are identified, generating several
    rules.

16
  • Adjectives
  • Adjectives are of two main kinds: inflectional
    and non-inflectional.
  • Inflectional adjectives inflect for gender,
    number, and the attachment of a postposition to
    the noun they modify.
  • Adjectives in Marathi agree in gender and number
    with the nouns they modify.

17
  • Adjectives (continued)
  • All inflectional adjectives belong to one
    paradigm, which corresponds to several rules for
    generating inflectional and derivational forms
    from an adjective.
  • Most aa-ending adjectives agree with masculine
    nouns and are further inflected according to the
    gender and number of the noun they modify, e.g.
    मोकळा / मोकळी / मोकळे / मोकळ्या (mokaLaa /
    mokaLi / mokaLe / mokaLyaa, "empty").
  • There are some exceptions to this rule, such as
    जादा (jaada, "extra"), नाना (naanaa,
    "different"), and वाया (vaayaa, "wasted").
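
The single adjective paradigm and its exception list can be sketched as follows, working on the transliterated forms. This is a minimal sketch under stated assumptions: the endings are read off the mokaLaa example, the names `AA_PARADIGM_ENDINGS`, `NON_INFLECTING`, and `adjective_forms` are hypothetical, and treating every adjective that does not end in aa as non-inflecting is a simplification the slides do not state.

```python
# Endings read off the slide's example mokaLaa / mokaLi / mokaLe / mokaLyaa.
# Gender/number labels are omitted because the slide lists the forms without
# saying which features each one marks.
AA_PARADIGM_ENDINGS = ("aa", "i", "e", "yaa")

# Exceptions named on the slide, spelled as transliterated there.
NON_INFLECTING = {"jaada", "naanaa", "vaayaa"}


def adjective_forms(root):
    """Return the surface forms generated for an adjective root."""
    # Assumption: listed exceptions and adjectives not ending in "aa"
    # have a single, unchanging form.
    if root in NON_INFLECTING or not root.endswith("aa"):
        return [root]
    stem = root[:-2]
    return [stem + ending for ending in AA_PARADIGM_ENDINGS]
```

Here `adjective_forms("mokaLaa")` produces the four forms from the slide, while `adjective_forms("naanaa")` returns only `naanaa`, since the exception list is consulted before the paradigm is applied.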

18
  • Adverbs
  • Adverbs are of two main kinds: inflectional and
    non-inflectional.
  • Inflectional adverbs inflect for the attachment
    of postpositions.
  • खाली (khaali, "under") - खालपासून (khaalapaasuna,
    "from underneath")

19
  • Verbs and Nouns will be discussed in the next
    sessions.
  • Thank you.
  • Veena Dixit
  • 11/10/2005