Computational linguistics - PowerPoint PPT Presentation

Provided by: carlal

Transcript and Presenter's Notes

Title: Computational linguistics


1
Computational linguistics
  • September 9, 2002
  • Overview
  • Morphology
  • CL applications
  • inflectional vs. derivational
  • morphological parsing, finite state transducers

2
Applications
  • lexicons
  • stemming
  • generating correct surface word forms

3
Terminology
  • Morpheme: smallest meaning-bearing unit
  • Free: independent words
  • Bound: affix
    • prefix (un-)
    • suffix (-ing)
    • circumfix (German ge-...-t: sagen → gesagt)
    • infix (Bontoc Igorot -in-: kayu → kinayu)
  • example from Linguistics, second edition, by
    Akmajian, Demers and Harnish, page 59

4
Concatenative vs. nonconcatenative
  • Concatenative morphology combines morphemes by
    concatenation (prefixes and suffixes demonstrate
    this)
  • Nonconcatenative morphology combines morphemes by
    means other than simple concatenation
    • circumfixes and infixes
    • templatic morphology

5
Templatic morphology
  • Semitic languages (Arabic, Hebrew)
  • stem (root), e.g. ktb (write)
  • consonant-vowel (CV) template: CVCCVC (causative)
  • vocalization: ui (perfect passive)
  • Combination: consonants in the stem map onto the Cs in
    the template and vowels in the vocalization map onto the
    Vs, yielding the surface form kuttib (the middle consonant
    t fills both medial C slots)
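
The combination step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a full account of Semitic morphology: the function name `interdigitate` is hypothetical, and the two spreading rules (double the middle root consonant when the template has one extra C slot; repeat the last vowel when the template has extra V slots) are simplifications chosen so that ktb + CVCCVC + ui comes out as kuttib.

```python
def interdigitate(root: str, template: str, vocalization: str) -> str:
    """Fill a CV template with root consonants and vocalization vowels.

    Assumed simplifications (not from the slides):
      * one extra C slot -> the middle root consonant is doubled
      * extra V slots    -> the last vowel is repeated
    """
    consonants = list(root)
    vowels = list(vocalization)
    c_slots = template.count("C")
    v_slots = template.count("V")

    # Gemination: ktb has 3 consonants, CVCCVC has 4 C slots.
    if c_slots == len(consonants) + 1:
        mid = len(consonants) // 2
        consonants.insert(mid, consonants[mid])
    if len(consonants) != c_slots:
        raise ValueError("root does not fit the template's C slots")

    if not vowels or len(vowels) > v_slots:
        raise ValueError("vocalization does not fit the template's V slots")
    # Vowel spreading: repeat the last vowel until every V slot is covered.
    while len(vowels) < v_slots:
        vowels.append(vowels[-1])

    c_iter, v_iter = iter(consonants), iter(vowels)
    surface = []
    for slot in template:
        if slot == "C":
            surface.append(next(c_iter))
        elif slot == "V":
            surface.append(next(v_iter))
        else:
            raise ValueError(f"unexpected template symbol: {slot!r}")
    return "".join(surface)


print(interdigitate("ktb", "CVCCVC", "ui"))  # kuttib
```

Keeping the root, the template and the vocalization as three separate inputs mirrors the slide: each contributes its own piece of meaning (write, causative, perfect passive), and only their combination is a pronounceable word.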

6
Try it out!
  • http://www.xrce.xerox.com/competencies/content-analysis/arabic/input/keyboard_input.html

7
Morphology
  • Inflectional morphology
    • usually does not change grammatical category
    • Ex: cat(N) + /plural/ → cats(N)
  • Derivational morphology
    • typically changes grammatical category
    • Ex: program(N) + -able → programmable(ADJ)

8
Parsing
  • refers to the recovery of structure from analysis
    of input
  • often refers to the processing of sentences
  • can also refer to the processing of words
  • Stemming refers to the recovery of a word stem
    given a surface form of the word
  • uncharacteristically → un + character + istic + ally
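
As a rough illustration of recovering a stem from a surface form, the sketch below strips known affixes around a known stem. Everything in it is an assumption made for the example: the tiny `PREFIXES`, `SUFFIXES` and `STEMS` lists, and the function names `segment` and `stem_of`. It also ignores spelling changes at morpheme boundaries, and it is plain string matching rather than a finite-state transducer.

```python
# Toy inventories, assumed for this example only.
PREFIXES = ["un"]
SUFFIXES = ["istic", "ally", "ing", "s"]
STEMS = ["character"]


def parse_suffixes(text):
    """Split text completely into known suffixes, or return None."""
    if text == "":
        return []
    for suffix in SUFFIXES:
        if text.startswith(suffix):
            rest = parse_suffixes(text[len(suffix):])
            if rest is not None:
                return [suffix] + rest
    return None


def segment(word):
    """Return [prefix?, stem, suffix, ...] for word, or None if no parse."""
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        body = word[len(prefix):]
        for stem in STEMS:
            if body.startswith(stem):
                suffixes = parse_suffixes(body[len(stem):])
                if suffixes is not None:
                    return ([prefix] if prefix else []) + [stem] + suffixes
    return None


def stem_of(word):
    """Return the stem found by segment(), or None if the word has no parse."""
    for prefix in [""] + PREFIXES:
        if word.startswith(prefix):
            body = word[len(prefix):]
            for stem in STEMS:
                if body.startswith(stem) and parse_suffixes(body[len(stem):]) is not None:
                    return stem
    return None


print(segment("uncharacteristically"))  # ['un', 'character', 'istic', 'ally']
print(stem_of("uncharacteristically"))  # character
```

`segment` returns the whole morpheme sequence (parsing in the slide's sense), while `stem_of` keeps only the stem (stemming).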

9
Lexicons
  • One approach: list all words
    • difficult in English because some morphology is
      productive
  • (table adapted from page 62 of text)

10
Options
  • Not only do some affixes attach to large numbers of
    stems, they also attach to new words in the language
    • spam, spams, spamming, spammed, spammer
  • Idea: encode morphological rules to generate all
    forms of words from a minimal set of word stems.
  • Figure 3.4, 3.5
  • Exercise 3.1, 3.2
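
The idea of generating forms from a stem can be sketched as below. This is a minimal illustration, not the method from the text: the suffix list, the function names, and the consonant-doubling rule are all assumptions. The doubling rule is deliberately crude (it looks only at spelling, not stress), so it would wrongly double in words such as "open".

```python
VOWELS = set("aeiou")
SUFFIXES = ["s", "ing", "ed", "er"]  # assumed toy suffix inventory


def needs_doubling(stem: str, suffix: str) -> bool:
    """Assumed spelling rule: double a final consonant after
    consonant + single vowel when the suffix starts with a vowel."""
    if len(stem) < 3 or suffix[0] not in VOWELS:
        return False
    before, vowel, last = stem[-3], stem[-2], stem[-1]
    return (
        before not in VOWELS
        and vowel in VOWELS
        and last not in VOWELS
        and last not in "wxy"
    )


def attach(stem: str, suffix: str) -> str:
    """Attach one suffix, applying the doubling rule where it fires."""
    if needs_doubling(stem, suffix):
        return stem + stem[-1] + suffix
    return stem + suffix


def generate_forms(stem: str) -> list:
    """Return the bare stem followed by one form per suffix."""
    return [stem] + [attach(stem, suffix) for suffix in SUFFIXES]


print(generate_forms("spam"))
# ['spam', 'spams', 'spamming', 'spammed', 'spammer']
```

Because the rules are stated once and applied to any stem, a new word such as spam needs only one new lexicon entry rather than five.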