Morphology - PowerPoint PPT Presentation

Slides: 51
Provided by: amywei
Transcript and Presenter's Notes

Title: Morphology


1
Morphology
  • What is morphology?
  • Finite State Transducers
  • Two Level Morphology

2
What is morphology?
  • Decomposition of words into meaningful units
  • anti + dis + establish + ment + arian + ism
  • Interacts with syntax (categories and word order)
  • establish (verb) + ment → noun
  • Interacts with phonology
  • divine / divinity
  • obscene / obscenity
  • Interacts with semantics
  • boy / boys
  • Peter / Peterchen

3
Phonological String
  • morphological analyzer
  • dictionary lookup
  • syntactic analyzer
  • lexical-semantic analysis
  • discourse processing


4
  • Why store all words as morphemes rather than all morphological combinations as words?
  • What does the morphological analyzer have to output?
5
The what and the how
  • Efficient and effective algorithm to decompose categories into, or build categories from, component morphemes.
  • What this algorithm will be depends on the problems it has to solve,
  • which in turn depend on the representations computed.
  • Given a stem/lemma (e.g. jump), add material to change the category or the grammatical properties of the word: jumped, jumpable
  • Order of composition matters
  • ride / riding
  • ennoble / ennobling / nobling: Adj → V, V → V-ing
  • trance / trancing / entrance / entrancing

6
CONCATENATIVE MORPHOLOGICAL PROCESSES
  • COMPOUNDING: firefighter
  • PREFIXATION: un + well
  • INFIXATION (Tagalog): fikas (strong), f-um-ikas (be strong)
  • SUFFIXATION: kick + er
  • CIRCUMFIXATION (German): ge + sag + t (past prefix + say + past suffix)
7
Inflectional Morphology
  • Non-category-changing; required by syntax
  • Agreement: person/number
  • Je parle
  • Nous parlons
  • Ils parlent
  • Gender
  • la petite (the little one (fem))
  • le petit (the little one (masc))
  • le squelette (the skeleton)

8
Derivational Morphology
  • Changes category; not required by syntax
  • Deverbal nominals
  • -er: bake + er, catch + er
  • 'er' = agent of action: catcher of the ball
  • -tion: destroy / destruction
  • Roman's destruction of the city
  • John's catcher of the ball
  • 'John' one who caught

9
Regular vs Irregular
  • jump / jumped
  • hit / hit
  • bring / brought
  • sing / sang
Productive vs Non-Productive
  • adore / adorable, kick / kickable, fax / faxable
  • produce / production, destroy / destruction, graft / graftuction
  • bring / brought
10
Regular (English) Verbs
Morphological Form Classes     Regularly Inflected Verbs
Stem                           walk      merge     try      map
-s form                        walks     merges    tries    maps
-ing form                      walking   merging   trying   mapping
Past form or -ed participle    walked    merged    tried    mapped
11
Irregular (English) Verbs
Morphological Form Classes     Irregularly Inflected Verbs
Stem                           eat       catch      cut
-s form                        eats      catches    cuts
-ing form                      eating    catching   cutting
Past form                      ate       caught     cut
-ed participle                 eaten     caught     cut
12
To love in Spanish
13
  • Productive and rule governed
  • fax / fax + er
  • ??? Crudoy cruduction
  • Category sensitivity
  • breakable / manable
  • sensitivity / hittivity
  • Semantic sensitivity
  • un + well, un + happy
  • un + ill, un + sad

14
Store morphemes or words?
  • German: Lebensversicherungsgesellschaftsangestellter
  • leben + versicherung + gesellschafts + angestellter
  • life + insurance + company (Poss) + employee
  • Turkish: verbs have 40k forms
15
Non- concatenative Morphology
  • Templatic morphology (Semitic languages): root lmd (learn) gives lamad (he studied), limed (he taught), lumad (he was taught)
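
The templatic pattern above can be sketched as slotting root consonants into a vowel template. This is a minimal illustration, not a full analysis of any Semitic language: the function name `interdigitate` and the `C`-slot template notation are assumptions introduced here, and only the three forms from the slide are covered.

```python
def interdigitate(root, template):
    """Fill each 'C' slot in the template with the next consonant of the root."""
    consonants = iter(root)
    out = []
    for slot in template:
        out.append(next(consonants) if slot == "C" else slot)
    return "".join(out)


# Hypothetical templates matching the glosses on the slide.
TEMPLATES = {
    "CaCaC": "he studied",
    "CiCeC": "he taught",
    "CuCaC": "he was taught",
}

forms = {interdigitate("lmd", t): gloss for t, gloss in TEMPLATES.items()}
```

The point of the sketch is that the root and the vowel pattern are interleaved rather than concatenated, which is why a plain "beads on a string" model does not fit.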

16
Concatenation: Beads on a string
  • Agglutinative (concatenative) languages are well behaved for FSAs as long as we don't include phonological or spelling changes
  • Verb lexicon
  • jumped: jump
  • kissed: kiss
  • streamed: stream
  • hopped: hop, ???
[Figure: FSA with states q0, q1, q2 and arcs labeled "verb" and "ed"]
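
A minimal sketch of the verb + ed automaton, assuming the arcs run q0 → q1 on a verb stem and q1 → q2 on "ed", and assuming that the bare stem is also accepted (the slide does not say which states are final). The names `VERB_LEXICON` and `recognize` are introduced here for illustration.

```python
VERB_LEXICON = {"jump", "kiss", "stream", "hop"}


def recognize(word, lexicon=VERB_LEXICON):
    """Accept stem or stem + 'ed' by pure concatenation, with no spelling rules."""
    for stem in lexicon:
        # q0 --verb--> q1: the word must start with a stem from the lexicon
        if word.startswith(stem):
            rest = word[len(stem):]
            # q1 --ed--> q2: nothing left (bare stem) or exactly the suffix "ed"
            if rest in ("", "ed"):
                return True
    return False
```

With these assumptions `recognize("jumped")` succeeds, while `recognize("hopped")` fails and the incorrect concatenation "hoped" is accepted for hop + ed. That is the "???" case: consonant doubling is a spelling change that concatenation alone cannot express.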
17
Pieces of a Morphological Analyzer
[Figure: FSA with states q0, q1, q2, q3 and arcs labeled un, adj-root, and -er, -est, -ly]
  • Lexicon: stores the lemmas and divides them into adjective classes (really / clearly vs bigly / redly)
  • Morphotactics: the state sequence indicates the order of morpheme composition, e.g. comparative or adverb formation is by suffixation
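
One way to picture the lexicon classes and the morphotactics together is the sketch below. The class labels `ly_ok` and `no_ly`, the tiny root lists, and the function name are all assumptions made for illustration; spelling changes (e.g. big / bigger) are deliberately left out.

```python
# Hypothetical adjective classes: only the first class may take -ly.
ADJ_CLASSES = {
    "ly_ok": {"real", "clear"},
    "no_ly": {"big", "red"},
}
SUFFIXES = {
    "ly_ok": {"", "er", "est", "ly"},
    "no_ly": {"", "er", "est"},
}


def accepts_adjective(word):
    """Morphotactics: optional 'un', then an adjective root, then an allowed suffix."""
    candidates = [word]
    if word.startswith("un"):
        candidates.append(word[2:])  # take the optional prefix arc
    for cand in candidates:
        for cls, roots in ADJ_CLASSES.items():
            for root in roots:
                if cand.startswith(root) and cand[len(root):] in SUFFIXES[cls]:
                    return True
    return False
```

Under these assumptions "really" and "clearly" are accepted and "bigly" and "redly" are rejected. The sketch still overgenerates (it accepts "unbig"), which is the kind of problem finer lexicon classes are meant to address.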
18
Lexicon
  • Arranged as a TRIE (letter strings in common relative to position)
  • D-o branches to -n-k-e-y (donkey) and -g (dog)
  • Classed by part-of-speech category (noun, verb) and by morphotactic (which other affixes can precede or follow) or orthographic considerations.

19
Orthography
  • Spelling rules handle phonological or spelling variation in the orthographic form of a morpheme
  • try / trying / tries
  • cringe / cringing / cringes

20
FSA for Inflectional Morphology: English Nouns
21
FSA for Inflectional Morphology: English Verbs
22
FSA for Derivational Morphology: Adjectival Formation
23
More Complex Derivational Morphology
24
Using FSAs for Recognition: English Nouns and their Inflection
25
  • Orthographic
  • Want association between morpheme and semantic
    function
  • Want association between allographs or allophones of the same phoneme
  • Allographs
  • city / cities
  • bake / baking
  • divine / divinity
  • try / tried

26
Finite State Transducers (FSTs): the Big Idea
Need to relate the lexical level, the level that gives us the morphological analysis (plural, -able), to the surface level, which keeps track of phonological or graphological (spelling) changes.
27
Parsing vs recognition
  • An FSA can give you the string composition of a
    morphological sequence, and can tell you whether
    a given morphological string is or is not in the
    language. It recognizes the string
  • An FST parses the string. It tells you the
    morphological structure associated with the
    string. Other instances of parsing?

28
Formal definition
  • An FST defines a relation between sets of strings, i.e. a set of pairs of strings
  • It contains at least a lexical level that is a concatenation of morphemes
  • and a surface level that shows the correct spelling for each morpheme in a given context
  • e.g. noun (instantiated from lexicon) + plural
  • lexical: cat / sheep + plural
  • plural is realized as s or ε
  • surface: cats / sheep

29
  • Q: finite set of states q0 to qn
  • Σ: finite alphabet of complex symbols (feasible pairs) i:o, with one symbol i from the input alphabet and one symbol o from the output alphabet
  • q0: the start state
  • F: set of final states
  • δ(q, i:o): the transition function or matrix between states. Takes a state from Q and a complex symbol i:o from Σ and returns a new state.
  • Feasible pair: a relation of a symbol on one tape to a symbol on the other tape, e.g. pl:s
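
The definition above maps fairly directly onto a small data structure. The sketch below is an illustration under stated assumptions, not the implementation behind the slides: the class name `FST`, the method `transduce`, the symbol `+PL`, and the choice of final states are all introduced here, and the machine is assumed to be deterministic on its input side.

```python
from dataclasses import dataclass


@dataclass
class FST:
    states: set   # Q
    start: str    # q0
    finals: set   # F
    delta: dict   # (state, (input_symbol, output_symbol)) -> next state

    def transduce(self, symbols):
        """Follow arcs whose input side matches; return the output string or None."""
        state, output = self.start, []
        for symbol in symbols:
            for (q, (i, o)), nxt in self.delta.items():
                if q == state and i == symbol:
                    output.append(o)
                    state = nxt
                    break
            else:
                return None  # no feasible pair for this symbol in this state
        return "".join(output) if state in self.finals else None


# Feasible pairs c:c a:a t:t +PL:s for the single lexical entry "cat".
cat_fst = FST(
    states={"q0", "q1", "q2", "q3", "q4"},
    start="q0",
    finals={"q3", "q4"},
    delta={
        ("q0", ("c", "c")): "q1",
        ("q1", ("a", "a")): "q2",
        ("q2", ("t", "t")): "q3",
        ("q3", ("+PL", "s")): "q4",
    },
)
```

Here `cat_fst.transduce(["c", "a", "t", "+PL"])` yields the surface string "cats", and an input with no matching arc yields `None`.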
30
  • Default pair: the upper tape is the same as the lower tape (same input as output)
  • cat: c:c a:a t:t pl:s
  • Feasible pairs are either stated in the lexicon, if irregular
  • g:g o:e o:e s:s e:e (goose / geese)
  • or given by an automaton that stipulates the correspondence in a rule-governed way, if the relation is regular. If regular, they are indicated as default pairs and usually represented by one symbol.
  • FSTs are closed under
  • inversion: switches i/o labels
  • composition: running two transducers one after the other
  • union of two transducers
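
The closure operations can be illustrated on small finite relations, written as sets of (upper, lower) string pairs. This is a simplification: real FSTs can encode infinite relations, and the operations are then carried out on the machines themselves. The function names and the example strings (including the boundary symbols ^ and #) are assumptions made for this sketch.

```python
def invert(relation):
    """Inversion: swap the input and output side of every pair."""
    return {(lower, upper) for upper, lower in relation}


def compose(r1, r2):
    """Composition: feed the output of r1 into r2."""
    return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}


def union(r1, r2):
    """Union: a pair is related if either relation relates it."""
    return r1 | r2


# Hypothetical lexical -> intermediate and intermediate -> surface relations.
lexical_to_intermediate = {("fox+N+PL", "fox^s#"), ("goose+N+PL", "geese#")}
intermediate_to_surface = {("fox^s#", "foxes"), ("geese#", "geese")}

generator = compose(lexical_to_intermediate, intermediate_to_surface)
parser = invert(generator)
```

Composition gives a single lexical-to-surface relation, and inverting it turns a generator into a parser, which is why these closure properties matter in practice.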


31
Trie in the lexicon: categories are arranged by letter, one at a time, with the class at the end. Allows parallel search as long as things match, e.g. metal <N>, meta <root>: metal, meta-language
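
A minimal sketch of such a trie, assuming nested dictionaries with one letter per level and the class stored under a reserved key at the end of each entry. The key `"<class>"` and the function names are hypothetical.

```python
CLASS_KEY = "<class>"  # reserved key; cannot collide with a single letter


def insert(trie, word, category):
    """Add a word letter by letter and record its class at the final node."""
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node.setdefault(CLASS_KEY, []).append(category)


def prefixes(trie, text):
    """Return every lexicon entry that is a prefix of text, in one pass."""
    found, node = [], trie
    for i, ch in enumerate(text):
        if ch not in node:
            break
        node = node[ch]
        for category in node.get(CLASS_KEY, []):
            found.append((text[: i + 1], category))
    return found


lexicon = {}
insert(lexicon, "meta", "root")
insert(lexicon, "metal", "N")
```

A single left-to-right walk over "metal" finds both meta (root) and metal (N), which is the parallel search the slide describes: the candidates share a path for as long as the letters match.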
32
Kimmo-Based Morphological Parsing
  • Two-level morphology: lexical level and surface level (Koskenniemi 1983)
  • Finite-state transducers (FSTs): input-output pairs

33
Four-Fold View of FSTs
  • As a recognizer
  • As a generator
  • As a translator
  • As a set relater

34
Terminology for Kimmo
  • Upper: lexical tape
  • Lower: surface tape
  • Characters correspond to pairs, written a:b
  • If a = b, write a for shorthand
  • Two-level lexical entries
  • # = word boundary
  • ^ = morpheme boundary
  • Other = any feasible pair that is not in this transducer

35
Nominal Inflection FST
36
Lexical and Intermediate Tapes
37
Spelling Rules
Name                 Rule Description                              Example
Consonant Doubling   1-letter consonant doubled before -ing/-ed    beg/begging
E-deletion           Silent e dropped before -ing and -ed          make/making
E-insertion          e added after s, z, x, ch, sh before s        watch/watches
Y-replacement        -y changes to -ie before -s, -i before -ed    try/tries
K-insertion          Verbs ending with vowel + -c add -k           panic/panicked
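
The five rules in the table can be approximated with a few string tests. This is a rough sketch under stated assumptions: the function names are introduced here, the consonant-doubling test is a simple consonant-vowel-consonant check that ignores stress (so a verb like visit would be mishandled), and irregular verbs are not covered.

```python
VOWELS = set("aeiou")


def _consonant_before_last(stem):
    return len(stem) >= 2 and stem[-2] not in VOWELS


def _doubles(stem):
    """Rough CVC test for consonant doubling (ignores stress)."""
    return (len(stem) >= 3 and stem[-1] not in VOWELS and stem[-1] not in "wxy"
            and stem[-2] in VOWELS and stem[-3] not in VOWELS)


def attach(stem, suffix):
    """Attach 's', 'ing', or 'ed' to a regular verb stem using the spelling rules."""
    if suffix == "s":
        if stem.endswith(("s", "z", "x", "ch", "sh")):
            return stem + "es"                      # E-insertion
        if stem.endswith("y") and _consonant_before_last(stem):
            return stem[:-1] + "ies"                # Y-replacement before -s
        return stem + "s"
    if stem.endswith("c") and len(stem) >= 2 and stem[-2] in VOWELS:
        return stem + "k" + suffix                  # K-insertion
    if stem.endswith("e") and _consonant_before_last(stem):
        return stem[:-1] + suffix                   # E-deletion
    if suffix == "ed" and stem.endswith("y") and _consonant_before_last(stem):
        return stem[:-1] + "ied"                    # Y-replacement before -ed
    if _doubles(stem):
        return stem + stem[-1] + suffix             # Consonant doubling
    return stem + suffix
```

For example, `attach("beg", "ing")` gives "begging" and `attach("panic", "ed")` gives "panicked". A two-level system would state each rule as its own transducer instead of ordering the tests in one function.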
38
Notation
ε → e / {x, s, z} ^ __ s #
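
Read as an e-insertion rule, this notation can be sketched as a rewrite on an intermediate tape, assuming ^ marks a morpheme boundary and # a word boundary. The function name and the regular-expression encoding are choices made here for illustration; a two-level system would compile the rule into a transducer instead.

```python
import re


def intermediate_to_surface(tape):
    """Insert e after x, s, or z at a morpheme boundary when the suffix s ends the word."""
    # Context: [xsz] ^ __ s #   (the lookahead keeps "s#" in place)
    tape = re.sub(r"([xsz])\^(?=s#)", r"\1e^", tape)
    # Boundary symbols do not appear on the surface tape.
    return tape.replace("^", "").replace("#", "")
```

With this sketch "fox^s#" surfaces as "foxes", while "cat^s#" is outside the rule's context and surfaces as "cats".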
39
Intermediate-to-Surface Transducer
40
Two-Level Morphology
41
Sample Run
42
FSTs and ambiguity
  • Parse Example 1: unionizable
  • Parse Example 2: assess
43
What to do about Global Ambiguity?
  • Accept first successful structure
  • Run parser through all possible paths
  • Bias the search in some manner

44
Some Limitations
47
Stemming
  • For some applications, we don't need full morphological analysis.
  • IR: we don't care that e.g. logician is related to logical. We just want to know that if you are interested in articles about logic, you may want the former two classes as well. So we just want to get back to a root list.
  • Relate two forms by having a literal relation rule, e.g.
  • al → 0
  • Is it useful? In a big document it may not be necessary, because the word will appear in many forms, including the form in the query.

48
  • Stemming is morphologically impoverished, so error-prone
  • It can't distinguish rules that apply at morpheme boundaries from rules that apply internal to the root
  • patronization: patron + ize + ation
  • organization: organize + ation
  • But the stemmer will treat these as a single class and derive organ as an underlying root.
  • adverse / adversity
  • universe / university
49
Psycholinguistics
  • Is the human lexicon efficient in the way computational lexica are?
  • Stanners et al. (1979): where two words are related inflectionally, the root is stored and the other forms are rule derived. Where there is a derivational relationship, both forms are stored.
  • Paradigm: repetition priming
  • great, happy, peachy, adorable, round, short, great
  • small
  • Repetition priming for turns given turning, but not for select given selective

50
  • Marslen-Wilson et al. (1994): may have priming for semantically similar, derivationally related words
  • permit / permission
  • create / creativity
  • On-line versus long-term storage lexicon
  • Speech errors: "we have screw looses"