COMP791A: Statistical Language Processing
1
COMP791A Statistical Language Processing
  • Word Sense Disambiguation
  • Chap. 7

2
Overview of the problem
  • Many words have several meanings or senses
    (homonyms or polysemous words)
  • Ex: chair --> furniture or person
  • Ex: dishes --> plates or food
  • Need to determine which sense of a word is used
    in a specific sentence
  • Note:
  • often, the different senses of a word are closely
    related
  • Ex: title --> right of legal ownership,
  • document that is evidence of the legal ownership,
  • name of a work, ...
  • often, several senses can be activated in a
    single context (co-activation)
  • Ex: This could bring competition to the trade
  • Competition --> the act of competing AND the
    people who are competing

3
Word Sense Disambiguation (WSD)
  • To determine which of the senses of an ambiguous
    word is invoked in a particular use of the word.
  • Potentially extremely useful problem
  • Ex: in machine translation
  • chair --> (person) directeur
  • chair --> (furniture) chaise
  • bureau --> desk
  • bureau --> office
  • Can be done
  • with rule-based methods
  • with statistical methods

4
WordNet
  • most widely-used lexical database for English
  • free!
  • G. Miller at Princeton
    www.cogsci.princeton.edu/wn
  • used in many applications of NLP
  • EuroWordNet
  • Dutch, Italian, Spanish, German, French, Czech
    and Estonian
  • includes entries for open-class words only
    (nouns, verbs, adjectives, adverbs)

5
WordNet Entries
  • in WordNet 1.6 (now 2.0)
  • 118,000 different word forms
  • organized according to their meanings (senses)
  • each entry has
  • a dictionary-style definition (gloss) of each
    sense
  • AND a set of domain-independent lexical relations
    among
  • WordNet's entries (words)
  • senses
  • sets of synonyms
  • grouped into synsets (i.e. sets of synonyms)

6
Example 1: WordNet entry for the verb serve
7
Rule-based WSD
  • They served green-lipped mussels from New
    Zealand.
  • Which airlines serve Denver?
  • semantic restrictions on the predicate of an
    argument
  • argument: mussels
  • --> needs a predicate with the sense
    provide-food
  • --> sense 6 of WordNet
  • argument: Denver
  • --> needs a predicate with the sense attend-to
  • --> sense 10 of WordNet

8
Example 2: WordNet entry for dish
9
Rule-based WSD
  • In our house, everybody has a career and none of
    them includes washing dishes.
  • In her tiny kitchen, Ms. Chen works efficiently,
    stir-frying several simple dishes, including
    braised pig's ears and chicken livers with green
    peppers.
  • semantic restrictions on the argument of a
    predicate
  • predicate: wash
  • --> needs an argument with the sense object
  • --> senses 1, 2 or 6 from WordNet
  • predicate: stir-fry
  • --> needs an argument with the sense food
  • --> sense 2 of WordNet

10
Problem with rule-based WSD
  • In some cases, the constraints on the predicate
    and on the argument are not enough to pinpoint
    one unique sense
  • ex: What kind of dishes do you recommend?
  • Figures of speech
  • meaning of words can be generated dynamically
  • instead of being fixed and stored in a lexicon or
    set of selectional restrictions
  • Ex: metaphor, metonymy

11
Problem with rule-based WSD (cont)
  • Metaphor
  • using words / phrases whose meanings are
    appropriate to different kinds of concepts
  • suggesting a likeness or analogy between them
  • This deal does not scare Microsoft.
  • scare has 2 senses in WordNet
  • to cause fear
  • to cause to lose courage
  • metaphor: the corporation is viewed as a person
  • She is drowning in money
  • metaphor: money is viewed as a liquid

12
Problem with rule-based WSD (cont)
  • Metonymy
  • referring to a concept by naming some other
    concept closely related to it
  • We await word from the crown.
  • a monarch is not the same thing as a crown
  • but we often refer to the monarch as "the crown"
    because the two are associated
  • Metonymy: the crown refers to the monarch
  • The White House had no comment.
  • Metonymy: The White House refers to the
    administration

13
WSD versus POS tagging
  • butter can be a verb or noun
  • I should butter my toast.
  • I like butter on my toast.
  • 2 different POS --> 2 different usages with 2
    different meanings
  • So WSD can be viewed as POS tagging (classifying
    using semantic tags rather than POS tags)
  • But the 2 tasks are considered different
    because
  • nearby structural cues (ex: is the previous word
    a determiner?)
  • are important in POS tagging
  • are not effective for WSD
  • distant content words
  • are very effective for WSD
  • are not interesting for POS
  • So
  • in POS tagging, we typically only look at the
    local context
  • in WSD, we use content words in a larger context

14
Approaches to Statistical WSD
  • Supervised Disambiguation
  • based on a labeled training set
  • The learning system has
  • a training set of feature-encoded inputs AND
  • their appropriate sense label (category)
  • Based on Lexical Resources
  • use of external lexical resources such as
    dictionaries and thesauri
  • Discourse properties
  • Unsupervised Disambiguation
  • based on unlabeled corpora
  • The learning system has
  • a training set of feature-encoded inputs BUT
  • NOT their appropriate sense label (category)

15
Approaches to Statistical WSD
  • --> Supervised Disambiguation
  • Naïve Bayes
  • Decision Trees
  • Use of Lexical Resources
  • Dictionary-based
  • Thesaurus-based
  • Translation-based
  • Discourse properties
  • Unsupervised Disambiguation

16
Supervised WSD Overview
  • A word is assumed to have a finite number of
    discrete senses.
  • The sense of a word depends on the sense of
    surrounding words
  • ex: bass --> fish, musical instrument, ...

17
Supervised WSD Overview (cont)
  • WSD is viewed as a typical classification problem
  • use machine learning techniques to train a system
  • that learns a classifier (a function f) to assign
    to unseen examples one of a fixed number of
    senses (categories)
  • f(input) = correct sense
  • Input:
  • Target word
  • The word to be disambiguated
  • Context (feature vector)
  • a vector of relevant linguistic features that
    represents its context (ex: a window of words
    around the target word)

18
Examples of Feature Vectors
  • Take a window of n words around the target word
  • Encode information about the words around the
    target word
  • typical features include words, root forms, POS
    tags, frequency, ...
  • An electric guitar and bass player stand off to
    one side, not really part of the scene, just as a
    sort of nod to gringo expectations perhaps.
  • with position information
  • (guitar, NN1), (and, CJC), (player, NN1),
    (stand, VVB)
  • no position information, but word frequency
  • fishing, big, sound, player, fly, rod, pound,
    double, runs, playing, guitar, band
  • 0,0,0,1,0,0,0,0,0,0,1,0
  • other features
  • followed by "player", contains "show" in the
    sentence,
  • yes, no,
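
A minimal sketch of how such a context window could be encoded as a bag-of-words frequency vector; the window size, the helper name context_vector and the reuse of the slide's toy vocabulary are illustrative assumptions, not something prescribed by the slides.

from collections import Counter

# Toy vocabulary taken from the slide; in practice it is built from the training corpus.
VOCAB = ["fishing", "big", "sound", "player", "fly", "rod",
         "pound", "double", "runs", "playing", "guitar", "band"]

def context_vector(tokens, target_index, window=5, vocab=VOCAB):
    # Count the words in a +/- window around the target and project onto the vocabulary.
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    context = tokens[lo:target_index] + tokens[target_index + 1:hi]
    counts = Counter(w.lower() for w in context)
    return [counts[w] for w in vocab]

sentence = ("An electric guitar and bass player stand off to one side , "
            "not really part of the scene").split()
print(context_vector(sentence, sentence.index("bass")))
# -> [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]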

19
Supervised WSD
  • Training corpus
  • Each occurrence of the ambiguous word w is
    annotated with a semantic label (its contextually
    appropriate sense sk).
  • Several approaches from ML
  • Bayesian classification
  • Decision trees
  • Neural networks
  • K-nearest neighbor (kNN)

20
Approaches to Statistical WSD
  • --> Supervised Disambiguation
  • --> Naïve Bayes
  • Decision Trees
  • Use of Lexical Resources
  • Dictionary-based
  • Thesaurus-based
  • Translation-based
  • Discourse properties
  • Unsupervised Disambiguation

21
Naïve Bayes Classification
  • Goal: choose the most probable sense s' for a
    word given a vector V of surrounding words
  • vector contains:
  • frequency of words
  • vocabulary: fishing, big, sound, player, fly,
    rod, ...
  • 0, 0, 0, 2, 1, 0, ...
  • Bayes decision rule
  • s' = argmax_sk P(sk | V)
  • where
  • S is the set of possible senses for the target
    word
  • sk is a sense in S
  • V is the feature vector (the representation of
    the context)
  • Using Bayes' rule: P(sk | V) = P(V | sk) P(sk) / P(V)

22
Decision Rule for Naive Bayes
  • But P(V) is the same for all possible senses,
    so it does not affect the final ranking of the
    senses, so we can drop it:
    s' = argmax_sk P(V | sk) P(sk)
  • To make the computations simpler, we often take
    the log of probabilities:
    s' = argmax_sk [ log P(V | sk) + log P(sk) ]

23
Naïve Bayes WSD
  • Training a Naïve Bayes classifier
  • estimating P(vj | sk) and P(sk) from a
    sense-tagged training corpus
  • using Maximum-Likelihood Estimation, perhaps
    with appropriate smoothing

P(vj | sk) = C(vj, sk) / Σt C(vt, sk)
(nb of occurrences of feature j over the total nb
of features appearing in windows of sk)
P(sk) = C(sk) / C(w)
(nb of occurrences of sense k over nb of all
occurrences of the ambiguous word w)
24
Naïve Bayes Algorithm
  • // 1. training
  • for all senses sk of word w
  • for all words vj in the vocabulary
  • compute P(vj | sk)
  • for all senses sk of word w
  • compute P(sk)
  • // 2. disambiguation
  • for all senses sk of word w
  • score(sk) = log P(sk)
  • for all words vj in the context window
  • score(sk) = score(sk) + log P(vj | sk)
  • choose the sense s' with the greatest score(sk)
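
A runnable sketch of this training/disambiguation loop; the tiny corpus, the add-one smoothing and the "<UNK>" bucket for unseen context words are illustrative assumptions rather than part of the algorithm on the slide.

import math
from collections import Counter, defaultdict

def train_naive_bayes(tagged_contexts):
    # tagged_contexts: list of (list_of_context_words, sense) pairs
    sense_counts = Counter(sense for _, sense in tagged_contexts)
    word_counts = defaultdict(Counter)            # word_counts[sense][word]
    vocab = set()
    for words, sense in tagged_contexts:
        word_counts[sense].update(words)
        vocab.update(words)
    priors = {s: math.log(c / len(tagged_contexts)) for s, c in sense_counts.items()}
    likelihoods = {}
    for s in sense_counts:
        total = sum(word_counts[s].values())
        denom = total + len(vocab) + 1            # add-one smoothing, +1 for the unseen bucket
        likelihoods[s] = {w: math.log((word_counts[s][w] + 1) / denom) for w in vocab}
        likelihoods[s]["<UNK>"] = math.log(1 / denom)
    return priors, likelihoods

def disambiguate(context_words, priors, likelihoods):
    # score(sk) = log P(sk) + sum_j log P(vj | sk); return the argmax sense
    scores = {s: priors[s] + sum(likelihoods[s].get(w, likelihoods[s]["<UNK>"])
                                 for w in context_words)
              for s in priors}
    return max(scores, key=scores.get)

corpus = [("the world bank and partners are calling".split(), "BANK1"),
          ("welcome to the bank of america".split(), "BANK1"),
          ("lounging against verdant banks carving out".split(), "BANK2")]
priors, likelihoods = train_naive_bayes(corpus)
print(disambiguate("the central bank of frankfurt".split(), priors, likelihoods))  # -> BANK1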

25
Example
  • Training corpus (context window ±3 words)
  • Today the World Bank/BANK1 and partners are
    calling for greater relief
  • Welcome to the Bank/BANK1 of America the
    nation's leading financial institution
  • Welcome to America's Job Bank/BANK1 Visit our
    site and
  • Web site of the European Central Bank/BANK1
    located in Frankfurt
  • The Asian Development Bank/BANK1 ADB a
    multilateral development finance
  • lounging against verdant banks/BANK2 carving out
    the...
  • for swimming, had warned her off the banks/BANK2
    of the Potomac. Nobody...
  • Training
  • P(the | BANK1) = 5/30      P(the | BANK2) = 3/12
  • P(world | BANK1) = 1/30    P(world | BANK2) = 0/12
  • P(and | BANK1) = 1/30      P(and | BANK2) = 0/12
  • P(off | BANK1) = 0/30      P(off | BANK2) = 1/12
  • P(Potomac | BANK1) = 0/30  P(Potomac | BANK2) = 1/12

26
Naïve Bayes Assumption
  • Independence assumption
  • The features (contextual words) are conditionally
    independent
  • Probability of an entire feature vector given a
    sense is the product of the probabilities of its
    individual features given that sense
  • Consequences
  • Bag of words model
  • the structure and linear ordering of words within
    the context is ignored.
  • The presence of one word in the bag is
    independent of another.
  • The independence assumption is incorrect but is
    useful in WSD
  • (Gale, Church & Yarowsky, 1992) report 90%
    correct disambiguation with 6 ambiguous nouns in
    the Hansard

27
Approaches to Statistical WSD
  • --> Supervised Disambiguation
  • Naïve Bayes
  • --> Decision Trees
  • Use of Lexical Resources
  • Dictionary-based
  • Thesaurus-based
  • Translation-based
  • Discourse properties
  • Unsupervised Disambiguation

28
Decision Tree Classifier
  • Bayes Classifier uses information from all words
    in the context window
  • But some words are more reliable than others to
    indicate which sense is used

29
Decision Tree Classifier (cont)
  • Look for features that are very good indicators
    of the result
  • Place these features (as questions) in nodes of a
    decision tree
  • Split the examples so that those with different
    values for the chosen feature are in a different
    set
  • Repeat the same process with another feature
  • A sequence of tests is applied to each feature
    vector
  • if test succeeds --> return the sense associated
    with the test
  • otherwise --> apply the next test
  • if all features have been tested, then return a
    default sense (most common one)

30
Example bass
Observation | Includes fish? | striped bass? | Includes guitar? | bass player? | Includes piano? | Sense
1 | Yes | Yes | No | No | No | fish
2 | Yes | Yes | No | No | No | fish
3 | No | No | Yes | No | No | instrument
4 | No | Yes | No | No | No | fish
5 | Yes | Yes | No | No | No | fish
6 | No | No | Yes | Yes | Yes | instrument
7 | No | Yes | No | No | No | fish
(decision tree diagram with yes/no branches over these features)
31
Another Example: The restaurant
Input
  • Training data

Output
32
A first decision tree
  • But is it the best decision tree we can build?

33
A better decision tree
  • 4 tests instead of 9; 11 branches instead of 21

34
Choosing the best feature
  • The key problem is choosing which feature to
    split a given set of examples
  • Most used strategy information theory

Entropy (or self-information): H = - Σi pi log2 pi
35
Choosing the best feature (con't)
  • The "discriminating power" of an attribute A
    given a set S
  • if the training set contains
  • p positive examples and
  • n negative examples

36
Some intuition
Size Color Shape Output
Big Red Circle +
Small Red Circle +
Small Red Square -
Big Blue Circle -
  • Size is the least discriminating attribute (i.e.
    smallest information gain)
  • Shape and color are the most discriminating
    attributes (i.e. highest information gain)

37
A small example
Size Color Shape Output
Big Red Circle +
Small Red Circle +
Small Red Square -
Big Blue Circle -
  • So first separate according to either color or
    shape (root of the tree)
  • Note: by definition, 0 log 0 is 0
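
A small sketch of the entropy and information-gain computation used to choose the splitting attribute, applied to the toy size/color/shape data above; representing examples as (feature-dict, label) pairs is an illustrative assumption.

import math
from collections import Counter

def entropy(labels):
    # H = - sum_i p_i log2 p_i over the label distribution (0 log 0 taken as 0)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(examples, attribute):
    # entropy of the whole set minus the weighted entropy of the subsets created by the split
    labels = [label for _, label in examples]
    remainder = 0.0
    for value in {feats[attribute] for feats, _ in examples}:
        subset = [label for feats, label in examples if feats[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder

data = [({"size": "big",   "color": "red",  "shape": "circle"}, "+"),
        ({"size": "small", "color": "red",  "shape": "circle"}, "+"),
        ({"size": "small", "color": "red",  "shape": "square"}, "-"),
        ({"size": "big",   "color": "blue", "shape": "circle"}, "-")]

for attr in ("size", "color", "shape"):
    print(attr, round(information_gain(data, attr), 3))
# size has zero gain; color and shape both gain about 0.311 bits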

38
The restaurant example
  • With the data on p.27, we have
  • So the root of the tree should be the attribute
    Patrons (we gain more information)
  • do recursively for subtrees

39
Back to WSD
  • Need to translate the French word Prendre
  • can be seen as WSD
  • possible translations/senses: take, make, rise,
    speak

Observation | Tense | Word left | Direct object | Word right | Sense
1 | | | mesure | | take
2 | | | note | | take
3 | | | exemple | | take
4 | | | decision | | make
5 | | | parole | | speak
6 | | | parole | | rise
40
Back to WSD (con't)
  • (Brown et al., 1991) found
  • On Canadian Hansard

Ambiguous word | Possible senses / translations | Best Feature | Example
Prendre | take, make, rise, speak | Direct object | Prendre une mesure --> to take ; Prendre une décision --> to make
Vouloir | to want, to like | Tense | Present --> to want ; Conditional --> to like
Cent | %, ¢ | Word to the left | Pour --> % ; Number --> ¢
41
Training Set
  • With supervised methods, we need a large
    sense-tagged training set; where do you get it
    from?
  • Using a "real" training set
  • Main standard hand sense-tagged corpora
  • SEMCOR corpus
  • portion of the Brown corpus
  • tagged with WordNet senses
  • SENSEVAL corpus (www.senseval.org/)
  • Standard WSD competition like MUC, TREC & DUC
  • Open Mind Word Expert (OMWE)
  • Using pseudowords
  • Artificial ambiguous words created by conflating
    two or more words.
  • Ex: occurrences of banana and door can be
    replaced by banana-door
  • The disambiguation algorithm can now be tested on
    this data to disambiguate the pseudoword
    banana-door into either banana or door
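
A minimal sketch of pseudoword creation; the function name and the banana/door pair follow the slide's example, everything else is an assumption.

def make_pseudoword_corpus(tokens, w1="banana", w2="door"):
    # Conflate every occurrence of w1 and w2 into the ambiguous pseudoword "w1-w2",
    # keeping the original word as the gold sense label for later evaluation.
    pseudo = w1 + "-" + w2
    new_tokens, gold = [], []
    for i, tok in enumerate(tokens):
        if tok in (w1, w2):
            new_tokens.append(pseudo)
            gold.append((i, tok))      # the WSD system must recover tok at position i
        else:
            new_tokens.append(tok)
    return new_tokens, gold

tokens = "she opened the door and ate a banana near the door".split()
print(make_pseudoword_corpus(tokens))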

42
Problems
  • With supervised (or unsupervised) methods
  • need a large amount of work to create a
    classifier for each ambiguous word!
  • So most work based on these techniques reports
    results on a few words (2 to 12 words)
  • Scaling up these approaches to deal with all
    ambiguous words is immense work!
  • Solutions:
  • use lexical resources (ex: machine-readable
    dictionaries)
  • use distributional properties to improve
    disambiguation
  • Ambiguous words are only used in one sense in any
    given discourse and with any given collocate.

43
Approaches to Statistical WSD
  • Supervised Disambiguation
  • Naïve Bayes
  • Decision-tree
  • --> Use of Lexical Resources
  • --> Dictionary-based
  • Thesaurus-based
  • Translation-based
  • Discourse properties
  • Unsupervised Disambiguation

44
WSD based on sense definitions
  • (Lesk, 1986)
  • A word's dictionary definitions are likely to be
    good indicators for the senses they define.
  • Method:
  • Express the dictionary definitions of the
    ambiguous word as bags of words
  • Express the context of the ambiguous word as a
    single bag-of-words from the dictionary
    definitions of the context words.
  • Choose the definition of the ambiguous word that
    has the greatest overlap with the words occurring
    in its context.

45
Example
  • "Cone" in dictionary
  • DEF-1 solid body which narrows to a point
  • BAG body, narrows, point, solid
  • DEF-2 something of this shape whether solid or
    hollow
  • BAG hollow, shape, something, solid
  • DEF-3 fruit of certain evergreen tree
  • BAG evergreen, fruit, tree
  • To disambiguate "cone" in "pine cone"
  • "Pine" in dictionary
  • DEF-1 kind of evergreen tree
  • DEF-2 waste away through sorrow or illness
  • --gt BAG evergreen, illness, kind, sorrow,
    tree, waste
  • so "cone" is
  • score(DEF-1) body, narrows, point, solid ?
    evergreen, illness, kind, sorrow, tree, waste
  • 0
  • score(DEF-2) hollow,shape,something,solid ?
    evergreen, illness, kind, sorrow, tree, waste
  • 0

46
The algorithm
  • For all senses sk of word w
  • score(sk) = overlap (
  • - words in the dictionary definition of sense sk
  • - the union of the words in all context windows
    that also appear in a definition of w
  • )
  • pick the sense s' with the highest score(sk)
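
A sketch of this simplified Lesk procedure on the cone / pine cone example from the previous slide; the toy dictionary is copied from the slide, while the small stop-word list (used so the bags match the slide's) is my own assumption.

# Toy dictionary copied from the slide; STOPWORDS keeps the bags close to the slide's.
STOPWORDS = {"a", "of", "or", "to", "this", "which", "whether", "through"}
DICTIONARY = {
    "cone": ["solid body which narrows to a point",
             "something of this shape whether solid or hollow",
             "fruit of certain evergreen tree"],
    "pine": ["kind of evergreen tree",
             "waste away through sorrow or illness"],
}

def bag(text):
    return {w for w in text.split() if w not in STOPWORDS}

def lesk(target, context_words, dictionary=DICTIONARY):
    # Bag of words collected from the dictionary definitions of the context words
    context_bag = set()
    for w in context_words:
        for definition in dictionary.get(w, []):
            context_bag |= bag(definition)
    # Choose the target definition whose own bag overlaps that bag the most
    best = max(range(len(dictionary[target])),
               key=lambda k: len(bag(dictionary[target][k]) & context_bag))
    return best, len(bag(dictionary[target][best]) & context_bag)

print(lesk("cone", ["pine"]))  # -> (2, 2): DEF-3 wins, overlapping on "evergreen" and "tree"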

47
Analysis
  • Accuracies of 50-70% on short samples of text
  • Problem
  • dictionary entries for the target words are
    usually relatively short
  • and may not provide sufficient material to create
    adequate classifiers
  • Because the words in the context and their
    definitions must have direct overlap
  • One solution
  • expand the list of words whose definitions make
    use of the target word
  • Example
  • if deposit does not occur in the definition of
    bank
  • but bank occurs in the definition of deposit
  • We can expand the classifier for bank to
    include deposit as a relevant feature
  • However
  • just knowing that deposit is related to bank
    does not help much
  • if we do not know to which sense of bank it is
    related
  • --> To make use of deposit as a feature, we
    have to know which sense of bank was being used
    in the definition
  • Solution
  • Use a thesaurus

48
Approaches to Statistical WSD
  • Supervised Disambiguation
  • Naïve Bayes
  • Decision-tree
  • --> Use of Lexical Resources
  • Dictionary-based
  • --> Thesaurus-based
  • Translation-based
  • Discourse properties
  • Unsupervised Disambiguation

49
Thesaurus-Based Disambiguation
  • Thesauri include tags (subject codes) in their
    entries that correspond to broad semantic
    categories
  • Each word is assigned one or more subject codes
    which correspond to its different meanings
  • ANIMAL/INSECT (category 414)
  • TOOLS/MACHINERY (category 348)
  • The semantic categories of the words in a context
    determine the semantic category of the whole
    context
  • This category determines which word senses are
    used
  • For each subject code, count the number of words
    in the context that have the same subject code
  • Select the subject code that has the highest
    count
  • Accuracy: 50% (but with difficult and highly
    ambiguous words)
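
A sketch of the subject-code counting described above; the word-to-category table is a made-up stand-in for a real thesaurus such as Roget's.

from collections import Counter

SUBJECT_CODES = {                       # hypothetical lexicon: word -> subject codes
    "fish": ["ANIMAL/INSECT"], "fishing": ["ANIMAL/INSECT"],
    "guitar": ["MUSIC"], "band": ["MUSIC"], "play": ["MUSIC"],
    "bass": ["ANIMAL/INSECT", "MUSIC"],  # the ambiguous word itself
}

def disambiguate_by_category(target, context_words, codes=SUBJECT_CODES):
    # Count, for each subject code, how many context words carry it,
    # then pick the target's code with the highest count.
    votes = Counter()
    for w in context_words:
        for code in codes.get(w, []):
            votes[code] += 1
    return max(codes[target], key=lambda c: votes[c])

print(disambiguate_by_category("bass", ["guitar", "band", "play"]))  # -> MUSIC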

50
Some Results
  • Roget categories

Word | Sense | Roget category | Accuracy (Yarowsky, 1992)
bass | musical instrument | MUSIC | 99%
bass | fish | ANIMAL, INSECT | 100%
star | space object | UNIVERSE | 96%
star | celebrity | ENTERTAINER | 95%
star | star-shaped object | INSIGNIA | 82%
interest | curiosity | REASONING | 88%
interest | advantage | INJUSTICE | 34%
interest | financial | DEBT | 90%
interest | share | PROPERTY | 38%
51
Approaches to Statistical WSD
  • Supervised Disambiguation
  • Naïve Bayes
  • Decision-tree
  • --> Use of Lexical Resources
  • Dictionary-based
  • Thesaurus-based
  • --> Translation-based
  • Discourse properties
  • Unsupervised Disambiguation

52
Translation-Based WSD
  • Words can be disambiguated by looking at how they
    are translated in other languages
  • Example: the word interest
  • To disambiguate the word interest in showed
    interest
  • German translation of show is zeigen
  • In German corpus
  • we always find zeigen interesse
  • we never find zeigen beteiligung
  • So in the original phrase showed interest,
    interest had sense2
  • To disambiguate the word interest in acquired
    an interest
  • German translation of acquired is erwarb
  • In German corpus C(erwarb, beteiligung) >
    C(erwarb, interesse)

 | sense1 | sense2
Definition | legal share | attention, concern
German Translation | Beteiligung | Interesse
English phrase | acquire an interest | show interest
Translation | erwerb eine Beteiligung | Interesse zeigen
53
Approaches to Statistical WSD
  • Supervised Disambiguation
  • Naïve Bayes
  • Decision-tree
  • Use of Lexical Resources
  • Dictionary-based
  • Thesaurus-based
  • Translation-based
  • --> Discourse properties
  • Unsupervised Disambiguation

54
Discourse Properties (Yarowsky, 1995)
  • So far, all methods have considered each
    occurrence of an ambiguous word separately
  • But
  • One sense per discourse
  • One document --> one sense
  • One sense per collocation
  • Select some nearby word that gives very strong
    clues, i.e. select words of a collocation <->
    sense of the target word
  • (Yarowsky, 1995) shows a reduction of error rate
    by 27% when using the discourse constraint!
  • i.e. assign the majority sense of the discourse
    to all occurrences of the target word
  • we can combine these 2 heuristics
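
A minimal sketch of the one-sense-per-discourse correction: classify every occurrence of the target word in a document with any base classifier (e.g. the Naïve Bayes sketch earlier), then overwrite all decisions with the document's majority sense. The function name is my own.

from collections import Counter

def one_sense_per_discourse(occurrences, classify):
    # occurrences: list of context-word lists for one target word within one document
    # classify: any per-occurrence WSD function returning a sense label
    initial = [classify(ctx) for ctx in occurrences]
    majority = Counter(initial).most_common(1)[0][0]
    return [majority] * len(occurrences)   # minority decisions are overruled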

55
Approaches to Statistical WSD
  • Supervised Disambiguation
  • Naïve Bayes
  • Decision-tree
  • Use of Lexical Resources
  • Dictionary-based
  • Thesaurus-based
  • Translation-based
  • Discourse properties
  • --> Unsupervised Disambiguation

56
Unsupervised Disambiguation
  • Disambiguate word senses
  • without supporting tools such as dictionaries and
    thesauri
  • without a labeled training text
  • Without such resources, we cannot really
    identify/label the senses
  • i.e. cannot say bank-1 or bank-2
  • we do not even know the different senses of a
    word!
  • But we can
  • Cluster/group the contexts of an ambiguous word
    into a number of groups
  • discriminate between these groups without
    actually labeling them

57
Clustering
  • Represent each instance of the ambiguous word as
    a vector <f1, f2, f3, ..., fV>
  • V is the vocabulary size
  • fi is the frequency of word i in the context.
  • each vector can be visually represented in a
    V-dimensional space

(diagram: context vectors V1, V2, V3 plotted in a space whose axes are word1, word2, word3)
58
Clustering
  • hypothesis: same senses of words will have
    similar neighboring words
  • Disambiguation algorithm
  • Identify context vectors corresponding to all
    occurrences of a particular word
  • Partition them into regions of high density
  • Tag a sense for each such region
  • Disambiguating a word
  • Compute context vector of its occurrence
  • Find the closest centroid of a region
  • Assign the occurrence the sense of that centroid
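
A sketch of this unsupervised procedure using a plain k-means over bag-of-words context vectors; the choice of k-means (rather than the EM-style clustering usually presented in the textbook), the value of k and the toy vectors are all illustrative assumptions.

import random

def dist2(u, v):
    # squared Euclidean distance between two context vectors
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, k=2, iterations=20, seed=0):
    # Partition the context vectors into k regions of high density (sense clusters).
    random.seed(seed)
    centroids = random.sample(vectors, k)
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        assignments = [min(range(k), key=lambda c: dist2(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments

def assign_sense(context_vector, centroids):
    # Disambiguation: give a new occurrence the sense of its closest centroid.
    return min(range(len(centroids)), key=lambda c: dist2(context_vector, centroids[c]))

# toy context vectors for four occurrences of an ambiguous word
vecs = [[2, 0, 1, 0], [1, 0, 2, 0], [0, 2, 0, 1], [0, 1, 0, 2]]
centroids, assignments = kmeans(vecs)
print(assignments)   # the first two vectors end up in one cluster, the last two in the other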

59
Evaluating WSD
  • Metrics
  • Accuracy: the % of words that are tagged
    correctly
  • Precision & Recall
  • Good: nb of correct answers provided by the
    system
  • Bad: nb of wrong answers provided by the system
  • Null: nb of cases in which the system doesn't
    provide any answer
  • compared to a gold standard
  • SEMCOR corpus, SENSEVAL corpus, original text
    without pseudo-words,
  • Difficulty in evaluation
  • Nature of the senses to distinguish has a huge
    impact on results
  • coarse VS fine-grained sense distinction
  • ex: chair --> person VS furniture
  • ex: bank --> financial institution VS building
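
A small sketch of the good/bad/null bookkeeping above, under the common convention that precision is computed over the answered cases and recall over all cases; the slide itself does not spell these formulas out, so treat them as an assumption.

def wsd_scores(system_answers, gold_answers):
    # system_answers may contain None where the system gave no answer (the "null" cases)
    good = sum(1 for s, g in zip(system_answers, gold_answers) if s is not None and s == g)
    bad = sum(1 for s, g in zip(system_answers, gold_answers) if s is not None and s != g)
    null = sum(1 for s in system_answers if s is None)
    precision = good / (good + bad) if (good + bad) else 0.0
    recall = good / (good + bad + null)
    accuracy = good / len(gold_answers)
    return precision, recall, accuracy

print(wsd_scores(["BANK1", None, "BANK2", "BANK1"],
                 ["BANK1", "BANK1", "BANK1", "BANK1"]))  # -> roughly (0.667, 0.5, 0.5)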

60
Bounds on Performance
  • Upper and Lower Bounds on Performance
  • Measure of how well an algorithm performs
    relative to the difficulty of the task.
  • Upper Bound
  • Human performance
  • Around 97-99% with few and clearly distinct
    senses
  • Inter-judge agreement
  • With words with clear distinct senses --> 95%
    and up
  • With polysemous words with related senses -->
    65-70%
  • Lower Bound (or baseline)
  • Usually the assignment of the most frequent sense
  • 90% is excellent for a word with 2 equiprobable
    senses
  • 90% is trivial for a word with 2 senses with
    probability ratios of 9 to 1 !!!

61
SENSEVAL (www.senseval.org)
  • Standard WSD competition like MUC, TREC & DUC
  • Goals
  • Provide a common framework to compare WSD systems
  • Standardise the task (especially evaluation
    procedures)
  • Build and distribute new lexical resources
  • Senseval-1 (1998)
  • English, French and Italian
  • HECTOR senses (Oxford University Press)
  • Senseval-2 (2001)
  • 13 languages, including Chinese
  • WordNet senses
  • Senseval-3 (March 2004)
  • 7 languages (but various tasks)
  • WordNet senses

62
Training text for "arm" (SENSEVAL-1)
  • <instance id="arm.n.om.053"> <answer
    instance="arm.n.om.053" senseid="arm10800"/>
  • <context>
  • Many <p="JJ"/> terrestrial <p="JJ"/> vertebrate
    <p="JJ"/> animals <p="NNS"/> have <p="VBP"/> four
    <p="CD"/> <ne="_NUM"/> limbs <p="NNS"/> .
    <p="."/> Those <p="DT"/> attached <p="VBN"/> to
    <p="TO"/> the <p="DT"/> thoracic <p="JJ"/>
    portion <p="NN"/> of <p="IN"/> the <p="DT"/> body
    <p="NN"/> are <p="VBP"/> called <p="VBN"/> "
    <p="""/> <head> arms <p="NNS"/> </head> .
    <p="."/> " <p="""/>
  • </context> </instance>
  • <instance id="arm.n.om.045"> <answer
    instance="arm.n.om.045" senseid="arm10602"/>
  • <context> You <p="PRP"/> are <p="VBP"/> likely
    <p="JJ"/> to <p="TO"/> find <p="VB"/> a <p="DT"/>
    rocking_chair <p="NN"/> with <p="IN"/> <head>
    arms <p="NNS"/> </head> in <p="IN"/> a <p="DT"/>
    museum <p="NN"/>
  • </context> </instance>
  • <instance id="arm.n.la.029"> <answer
    instance="arm.n.la.029" senseid="arm10601"/>
  • <context>
  • " <p="""/> Unlike <p="IN"/> Linder <p="NNP"/> ,
    <p=","/> who <p="WP"/> was <p="VBD"/> reportedly
    <p="RB"/> carrying <p="VBG"/> a <p="DT"/>
    Kalashnikov <p="NNP"/> assault_rifle <p="NN"/>
    for <p="IN"/> protection <p="NN"/> , <p=","/>
    APSNICA <p="NNP"/> volunteers <p="NNS"/> do
    <p="VBP"/> not <p="RB"/> bear <p="VB"/> <head>
    arms <p="NNS"/> </head> . <p="."/>
  • </context> </instance>

63
What is a word sense anyways?
  • A mental representation of the different meanings
    of a word
  • Experiments in psycho-linguistics
  • Ask subjects to classify index cards with
    sentences containing an ambiguous word into
    different piles
  • But inter-subject agreement is low
  • Rely on introspection
  • But introspection tends to rationalize often
    non-rational decisions
  • Ask subjects to classify ambiguous words
    according to dictionary definitions
  • Some results show high inter-subject agreement,
    some results show low agreement!!!