1
WORD SENSE DISAMBIGUATION AN INTRODUCTION
  • By
  • NILESH.A.SHEWALE
  • M.S. (LIS) II YEAR
  • DOCUMENTATION RESEARCH AND TRAINING CENTRE
  • INDIAN STATISTICAL INSTITUTE
  • BANGALORE

2
Overview
  • Definition.
  • What is word sense?
  • Ambiguity for Humans and Computers.
  • Brief historical view.
  • Approaches or methods of WSD.
  • Application of WSD
  • Evaluation of WSD
  • Conclusion

3
Definitions
  • Word sense disambiguation is the problem of
    selecting a sense for a word from a set of
    predefined possibilities.
  • The sense inventory usually comes from a dictionary
    or thesaurus.
  • Typical methods: knowledge-intensive methods,
    supervised learning, and (sometimes) bootstrapping
    approaches.
  • Word sense discrimination is the problem of
    dividing the usages of a word into different
    meanings, without regard to any particular
    existing sense inventory.
  • Typical methods: unsupervised techniques.

5
What is word sense?
  • A word sense is one of the meanings of a word.
  • Words have different meanings depending on
    the context in which they are used in a sentence.
  • Example:
  • We went to see a play at the theater.
  • The children went out to play in the park.

7
Ambiguity for humans and computers
  • Computer versus human
  • Polysemy: many words have many possible
    meanings.
  • A computer program has no basis for knowing which
    one is appropriate, even if it is obvious to a
    human.
  • Ambiguity is rarely a problem for humans in their
    day-to-day communication, except in extreme
    cases.

8
Ambiguity for humans: newspaper headlines
  • FARMER BILL DIES IN HOUSE
  • PROSTITUTES APPEAL TO POPE
  • STOLEN PAINTING FOUND BY TREE
  • RED TAPE HOLDS UP NEW BRIDGE
  • DEER KILL 300,000
  • RESIDENTS CAN DROP OFF TREES
  • INCLUDE CHILDREN WHEN BAKING COOKIES
  • MINERS REFUSE TO WORK AFTER DEATH

9
Ambiguity for Computers!
  • The fisherman jumped off the bank and into the
    water.
  • The bank down the street was robbed!
  • Back in the day, we had an entire bank of
    computers devoted to this problem.
  • The bank in that road is entirely too steep and
    is really dangerous.
  • The plane took a bank to the left, and then
    headed off towards the mountains.

11
Brief historical view
  • Identified as a problem for Machine Translation
    (Weaver, 1949)
  • A word can often only be translated if you know
    the specific sense intended (A bill in English
    could be Vidhayk in Hindi)
  • Bar-Hillel (1960) posed the following example:
  • "Little John was looking for his toy box. Finally,
    he found it. The box was in the pen. John was
    very happy."
  • Is pen a writing instrument or an enclosure
    where children play?
  • He declared the problem unsolvable and left the
    field of MT!

12
  • 1970s - 1980s: rule-based systems
  • 1990s: corpus-based approaches; dependence on
    sense-tagged text
  • 2000s: hybrid systems; minimizing or eliminating
    use of sense-tagged text; taking advantage of
    the Web

14
Approaches to WSD
  • Knowledge-based disambiguation: uses external
    lexical resources such as dictionaries and
    thesauri.
  • Supervised disambiguation: based on a labelled
    training set. The learning system has a training
    set of feature-encoded inputs AND their
    appropriate sense labels (categories).
  • Unsupervised disambiguation: based on unlabelled
    corpora. The learning system has a training set
    of feature-encoded inputs BUT NOT their
    appropriate sense labels (categories).
  • Minimally supervised WSD

15
Knowledge based Disambiguation
  • Knowledge-based WSD: a class of WSD methods
    relying (mainly) on knowledge drawn from
    dictionaries and/or raw text.
  • Resources used: machine-readable dictionaries,
    raw corpora
  • Resources not used: manually annotated corpora
  • Scope: all open-class words

16
Machine Readable Dictionaries
  • In recent years, many dictionaries have been made
    available in machine-readable form (MRD):
  • Oxford English Dictionary
  • Collins
  • Longman Dictionary of Contemporary English
    (LDOCE)
  • Thesauri add synonymy information:
  • Roget's Thesaurus
  • Semantic networks add more semantic relations:
  • WordNet
  • EuroWordNet

17
MRD Resource for Knowledge based WSD
  • For each word in the language vocabulary, an MRD
    provides:
  • A list of meanings
  • Definitions (for all word meanings)
  • Typical usage examples (for most word meanings)
  • A thesaurus adds:
  • An explicit synonymy relation between word
    meanings
  • Example: WordNet synsets for the noun plant
  • 1. plant, works, industrial plant
  • 2. plant, flora, plant life

18
Contd
  • A semantic network adds:
  • Hypernymy / hyponymy (IS-A), meronymy / holonymy
    (PART-OF), antonymy, entailment, etc.
  • WordNet related concepts for the meaning
    plant life:
  • synset: plant, flora, plant life
  • hypernym: organism, being
  • hyponym: house plant, fungus, ...
  • meronym: plant tissue, plant part
  • holonym: Plantae, kingdom Plantae, plant kingdom

19
Lesk Algorithm
  • (Michael Lesk, 1986): identify senses of words in
    context using definition overlap
  • Identify simultaneously the correct senses for
    all words in context
  • Algorithm:
  • Retrieve from the MRD all sense definitions of
    the words to be disambiguated
  • Determine the definition overlap for all possible
    sense combinations
  • Choose the senses that lead to the highest overlap

20
  • Example: disambiguate PINE CONE
  • PINE
  • 1. kinds of evergreen tree with needle-shaped
    leaves
  • 2. waste away through sorrow or illness
  • CONE
  • 1. solid body which narrows to a point
  • 2. something of this shape whether solid or
    hollow
  • 3. fruit of certain evergreen trees

  • Definition overlap for each sense pair:
  • Pine1 ∩ Cone1 = 0      Pine2 ∩ Cone1 = 0
  • Pine1 ∩ Cone2 = 1      Pine2 ∩ Cone2 = 0
  • Pine1 ∩ Cone3 = 2      Pine2 ∩ Cone3 = 0
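
The PINE CONE example can be sketched in a few lines of Python. This is only an illustration of the overlap idea, not Lesk's original implementation: the stopword list, the crude plural stripping and the helper names (`content_words`, `lesk_pair`) are assumptions made for this sketch. Counts for weakly overlapping pairs depend on how tokens and stopwords are handled, so they may not match every cell of the table above, but the winning pair is the same.

```python
from itertools import product

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"a", "of", "or", "to", "with", "which", "this", "whether", "through"}

# Sense definitions copied from the slide.
SENSES = {
    "pine": [
        "kinds of evergreen tree with needle-shaped leaves",
        "waste away through sorrow or illness",
    ],
    "cone": [
        "solid body which narrows to a point",
        "something of this shape whether solid or hollow",
        "fruit of certain evergreen trees",
    ],
}


def content_words(definition):
    """Lowercase, drop stopwords, and crudely strip a trailing plural 's'."""
    words = set()
    for token in definition.lower().split():
        if token in STOPWORDS:
            continue
        if token.endswith("s") and len(token) > 3:
            token = token[:-1]
        words.add(token)
    return words


def lesk_pair(word_a, word_b, senses=SENSES):
    """Return (sense number of word_a, sense number of word_b, overlap)
    for the sense combination whose definitions share the most words."""
    best = None
    numbered_a = enumerate(senses[word_a], start=1)
    numbered_b = list(enumerate(senses[word_b], start=1))
    for (i, def_a), (j, def_b) in product(numbered_a, numbered_b):
        overlap = len(content_words(def_a) & content_words(def_b))
        if best is None or overlap > best[2]:
            best = (i, j, overlap)
    return best


print(lesk_pair("pine", "cone"))  # (1, 3, 2): Pine1 with Cone3, sharing "evergreen" and "tree"
```

Trying every sense combination is fine for two words, but the number of combinations grows multiplicatively with each extra word in the context.
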
21
Supervised Disambiguation
  • Supervised WSD: a class of methods that induce a
    classifier from manually sense-tagged text using
    machine learning techniques.
  • Resources: sense-tagged text; dictionary
    (implicit source of sense inventory); syntactic
    analysis (POS tagger, chunker, parser, ...)
  • Scope: typically one target word per context;
    part of speech of the target word resolved; lends
    itself to the targeted-word formulation
  • Reduces WSD to a classification problem where a
    target word is assigned the most appropriate
    sense from a given set of possibilities based on
    the context in which it occurs

22
Supervised Methodology
  • Create a sample of training data where a given
    target word is manually annotated with a sense
    from a predetermined set of possibilities.
  • One tagged word per instance (lexical sample
    disambiguation)
  • Select a set of features with which to represent
    context.
  • Co-occurrences, collocations, POS tags,
    verb-object relations, etc.
  • Convert sense-tagged training instances to
    feature vectors.

23
Contd
  • Apply a machine learning algorithm to induce a
    classifier.
  • Form: structure or relation among features
  • Parameters: strength of feature interactions
  • Convert a held-out sample of test data into
    feature vectors.
  • Correct sense tags are known but not used
  • Apply the classifier to test instances to assign
    a sense tag.
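
The supervised steps above can be sketched with a small Naive Bayes classifier, one of the algorithms listed on the next slide. Everything specific here is an assumption made for illustration: the four training sentences, the sense labels "river" and "finance", the bag-of-words features, and the add-one smoothing. A real system would use far more data and richer features.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sense-tagged training data for the target word "bank".
TRAIN = [
    ("the fisherman jumped off the bank into the water", "river"),
    ("we walked along the muddy bank of the river", "river"),
    ("the bank down the street was robbed", "finance"),
    ("she deposited money at the bank", "finance"),
]


def features(context, target="bank"):
    """Bag-of-words features: every word of the context except the target."""
    return [w for w in context.lower().split() if w != target]


def train(instances):
    """Count senses and (sense, word) co-occurrences from tagged instances."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for context, sense in instances:
        sense_counts[sense] += 1
        for w in features(context):
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def classify(context, model):
    """Pick the sense with the highest log-probability (add-one smoothing)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense in sorted(sense_counts):
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in features(context):
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


MODEL = train(TRAIN)
print(classify("the muddy bank of the river", MODEL))      # river
print(classify("money was deposited at the bank", MODEL))  # finance
```

The test sentences play the role of the held-out sample: their correct senses are known to us but are never shown to the classifier.
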

24
Supervised Learning Algorithms
  • Once data is converted to feature vector form,
    any supervised learning algorithm can be used.
    Many have been applied to WSD with good results
  • Support Vector Machines
  • Nearest Neighbor Classifiers
  • Decision Trees
  • Decision Lists
  • Naïve Bayesian Classifiers
  • Neural Networks
  • Graphical Models
  • Log Linear Models

25
Unsupervised Disambiguation
  • Unsupervised word sense discrimination: a class
    of methods that cluster words based on similarity
    of context
  • Strong Contextual Hypothesis
  • (Miller and Charles, 1991): words with similar
    meanings tend to occur in similar contexts
  • (Firth, 1957): "You shall know a word by the
    company it keeps."
  • Words that keep the same company tend to have
    similar meanings

26
  • Only use the information available in raw text,
    do not use outside knowledge sources or manual
    annotations
  • No knowledge of existing sense inventories, so
    clusters are not labelled with senses
  • Resources: large corpora
  • Scope: typically one targeted word per context to
    be discriminated
  • Equivalently, measure similarity among contexts
  • Features may be identified in separate training
    data, or in the data to be clustered

27
Contd
  • Does not assign senses or labels to clusters
  • Word Sense Discrimination reduces WSD to the
    problem of finding the targeted words that occur
    in the most similar contexts and placing them in
    a cluster
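
A minimal sketch of discrimination by context similarity, assuming bag-of-words context vectors, cosine similarity and a simple one-pass greedy clustering. The example sentences, the stopword list, the threshold value and the function names are all hypothetical; published systems use more careful feature selection and clustering algorithms.

```python
import math
from collections import Counter

# Hypothetical stopword list and raw contexts of the target word "plant".
STOPWORDS = {"the", "a", "an", "of", "in", "at", "to", "and", "was", "is", "on"}
CONTEXTS = [
    "the plant needs water and sunlight to grow",
    "workers at the plant assemble cars",
    "water the plant so its leaves grow",
    "the plant employs workers to build cars",
]


def context_vector(sentence, target):
    """Bag-of-words vector of a context, without the target and stopwords."""
    return Counter(
        w for w in sentence.lower().split() if w != target and w not in STOPWORDS
    )


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def discriminate(sentences, target, threshold=0.2):
    """Greedy clustering: put each context in the most similar existing
    cluster, or open a new cluster if none reaches the threshold."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for idx, sentence in enumerate(sentences):
        vec = context_vector(sentence, target)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            best = {"centroid": Counter(), "members": []}
            clusters.append(best)
        best["centroid"].update(vec)
        best["members"].append(idx)
    return [cluster["members"] for cluster in clusters]


print(discriminate(CONTEXTS, "plant"))  # [[0, 2], [1, 3]]
```

The output is only a grouping of context indices. Nothing in it says which group is the "living thing" sense and which is the "factory" sense, which is exactly the point made above about unlabelled clusters.
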

28
Minimally Supervised WSD
  • Supervised WSD: learning sense classifiers
    starting with annotated data
  • Minimally supervised WSD: learning sense
    classifiers from annotated data, with minimal
    human supervision
  • Examples:
  • Automatically bootstrap a corpus starting with a
    few human-annotated examples
  • Use monosemous relatives / dictionary definitions
    to automatically construct sense-tagged data
  • Rely on Web users and active learning for corpus
    annotation

29
Yarowsky's Algorithm
  • Algorithm details
  • Step 1: Store the word and its contexts, one per
    line
  • e.g. zonal distribution of plant life ...
  • Step 2: Identify a few words that represent each
    sense of the word
  • e.g. plant (manufacturing / life)
  • Step 3a: Get rules from the training set
  • plant X => A, weight
  • plant Y => B, weight
  • Step 3b: Use the rules created in 3a to classify
    all occurrences of plant in the sample set.

30
Contd
  • Step 3c: Use the one-sense-per-discourse rule to
    filter or augment this addition
  • Step 3d: Repeat Steps 3a-3b-3c iteratively.
  • Step 4: The training converges on a stable
    residual set.
  • Step 5: The result is a set of rules. Those
    rules are used to disambiguate the word
    plant.
  • e.g. plant growth => life
  • plant car => manufacturing
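
The bootstrapping loop can be sketched as follows. This is a heavily simplified illustration, not Yarowsky's actual procedure: the rule weights of Step 3a are replaced by plain voting, the one-sense-per-discourse check of Step 3c is omitted, and the toy corpus, stopword list and function names are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: one context of the target word per line (Step 1).
CONTEXTS = [
    "zonal distribution of plant life",
    "manufacturing plant produces car parts",
    "plant growth depends on zonal distribution",
    "the plant assembles car engines",
    "plant growth slows in winter",
]
# Seed collocations (Step 2): word -> sense it indicates.
SEEDS = {"life": "life", "manufacturing": "manufacturing"}
STOPWORDS = {"the", "of", "on", "in"}


def label_contexts(contexts, rules):
    """Step 3b: tag a context when its rule words vote for one clear winner."""
    labels = {}
    for i, context in enumerate(contexts):
        votes = Counter(rules[w] for w in context.split() if w in rules)
        ranked = votes.most_common(2)
        if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
            labels[i] = ranked[0][0]
    return labels


def learn_rules(contexts, labels, seeds, target):
    """Step 3a (simplified): keep words seen with exactly one sense so far."""
    senses_seen = defaultdict(set)
    for i, sense in labels.items():
        for w in contexts[i].split():
            if w != target and w not in STOPWORDS:
                senses_seen[w].add(sense)
    rules = dict(seeds)
    for w, senses in senses_seen.items():
        if len(senses) == 1:
            rules.setdefault(w, next(iter(senses)))
    return rules


def bootstrap(contexts, seeds, target="plant", max_iter=10):
    """Alternate labelling and rule learning until the labels stop changing."""
    rules, labels = dict(seeds), {}
    for _ in range(max_iter):            # Step 3d: iterate
        new_labels = label_contexts(contexts, rules)
        if new_labels == labels:         # Step 4: converged
            break
        labels = new_labels
        rules = learn_rules(contexts, labels, seeds, target)
    return labels, rules                 # Step 5: final rule set


LABELS, RULES = bootstrap(CONTEXTS, SEEDS)
print(LABELS)
print(RULES["growth"], RULES["car"])  # life manufacturing
```

Starting from two seed words, the sketch ends up tagging all five contexts and learns the collocations from the slide's final example (growth for the life sense, car for the manufacturing sense).
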

32
Applications of WSD
  • Information Retrieval (IR): word sense ambiguity
    is one of the reasons for the poor performance
    of IR systems / search engines.
  • Example: find all Web pages about cricket
  • The sport or the insect?

33
  • Machine Translation: in computational
    linguistics, publishing, and other fields, the
    use of computers to conduct large-scale
    translation operations.
  • Example: translate bill from English to Hindi
  • Is it Vidhyak or prapyak?

34
  • Information Extraction (IE): the goal is to
    automatically extract structured information.
  • Typical subtasks of IE:
  • Content noise removal
  • Named entity recognition
  • Terminology extraction
  • Relationship extraction, for example:
  • PERSON works for ORGANIZATION (extracted from the
    sentence "Bill works for IBM.")
  • PERSON located in LOCATION (extracted from the
    sentence "Bill is in India.")

35
  • Content analysis: the analysis of the general
    content of a text in terms of its ideas, themes,
    etc.
  • Word processing: a relevant application of
    natural language processing whose importance has
    been recognized.
  • Lexicography: WSD can help provide empirical
    sense groupings and statistically significant
    indicators of context for new or existing senses.
  • Semantic Web: the Semantic Web vision can
    potentially benefit from most of the
    above-mentioned applications, as it inherently
    needs domain-oriented and unrestricted sense
    disambiguation to deal with the semantics of
    (Web) documents, and to enable interoperability
    between systems, ontologies, and users.

37
Evaluation of WSD
  • Precision: the percentage of words that are
    tagged correctly, out of the words addressed by
    the system (in information retrieval, the
    fraction of retrieved documents that are relevant
    to the search)
  • Recall: the percentage of words that are tagged
    correctly, out of all words in the test set (in
    information retrieval, the fraction of the
    documents relevant to the query that are
    successfully retrieved)
  • Example:
  • Test set of 100 words
  • System attempts 75 words
  • Words correctly disambiguated: 50
  • Precision = 50 / 75 = 0.67
  • Recall = 50 / 100 = 0.50
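
The arithmetic of the example, as a small Python sketch. The function name and parameter names are chosen for this illustration only.

```python
def precision_recall(correct, attempted, total):
    """Precision = correct / attempted; recall = correct / total test words."""
    precision = correct / attempted if attempted else 0.0
    recall = correct / total if total else 0.0
    return precision, recall


# Numbers from the slide: 100 test words, 75 attempted, 50 correct.
p, r = precision_recall(correct=50, attempted=75, total=100)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

A system that attempts every word has equal precision and recall. The two figures diverge only when the system declines to tag some words.
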

38
Contd
  • There are two kinds of test corpora:
  • Lexical sample: the occurrences of a small sample
    of target words need to be disambiguated.
  • All-words: all the words in a piece of running
    text need to be disambiguated.

39
SENSEVAL
  • SENSEVALuation
  • Senseval is run by a small committee under the
    auspices of ACL-SIGLEX (the Special Interest
    Group on the LEXicon of the Association for
    Computational Linguistics)
  • Evaluation of WSD systems: http://www.senseval.org
  • Senseval 1 (1999): about 10 teams
  • Senseval 2 (2001): about 30 teams
  • Senseval 3 (2004): about 55 teams
  • Senseval 4: 2007(?)

40
  • Provides sense-annotated data for many languages,
    for several tasks
  • Languages: English, Romanian, Chinese, Basque,
    Spanish, etc.
  • Tasks: lexical sample, all words, etc.
  • Provides evaluation software
  • Provides results of other participating systems

42
Conclusion
  • WSD is commonly and traditionally characterized
    as an explicit and separate process of
    disambiguation with respect to a fixed inventory
    of word senses.
  • Words are typically assumed to have a finite and
    discrete set of senses, a gross simplification of
    the complexity of word meaning, as studied in
    lexical semantics.
  • While this characterization has been fruitful
    for research into WSD per se, it is somewhat at
    odds with what seems to be needed in real
    applications.

43
  • A sense inventory cannot be task-independent.
  • Example: the ambiguity of mouse (animal or
    device) is not relevant in English-French machine
    translation, but is relevant in information
    retrieval.
  • Different algorithms for different applications:
    completely different algorithms might be
    required by different applications. In machine
    translation, the problem takes the form of target
    word selection.
  • Word meaning does not divide up into discrete
    senses: it is in principle infinitely
    variable and context sensitive.

44
  • Lexicographers frequently discover in corpora
    loose and overlapping word meanings, and standard
    or conventional meanings extended, modulated, and
    exploited in a bewildering variety of ways.
  • The art of lexicography is to generalize from the
    corpus to definitions that evoke and explain the
    full range of meaning of a word, making it seem
    like words are well-behaved semantically

45
  • THANK YOU