Evaluation of Utility of LSA for Word Sense Discrimination

Slides: 24
Provided by: csC76
Learn more at: http://www.cs.cmu.edu

1
Evaluation of Utility of LSA for Word Sense Discrimination
  • Esther Levin, Mehrbod Sharifi, Jerry Ball
  • http://www-cs.ccny.cuny.edu/esther/research/lsa/

2
Outline
  • Latent Semantic Analysis (LSA)
  • Word sense discrimination through Context Group
    Discrimination Paradigm
  • Experiments
  • Sense-based clusters (supervised learning)
  • K-means clustering (unsupervised learning)
  • Homonyms vs. Polysemes
  • Conclusions

3
Latent Semantic Analysis (LSA) [Deerwester 90]
  • Represents words and passages as vectors in the
    same (low-dimensional) semantic space
  • Similarity in word meaning is defined by
    similarity of their contexts.

4
LSA Steps
  • Document-term co-occurrence matrix
    (e.g., 1151 documents × 5793 terms)
  1. Compute the SVD
  2. Reduce the dimension by keeping the k largest singular
    values
  3. Compute the new vector representations for the
    documents
  4. Our research: clustering the new context vectors
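
The SVD steps above can be sketched as follows. This is a minimal illustration, assuming NumPy and a made-up toy matrix, and it scales the left singular vectors by the singular values to obtain document vectors, which is one common convention and not necessarily the authors' exact setup. The function name `lsa_document_vectors` is hypothetical.

```python
import numpy as np

def lsa_document_vectors(X, k):
    """Return k-dimensional document vectors for a document-term matrix X."""
    # Step 1: thin SVD, X = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Step 2: keep only the k largest singular values (and matching columns of U).
    U_k, s_k = U[:, :k], s[:k]
    # Step 3: each row of U_k scaled by s_k is a document in the reduced space.
    return U_k * s_k

# Toy 4-document x 5-term count matrix (invented numbers, not the 1151 x 5793 data).
X = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 0, 1],
    [0, 0, 3, 1, 0],
    [0, 0, 1, 2, 0],
], dtype=float)

docs_2d = lsa_document_vectors(X, k=2)
print(docs_2d.shape)  # (4, 2)
```

The rows of `docs_2d` are the context vectors that would then be clustered (step 4).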

5
Context Group Discrimination Paradigm [Schütze 98]
  • Inducing senses of ambiguous words from their
    contextual similarity

Context Vectors of an ambiguous word
6
Context Group Discrimination Paradigm [Schütze 98]
  1. Cluster the context vectors
  2. Compute the centroids (sense vectors)

[Figure label: a < b]
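
The two steps on this slide (cluster, then compute centroids) can be sketched with a small K-means loop. This is an illustrative sketch only: it assumes cosine similarity for the assignment step, random initialization from the data, and invented 2-D toy vectors. The names `kmeans_cosine` and `mean_vector` are hypothetical.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans_cosine(vectors, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # assumption: initialize from data points
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Step 1: assign each context vector to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Step 2: recompute each centroid (sense vector) as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids

# Invented context vectors: three point mostly along x, three mostly along y.
contexts = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0],
            [0.1, 1.0], [0.2, 0.9], [0.0, 0.7]]
labels, sense_vectors = kmeans_cosine(contexts, k=2)
print(labels)
```

Cluster ids are arbitrary, so only the grouping of the labels is meaningful, not which group is called 0 or 1.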
7
Experiments
8
Experimental Setup
  • Corpus: [Leacock 93]
  • Line (3 senses, 1151 instances)
  • Hard (2 senses, 752 instances)
  • Serve (2 senses, 1292 instances)
  • Interest (3 senses, 2113 instances)
  • Context size: full document (small paragraph)
  • Number of clusters = number of senses

9
Research Objective
  • How well are the different senses of ambiguous words
    separated in the LSA-based vector space?
  • Parameters:
  • Dimensionality of the LSA representation
  • Distance measure:
  • L1: City Block
  • L2: Squared Euclidean
  • Cosine
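
For reference, the three distance measures can be written out as below. This is a plain-Python sketch, and it assumes the cosine measure is used as a distance in the form 1 minus cosine similarity, which the slides do not state explicitly.

```python
import math

def city_block(u, v):
    """L1: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

def squared_euclidean(u, v):
    """L2 (as named on the slide): sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def cosine_distance(u, v):
    """Assumed form: 1 - cosine similarity; depends only on direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

u, v = [1.0, 2.0, 0.0], [2.0, 0.0, 0.0]
print(city_block(u, v))         # 3.0
print(squared_euclidean(u, v))  # 5.0
print(cosine_distance(u, v))
```

Unlike L1 and L2, the cosine measure ignores vector length, so a long and a short context pointing in the same direction are at distance 0.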

10
Sense-based Clusters
  • An instance of supervised learning
  • An upper bound on unsupervised performance of
    K-means or EM
  • Not influenced by the choice of clustering
    algorithm

11
Sense-based Clusters Accuracy
  • Training: finding sense vectors based on 90% of the
    data
  • Testing: assigning the remaining 10% of the data to the
    closest sense vectors and evaluating by comparing
    this assignment to the sense tags
  • Random selection, cross-validation
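
The training and testing procedure above can be sketched as a nearest-centroid classifier. This is a simplified illustration under stated assumptions: a single random 90/10 split rather than full cross-validation, cosine similarity as the closeness measure, and invented toy data. The function name `sense_based_accuracy` is hypothetical.

```python
import math
import random

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sense_based_accuracy(vectors, senses, test_fraction=0.1, seed=0):
    rng = random.Random(seed)
    order = list(range(len(vectors)))
    rng.shuffle(order)  # random selection of the held-out instances
    n_test = max(1, round(test_fraction * len(order)))
    test_idx, train_idx = order[:n_test], order[n_test:]

    # Training: one sense vector (centroid) per sense tag, from the training part.
    by_sense = {}
    for i in train_idx:
        by_sense.setdefault(senses[i], []).append(vectors[i])
    sense_vectors = {s: centroid(vs) for s, vs in by_sense.items()}

    # Testing: assign each held-out instance to the closest sense vector.
    correct = 0
    for i in test_idx:
        predicted = max(sense_vectors,
                        key=lambda s: cosine(vectors[i], sense_vectors[s]))
        correct += predicted == senses[i]
    return correct / n_test

# Invented, well-separated toy data: 10 instances for each of two senses.
vectors = [[1.0, 0.01 * j] for j in range(10)] + [[0.01 * j, 1.0] for j in range(10)]
senses = ["A"] * 10 + ["B"] * 10
print(sense_based_accuracy(vectors, senses))  # 1.0
```

Repeating the call with different `seed` values and averaging the results would approximate the random-selection cross-validation mentioned on the slide.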

12
Evaluating Clustering Quality: Tightness and Separation
  • Dispersion: intra-cluster (K-means minimizes)
  • Silhouette: intra-cluster tightness relative to
    separation from the closest cluster

a(i): average distance of point i to all other points in the same cluster
b(i): average distance of point i to the points in the closest cluster
13
More on Silhouette Value
a(i): average of all blue lines
b(i): average of all yellow lines
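
The silhouette formula itself did not survive in the transcript. The sketch below assumes the standard definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), built from the a(i) and b(i) described above. The function name `silhouette_values` and the toy points are illustrative only.

```python
def city_block(u, v):
    return sum(abs(x - y) for x, y in zip(u, v))

def silhouette_values(points, labels, distance):
    """Per-point silhouette, assuming s(i) = (b(i) - a(i)) / max(a(i), b(i))."""
    values = []
    for i, p in enumerate(points):
        # Collect distances from point i to every other point, grouped by cluster.
        per_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                per_cluster.setdefault(labels[j], []).append(distance(p, q))
        own = per_cluster.pop(labels[i], None)
        if not own or not per_cluster:
            values.append(0.0)  # convention: singleton cluster or only one cluster
            continue
        a = sum(own) / len(own)                                 # tightness
        b = min(sum(d) / len(d) for d in per_cluster.values())  # separation
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values

points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_values(points, [0, 0, 1, 1], city_block)
bad = silhouette_values(points, [0, 1, 0, 1], city_block)
print(sum(good) / len(good))  # close to 1: tight, well-separated clusters
print(sum(bad) / len(bad))    # negative: points sit closer to the other cluster
```

Values near 1 indicate points that are much closer to their own cluster than to the nearest other cluster, while negative values indicate points that would fit the neighbouring cluster better.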
14
Evaluating Clustering Quality: Tightness and Separation
Average Silhouette Value
  • Cosine: 0.9639, L1: 0.7355, L2: 0.9271
  • Cosine: -0.0876, L1: -0.0504, L2: -0.0879
15
Sense-based Clusters: Discrimination Accuracy
Baseline: percentage of the majority sense
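
As a small illustration of the baseline, the sketch below computes the share of the most frequent sense, i.e. the accuracy of always guessing the majority sense. The tag list is invented and the function name `majority_sense_baseline` is hypothetical.

```python
from collections import Counter

def majority_sense_baseline(sense_tags):
    """Fraction of instances that carry the most frequent sense tag."""
    counts = Counter(sense_tags)
    return counts.most_common(1)[0][1] / len(sense_tags)

tags = ["a"] * 6 + ["b"] * 3 + ["c"] * 1  # made-up sense tags
print(majority_sense_baseline(tags))  # 0.6
```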
16
Sense-based Clusters: Average Silhouette Value
17
Sense-based Clusters: Results
  • Good discrimination accuracy
  • Low silhouette value
  • How is that possible?

18
Unsupervised Learning with K-means
  • Cosine measure

19
Unsupervised Learning with K-means
20
Polysemes vs. Homonyms
  • Polysemes: words with multiple related meanings
  • Homonyms: words with the same spelling but
    completely different meanings

21
Pseudo Words as Homonyms [Schütze 98]
22
Polysemes vs. Homonyms in LSA Space
  • The correlation between compactness of clusters
    and discrimination accuracy is higher for
    homonyms than for polysemes

23
Conclusions
  • Good unsupervised sense discrimination
    performance for homonyms
  • Major deterioration in sense discrimination of
    polysemes in the absence of supervision
  • Dimensionality reduction benefit is computational
    only (no peak in performance)
  • Cosine measure performs better than L1 and L2