Sentiment Analysis - PowerPoint PPT Presentation

Slides: 16
Provided by: billc163

Transcript and Presenter's Notes

Title: Sentiment Analysis


1
Sentiment Analysis
  • An Overview of Concepts and Selected Techniques

2
Terms
  • Sentiment
  • A thought, view, or attitude, especially one
    based mainly on emotion instead of reason
  • Sentiment Analysis
  • Also known as opinion mining
  • Use of natural language processing (NLP) and
    computational techniques to automate the
    extraction or classification of sentiment from
    typically unstructured text

3
Motivation
  • Consumer information
  • Product reviews
  • Marketing
  • Consumer attitudes
  • Trends
  • Politics
  • Politicians want to know voters' views
  • Voters want to know politicians' stances and who
    else supports them
  • Social
  • Find like-minded individuals or communities

4
Problem
  • Which features to use?
  • Words (unigrams)
  • Phrases/n-grams
  • Sentences
  • How to interpret features for sentiment
    detection?
  • Bag of words (IR)
  • Annotated lexicons (WordNet, SentiWordNet)
  • Syntactic patterns
  • Paragraph structure
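
To make the feature choices above concrete, here is a minimal sketch of turning text into a bag of unigrams and bigrams. The tokenizer is a deliberately naive assumption (lowercasing plus whitespace splitting), and the function names are illustrative, not taken from the slides.

```python
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def ngrams(tokens, n):
    """Return all n-grams as space-joined strings, in order of occurrence."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_ngrams(text, max_n=2):
    """Count every 1..max_n-gram; position in the text is discarded."""
    tokens = tokenize(text)
    bag = Counter()
    for n in range(1, max_n + 1):
        bag.update(ngrams(tokens, n))
    return bag

features = bag_of_ngrams("Not good, not good at all.")
print(features["good"], features["not good"])
```

The unigram "good" alone looks positive, while the bigram "not good" keeps the negation attached, which is one reason phrases are considered as features alongside single words.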

5
Challenges
  • Harder than topical classification, for which
    bag-of-words features perform well
  • Must consider other features due to
  • Subtlety of sentiment expression
  • irony
  • expression of sentiment using neutral words
  • Domain/context dependence
  • words/phrases can mean different things in
    different contexts and domains
  • Effect of syntax on semantics

6
Approaches
  • Machine learning
  • Naïve Bayes
  • Maximum Entropy Classifier
  • SVM
  • Markov Blanket Classifier
  • Accounts for conditional feature dependencies
  • Allowed reduction of discriminating features from
    thousands of words to about 20 (movie review
    domain)
  • Unsupervised methods
  • Use lexicons

Callout: assume pairwise independent features
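
The independence assumption is easiest to see in Naïve Bayes, where each word contributes to a class score separately from every other word. The following is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing; the training reviews are invented toy data, and the function names are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

def train(labeled_docs):
    """labeled_docs: iterable of (tokens, label) pairs."""
    doc_counts = Counter()               # number of documents per class
    word_counts = defaultdict(Counter)   # word frequencies per class
    vocab = set()
    for tokens, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab

def classify(tokens, model):
    """Return the class with the highest log posterior score."""
    doc_counts, word_counts, vocab = model
    n_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / n_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                # Each word adds its own term: the independence assumption.
                score += math.log(
                    (word_counts[label][tok] + 1) / (total + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy training set, for illustration only.
model = train([
    ("great fun great cast".split(), "pos"),
    ("great acting".split(), "pos"),
    ("dull plot bad acting".split(), "neg"),
    ("bad dull boring".split(), "neg"),
])
print(classify("great cast".split(), model))
print(classify("bad plot".split(), model))
```

Because the per-word terms are simply summed, the model cannot represent interactions such as negation, which is the kind of dependency the Markov Blanket Classifier is described as accounting for.
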
7
LingPipe Polarity Classifier
  • First eliminate objective sentences, then use
    remaining sentences to classify document polarity
    (reduce noise)

8
LingPipe Polarity Classifier
  • Uses unigram features extracted from movie review
    data
  • Assumes that adjacent sentences are likely to
    have similar subjective-objective (SO) polarity
  • Uses a min-cut algorithm to efficiently extract
    subjective sentences

9
LingPipe Polarity Classifier
  • [Figure: graph for classifying three items]

10
LingPipe Polarity Classifier
  • As accurate as the baseline but uses only 22% of
    content in test data (on average)
  • Metrics suggest properties of movie review
    structure

11
SentiWordNet
  • Based on WordNet synsets
  • http://wordnet.princeton.edu/
  • Ternary classifier
  • Positive, negative, and neutral scores for each
    synset
  • Provides means of gauging sentiment for a text

12
SentiWordNet Construction
  • Created training sets of synsets, Lp and Ln
  • Start with small number of synsets with
    fundamentally positive or negative semantics,
    e.g., nice and nasty
  • Use WordNet relations, e.g., direct antonymy,
    similarity, derived-from, to expand Lp and Ln
    over K iterations
  • Lo (objective) is set of synsets not in Lp or Ln
  • Trained classifiers on training set
  • Rocchio and SVM
  • Use four values of K with each of the two
    learners to create eight classifiers with
    different precision/recall characteristics
  • As K increases, precision (P) decreases and
    recall (R) increases
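
The seed expansion step can be sketched as a simple iterative set growth. This is only an illustration under stated assumptions: it uses plain words instead of WordNet synsets, the two relation tables are invented toy data, and terms that end up in both sets are not resolved.

```python
def expand_seeds(pos_seeds, neg_seeds, similar, antonyms, k):
    """Grow Lp and Ln for k iterations over the given relations.

    similar:  term -> terms assumed to share its polarity
    antonyms: term -> terms assumed to have the opposite polarity
    """
    lp, ln = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        new_lp, new_ln = set(lp), set(ln)
        for term in lp:
            new_lp.update(similar.get(term, ()))
            new_ln.update(antonyms.get(term, ()))
        for term in ln:
            new_ln.update(similar.get(term, ()))
            new_lp.update(antonyms.get(term, ()))
        lp, ln = new_lp, new_ln
    return lp, ln

# Hypothetical relation tables, for illustration only.
SIMILAR = {
    "nice": ["pleasant"],
    "pleasant": ["agreeable"],
    "nasty": ["awful"],
    "awful": ["dreadful"],
}
ANTONYMS = {"nice": ["nasty"], "pleasant": ["unpleasant"]}

for k in (1, 2):
    lp, ln = expand_seeds({"nice"}, {"nasty"}, SIMILAR, ANTONYMS, k)
    print(k, sorted(lp), sorted(ln))
```

Each extra iteration reaches terms further from the seeds, which matches the trade-off on the slide: larger K yields bigger training sets (higher recall) whose members are less reliably positive or negative (lower precision).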

13
SentiWordNet Results
  • 24.6% of synsets with Objective < 1.0
  • Many terms are classified with some degree of
    subjectivity
  • 10.45% with Objective < 0.5
  • 0.56% with Objective < 0.125
  • Only a few terms are classified as definitively
    subjective
  • Difficult (if not impossible) to accurately
    assess performance

14
SentiWordNet: How to use it
  • Use score to select features (+/-)
  • e.g. Zhang and Zhang (2006) used words in corpus
    with subjectivity score of 0.5 or greater
  • Combine pos/neg/objective scores to calculate
    document-level score
  • e.g. Devitt and Ahmad (2007) conflated polarity
    scores with a WordNet-based graph representation
    of documents to create predictive metrics
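
Both uses above can be sketched with a tiny lexicon. Everything here is a hedged illustration: the scores are invented rather than taken from SentiWordNet, words stand in for synsets, subjectivity is taken as positive plus negative (that is, 1 minus the objective score), and averaging the positive-minus-negative difference is just one simple way to combine scores, not the method of either cited paper.

```python
# Hypothetical lexicon: word -> (positive, negative); objective = 1 - pos - neg.
LEXICON = {
    "excellent": (0.875, 0.0),
    "good": (0.625, 0.0),
    "poor": (0.0, 0.75),
    "terrible": (0.0, 0.875),
    "slow": (0.125, 0.25),
    "plot": (0.0, 0.0),
}

def subjectivity(word):
    """Positive plus negative score; 0.0 for words not in the lexicon."""
    pos, neg = LEXICON.get(word, (0.0, 0.0))
    return pos + neg

def select_features(tokens, threshold=0.5):
    """Keep only words whose subjectivity meets the threshold."""
    return [tok for tok in tokens if subjectivity(tok) >= threshold]

def document_score(tokens, threshold=0.5):
    """Average (positive - negative) over the selected words."""
    selected = select_features(tokens, threshold)
    if not selected:
        return 0.0
    total = sum(LEXICON[tok][0] - LEXICON[tok][1] for tok in selected)
    return total / len(selected)

review = "good plot excellent cast slow ending".split()
print(select_features(review))
print(document_score(review))
```

The 0.5 threshold mirrors the cut-off attributed to Zhang and Zhang (2006) on the slide; "slow" falls below it and is ignored, as are words missing from the lexicon.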

15
References
  • http://www.answers.com/sentiment, 9/22/08
  • B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs
    up? Sentiment classification using machine
    learning techniques," in Proc. Conf. on Empirical
    Methods in Natural Language Processing (EMNLP),
    pp. 79-86, 2002.
  • A. Esuli and F. Sebastiani, "SentiWordNet: A
    Publicly Available Lexical Resource for Opinion
    Mining," in Proc. of LREC 2006 - 5th Conf. on
    Language Resources and Evaluation, 2006.
  • E. Zhang and Y. Zhang, "UCSC on TREC 2006 Blog
    Opinion Mining," TREC 2006 Blog Track, Opinion
    Retrieval Task.
  • A. Devitt and K. Ahmad, "Sentiment Polarity
    Identification in Financial News: A
    Cohesion-based Approach," ACL 2007.
  • B. Pang and L. Lee, "A sentimental education:
    Sentiment analysis using subjectivity
    summarization based on minimum cuts," in Proc.
    of the 42nd Annual Meeting of the Association
    for Computational Linguistics, pp. 271-278,
    July 21-26, 2004.