Title: Sentiment and Opinion
1Sentiment and Opinion
- Sept 4, 2007
- Analysis of Social Media Seminar
- William Cohen
2Announcements
- First few classes will be lectures covering some
background
- Some tools commonly used in analysis of social
media
- Some ideas that have been widely explored in
social media
- So, today: tools for sentiment and opinion!
- Next week's class will give some background on
graph analysis
- The Web as a graph: Measurements, models and
methods, Kleinberg et al., invited survey at the
International Conference on Combinatorics and
Computing, 1999.
- The PageRank citation ranking: Bringing order to
the Web, Page et al., 1999.
- Will start splitting time with students soon
- Enrolled students: expect to lead ½ a meeting
- Inspire discussion!
3Manual and Automatic Subjectivity and Sentiment
Analysis
Content cheerfully pilfered from this 250-slide
tutorial: EUROLAN SUMMER SCHOOL 2007, Semantics,
Opinion and Sentiment in Text, July 23-August 3,
University of Iasi, Romania
http://www.cs.pitt.edu/wiebe/tutorialsExtendedTalks.html
- Jan Wiebe
- Josef Ruppenhofer
- Swapna Somasundaran
- University of Pittsburgh
4Some sentences expressing opinion or something
a lot like opinion
- Wow, this is my 4th Olympus camera.
- Most voters believe that he's not going to raise
their taxes.
- The United States fears a spill-over from the
anti-terrorist campaign.
- "We foresaw electoral fraud but not daylight
robbery," Tsvangirai said.
5One motivation: Opinion Question Answering
Q: What is the international reaction to the
reelection of Robert Mugabe as President of
Zimbabwe?
A: African observers generally approved of his
victory while Western governments denounced it.
6More motivations
- Product review mining: What features of the
ThinkPad T43 do customers like and which do they
dislike?
- Review classification: Is a review positive or
negative toward the movie?
- Tracking sentiments toward topics over time: Is
anger ratcheting up or cooling down?
- Etc.
These are all ways to summarize one sort of
content that is common on blogs, bboards,
newsgroups, etc. -W
7First, some early influential papers on opinion
8Turney's paper
- Goal: classify reviews as positive or
negative.
- Labels: Epinions "recommended" / "not recommended"
ratings as given by the review authors.
- Method:
- Find (possibly) meaningful phrases from review
(e.g., bright display, inspiring lecture, ...),
based on POS patterns, like ADJ NOUN
- Estimate semantic orientation of each candidate
phrase
- Assign overall orientation of review by averaging
orientation of the phrases in the review
9Semantic orientation (SO) of phrases
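A minimal sketch of how an SO score and the review-level average could be computed, assuming Turney's PMI-based definition with the seed words "excellent" and "poor". The hit counts below are made up for illustration, and the function names and the 0.01 smoothing constant are assumptions of this sketch, not details taken from the slides.

```python
import math

def semantic_orientation(near_pos, near_neg, hits_pos, hits_neg, smoothing=0.01):
    """SO(phrase) = PMI(phrase, pos seed) - PMI(phrase, neg seed), in bits.

    near_pos / near_neg: co-occurrence counts of the phrase with each seed.
    hits_pos / hits_neg: counts of each seed word on its own.
    """
    numerator = (near_pos + smoothing) * hits_neg
    denominator = (near_neg + smoothing) * hits_pos
    return math.log2(numerator / denominator)

def classify_review(phrase_hits, hits_pos, hits_neg):
    """Average the SO of all phrases; a positive average means 'recommended'."""
    scores = [semantic_orientation(p, n, hits_pos, hits_neg)
              for p, n in phrase_hits.values()]
    average = sum(scores) / len(scores)
    label = "recommended" if average > 0 else "not recommended"
    return label, average

# Hypothetical counts: phrase -> (hits near "excellent", hits near "poor")
review = {
    "bright display": (80, 20),
    "inspiring lecture": (60, 10),
    "noisy fan": (5, 40),
}
label, average = classify_review(review, hits_pos=1000, hits_neg=1000)
print(label, round(average, 2))
```

The smoothing term only guards against division by zero when a phrase never co-occurs with one of the seeds.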
12William's picture of Jan's picture of this paper
[Diagram: seeds (excellent, poor) and phrases from the review are compared by distributional similarity (do they appear in the same contexts?) over a separate corpus (AltaVista).]
13Key ideas in Turney 2002
- Simplification:
- Classify an entire document, not a piece of it.
(Many reviews are mixed.)
- Focus on what seems important:
- Extract semantically oriented words/phrases from
the document. (Phrases are less ambiguous than
words, e.g., "Even poor students will learn a lot
from this lecture".)
- Bootstrapping/semi-supervised learning:
- To assess orientation of phrases, use some kind
of contextual similarity of phrases
14Pang et al., EMNLP 2002
15Methods
- Method one: count human-provided polar words
(sort of like Turney)
- E.g., love, wonderful, best, great, superb, still,
beautiful vs. bad, worst, stupid, waste, boring,
?, ! gives 69% accuracy on 700+/700- movie
reviews
- Method two: plain ol' text classification
- E.g., Naïve Bayes bag of words: 78.7%; SVM-lite set
of words: 82.9% was best result
- Followup work (ACL 2004) improves by:
- Classifying based on the most subjective
sentences
- Using discourse (proximity) to help predict
subjectivity
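Method one can be sketched in a few lines, using the two word lists quoted on this slide. The tokenizer and the tie handling are assumptions of this sketch; the slide does not say how either was done.

```python
import re

# Word lists as quoted on the slide (including the punctuation marks).
POSITIVE = {"love", "wonderful", "best", "great", "superb", "still", "beautiful"}
NEGATIVE = {"bad", "worst", "stupid", "waste", "boring", "?", "!"}

def tokenize(text):
    """Lowercase words plus '?' and '!' as separate tokens (an assumption)."""
    return re.findall(r"[a-z']+|[?!]", text.lower())

def classify_by_polar_words(text):
    """Label a review by comparing counts of positive and negative tokens."""
    tokens = tokenize(text)
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "tie"

print(classify_by_polar_words("A wonderful, beautiful film. I love it."))
print(classify_by_polar_words("What a stupid, boring waste of time!"))
```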
16Pang, Lee, Vaithyanathan, EMNLP 2002
A different approach
- Movie review classification using Naïve Bayes,
Maximum Entropy, SVM
- Results do not reach levels achieved in topic
categorization
- Various feature combinations (unigram, bigram,
POS, text position)
- Unigram presence works best
- Challenge: discourse structure
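To make "unigram presence" concrete, here is a small sketch of a Naive Bayes classifier whose features record only whether a word occurs in a review, not how often. The class name, the whitespace tokenizer, the add-one smoothing and the four toy training reviews are all assumptions for illustration; this is not the setup used in the paper.

```python
import math
from collections import Counter

def presence_features(text):
    """Unigram presence: each distinct word counts once per document."""
    return set(text.lower().split())

class PresenceNaiveBayes:
    def fit(self, docs, labels):
        self.class_docs = Counter(labels)
        self.word_docs = {label: Counter() for label in self.class_docs}
        for doc, label in zip(docs, labels):
            # Updating with a set adds 1 per distinct word in the document.
            self.word_docs[label].update(presence_features(doc))
        self.vocab = set()
        for counts in self.word_docs.values():
            self.vocab.update(counts)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_docs.values())
        words = presence_features(doc) & self.vocab  # ignore unseen words
        scores = {}
        for label, n_docs in self.class_docs.items():
            denom = sum(self.word_docs[label].values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in words:
                # Add-one smoothed estimate of P(word | label)
                score += math.log((self.word_docs[label][word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

train_docs = ["great great fun film", "wonderful acting great story",
              "boring stupid plot", "bad boring waste"]
train_labels = ["pos", "pos", "neg", "neg"]
model = PresenceNaiveBayes().fit(train_docs, train_labels)
print(model.predict("a great story"), model.predict("boring waste"))
```

Switching `presence_features` to return a list of tokens would give the frequency-based variant for comparison.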
17Manual and Automatic Subjectivity and Sentiment
Analysis
- Jan Wiebe
- Josef Ruppenhofer
- Swapna Somasundaran
- University of Pittsburgh
18Everyone knows that dragons don't exist. But
while this simplistic formulation may satisfy the
layman, it does not suffice for the scientific
mind. The School of Higher Neantical Nillity is
in fact wholly unconcerned with what does exist.
Indeed, the banality of existence has been so
amply demonstrated, there is no need for us to
discuss it any further here. The brilliant
Cerebron, attacking the problem analytically,
discovered three distinct kinds of dragon: the
mythical, the chimerical, and the purely
hypothetical. They were all, one might say,
nonexistent, but each nonexisted in an entirely
different way...
- Stanislaw Lem, The Cyberiad
19Preliminaries
- What do we mean by subjectivity?
- The linguistic expression of somebody's emotions,
sentiments, evaluations, opinions, beliefs,
speculations, etc.
- Wow, this is my 4th Olympus camera.
- Staley declared it to be "one hell of a
collection".
- Most voters believe that he's not going to raise
their taxes.
20Corpus Annotation: Wiebe, Wilson, Cardie
2005, Annotating Expressions of Opinions and
Emotions in Language
Leaving aside what's possible, what sort of
inferences about sentiment, opinion, etc. would we
like to be able to make?
21Overview
- Fine-grained: expression-level rather than
sentence or document level
- The photo quality was the best that I have seen
in a camera.
- Annotate:
- expressions of opinions, evaluations, emotions
- material attributed to a source, but presented
objectively
22Overview
- Fine-grained: expression-level rather than
sentence or document level
- The photo quality was the best that I have seen
in a camera.
- Annotate:
- expressions of opinions, evaluations, emotions,
beliefs
- material attributed to a source, but presented
objectively
23Overview
- Opinions, evaluations, emotions, speculations are
private states.
- They are expressed in language by subjective
expressions.
Private state: a state that is not open to
objective observation or verification.
Quirk, Greenbaum, Leech, Svartvik (1985). A
Comprehensive Grammar of the English Language.
24Overview
- Focus on three ways private states are expressed
in language:
- Direct subjective expressions
- Expressive subjective elements
- Objective speech events
25Direct Subjective Expressions
- Direct mentions of private states
- The United States fears a spill-over from the
anti-terrorist campaign.
- Private states expressed in speech events
- "We foresaw electoral fraud but not daylight
robbery," Tsvangirai said.
This implies a private state
26Expressive Subjective Elements (Banfield 1982)
- "We foresaw electoral fraud but not daylight
robbery," Tsvangirai said.
- The part of the US human rights report about
China is full of absurdities and fabrications.
Compare with more neutral wording:
"We foresaw difficulties with the electoral
process but not to this extent," Tsvangirai
said.
The part of the US human rights report
about China contains many statements that we were
unable to verify.
27Objective Speech Events
- Material attributed to a source, but presented as
objective fact
- The government, it added, has amended the
Pakistan Citizenship Act 10 of 1951 to enable
women of Pakistani descent to claim Pakistani
nationality for their children born to foreign
husbands.
What does this have to do with opinion? You need
it to sort out who has opinions about what. -W
28Nested Sources
(Writer)
29Nested Sources
(Writer, Xirao-Nima)
30Nested Sources
(Writer, Xirao-Nima)
(Writer, Xirao-Nima)
31"The report is full of absurdities," Xirao-Nima
said the next day.
Objective speech event
- anchor: the entire sentence
- source:
- implicit: true
Attributes
- The anchor is the linguistic expression (the
stretch of text) that tells us that there is a
private state. Where to hang the annotation. -W
- The source is the person to whom the private
state is attributed. Note that this can be a
chain of people.
- The target is the content of the private state
or what the private state is about.
- Attitude type: if not specified, it is to be
understood as neutral, but it can be set to
positive or negative as required.
- Intensity records the intensity of the private
state as a whole. What? -W
Direct subjective
- anchor: said
- source:
- intensity: high
- expression intensity: neutral
- attitude type: negative
- target: report
Expressive subjective element
- anchor: full of absurdities
- source:
- intensity: high
- attitude type: negative
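One way to picture these annotation frames is as a small record type. The sketch below is illustrative only: the class and field names are invented here, and the source chains are an assumption based on the Nested Sources slides, since the source values did not survive in this slide's text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Frame:
    kind: str                       # e.g. "direct subjective"
    anchor: str                     # stretch of text the annotation hangs on
    source: Tuple[str, ...]         # chain of sources, outermost first
    target: Optional[str] = None
    attitude_type: str = "neutral"  # neutral unless set otherwise
    intensity: Optional[str] = None
    expression_intensity: Optional[str] = None
    implicit: bool = False

# Frames for: "The report is full of absurdities," Xirao-Nima said the next day.
# Source chains are assumed, following the Nested Sources slides.
frames = [
    Frame("objective speech event", "the entire sentence", ("writer",),
          implicit=True),
    Frame("direct subjective", "said", ("writer", "Xirao-Nima"),
          target="report", attitude_type="negative",
          intensity="high", expression_intensity="neutral"),
    Frame("expressive subjective element", "full of absurdities",
          ("writer", "Xirao-Nima"),
          attitude_type="negative", intensity="high"),
]

def opinions(frames):
    """List (innermost source, attitude type, target) for subjective frames."""
    return [(f.source[-1], f.attitude_type, f.target)
            for f in frames if f.kind != "objective speech event"]

print(opinions(frames))
```

Keeping the source as a chain rather than a single name is what lets a system separate what the writer asserts from what the writer reports someone else as saying.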
32Corpus
- www.cs.pitt.edu/mqpa/databaserelease (version 2)
- English language versions of articles from the
world press (187 news sources)
- Themes of the instructions:
- No rules about how particular words should be
annotated.
- Don't take expressions out of context and think
about what they could mean, but judge them as
they are used in that sentence.
- Kappa around 0.7-0.8.
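For readers unfamiliar with kappa: it measures agreement between annotators after correcting for the agreement expected by chance. The sketch below assumes Cohen's kappa for two annotators, which may not be the exact variant used for this corpus, and the labels are invented for illustration.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both annotators pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)

# Hypothetical subjective ("s") / objective ("o") judgments on ten sentences.
annotator_1 = ["s", "s", "s", "o", "o", "o", "s", "o", "s", "o"]
annotator_2 = ["s", "s", "o", "o", "o", "o", "s", "o", "s", "s"]
print(round(cohen_kappa(annotator_1, annotator_2), 2))
```

Here the annotators agree on 8 of 10 items, but since chance alone would give 50% agreement, kappa is lower than the raw agreement rate.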
33More reasons for fine-grain annotation and
analysis
- Turney & Pang et al.: document D is about a known
product P_D, sentiment refers to P_D. Life is more
complicated:
- The part of the US human rights report about
China is full of absurdities and fabrications.
- What is absurd/fabricated? The part, the US,
the report, or China?
- For sentiment about products we want to know what
is good or bad; there are usually tradeoffs:
- Huge screen → very heavy
- Very fast → really expensive
34And more
35Demos
1) opinmind.com: searches for positive/negative
sentiments about search term
sample queries: iphone; google vs. microsoft
(http://www.opinmind.com/search.jsp?qgooglevsmicrosoft);
cmu vs. stanford
36Demos
2) opine (Ana-Maria Popescu, Bao Nguyen, Oren
Etzioni): sentiment-feature labeling of hotel
reviews
http://www.cs.washington.edu/research/knowitall/opine/
new york; new york, attribute: bed
37Demos
3) OASYS: sentiment analysis of news sources,
sliced by country and source
http://oasys.umiacs.umd.edu/oasysnew/oasys.php
login/password: guest3 (run w/ allow pop-ups)
does anaphora resolution
sample queries: Musharraf, Karzai, Apple, Dell
38Demos
4) TextMap: entity sentiment over news & blogs
news: http://www.textmap.com
blog: http://www.textblg.com
http://www.icwsm.org/papers/3--Godbole-Srinivasaiah-Skiena.pdf
Daily sentiment report, top entities, heatmaps
39Demos
5) Moodviews: http://ilps.science.uva.nl/MoodViews/
Moodteller - predict aggregate mood from text
Moodspotter - explain discrepancies between
predicted and actual aggregate mood
40And more
- Quick overview of some of Jan's other slides
41Dave, Lawrence, Pennock 2003: Mining the Peanut
Gallery: Opinion Extraction and Semantic
Classification of Product Reviews
- Product-level review classification
- Train Naïve Bayes classifier using a corpus of
self-tagged reviews available from major web
sites (CNET, Amazon)
- Refine the classifier using the same corpus
before evaluating it on sentences mined from
broad web searches
42Hu & Liu 2004: Mining Opinion Features in Customer
Reviews
- Here: explicit product features only, expressed
as nouns or compound nouns
- Use association rule mining technique rather than
symbolic or statistical approach to terminology
- Extract associated items (item-sets) based on
support (>1%)
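The support-based step can be illustrated with a brute-force count of noun item-sets. This sketch assumes each review sentence has already been reduced to its nouns, and it enumerates all small item-sets instead of using the pruning of a real association rule miner; the function name, the toy sentences and the high threshold are chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support=0.01, max_size=2):
    """Return {itemset: support} for item-sets in at least min_support of
    the transactions. Brute force: fine for a toy example only."""
    n = len(transactions)
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {itemset: count / n
            for itemset, count in counts.items()
            if count / n >= min_support}

# Hypothetical nouns extracted from five review sentences.
sentences = [
    ["picture", "quality"],
    ["battery", "life"],
    ["picture", "quality", "zoom"],
    ["battery"],
    ["picture"],
]
candidates = frequent_itemsets(sentences, min_support=0.4)
print(sorted(candidates.items()))
```

Item-sets of size two, such as ("picture", "quality"), are how compound-noun features would surface as candidates.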
43Yi & Niblack 2005: Sentiment mining in WebFountain
44Takamura et al. 2007: Extracting Semantic
Orientations of Phrases from Dictionary
- Use a Potts model to categorize Adj+Noun phrases
- Targets ambiguous adjectives like low, high,
small, large
- Connect two nouns if one appears in gloss of
other
- Nodes have orientation values (pos, neg, neu) and
are connected by same or different orientation
links
45Popescu & Etzioni 2005
- Report on a product review mining system that
extracts and labels opinion expressions and their
attributes
- They use the relaxation-labeling technique from
computer vision to perform unsupervised
classification satisfying local constraints
(which they call neighborhood features)
- The system tries to solve several classification
problems (e.g. opinion and target finding) at the
same time rather than separately.
46Blog analysis
- Analysis of sentiments on blog posts
- Chesley et al. (2006)
- Perform subjectivity and polarity classification
on blog posts
- Sentiment has been used for blog analysis
- Balog et al. (2006)
- Discover irregularities in temporal mood patterns
(fear, excitement, etc.) appearing in a large
corpus of blogs
- Kale et al. (2007)
- Use link polarity information to model trust and
influence in the blogosphere
- Blog sentiment has been used in applications
- Mishne and Glance (2006)
- Analyze blog sentiments about movies and
correlate them with the movies' sales
47Trends & Buzz
- Stock market
- Koppel & Shtrimberg (2004)
- Correlate positive/negative news stories about
publicly traded companies and the stock price
changes
- Market intelligence from message boards, forums,
blogs.
- Glance et al. (2005)
48Bethard et al. 2004: Automatic Extraction of
Opinion Propositions and their Holders
- Find verbs that express opinions in propositional
form, and their holders
- Still, Vista officials realize they're relatively
fortunate.
- Modify algorithms developed in earlier work on
semantic parsing to perform binary classification
(opinion or not)
- Use presence of subjectivity clues to identify
opinionated uses of verbs
49Choi et al. 2005: Identifying sources of opinions
with conditional random fields and extraction
patterns
- Treats source finding as a combined sequential
tagging and information extraction task
- IE patterns are high precision, lower recall
- Base CRF uses information about noun phrase
semantics, morphology, syntax
- IE patterns connect opinion words to sources
- Conditional Random Fields given IE features
perform better than CRFs alone
50Kim & Hovy 2006: Extracting opinions, opinion
holders, and topics expressed in online news
media text
- Perform semantic role labeling (FrameNet) for a
set of adjectives and verbs (pos, neg)
- Map semantic roles to holder and target
- E.g. for Desiring frame: Experiencer -> Holder
- Train on FN data, test on FN data and on news
sentences collected and annotated by authors'
associates
- Precision is higher for topics, recall for
holders
51Choi, Breck, Cardie 2006: Joint extraction of
entities and relations for opinion recognition
- Find direct expressions of opinions and their
sources jointly
- Uses sequence-tagging CRF classifiers for opinion
expressions, sources, and potential link
relations
- Integer linear programming combines local
knowledge and incorporates constraints
- Performance better even on the individual tasks
522007 NLP papers: NAACL
- N07-1037: Hiroya Takamura, Takashi Inui,
Manabu Okumura. Extracting Semantic Orientations
of Phrases from Dictionary
- N07-1038: Benjamin Snyder, Regina
Barzilay. Multiple Aspect Ranking Using the Good
Grief Algorithm
- N07-1039: Kenneth Bloom, Navendu Garg,
Shlomo Argamon. Extracting Appraisal Expressions
532007 NLP papers: ACL 1
- P07-1053: Anindya Ghose, Panagiotis
Ipeirotis, Arun Sundararajan. Opinion Mining using
Econometrics: A Case Study on Reputation Systems
- P07-1054: Andrea Esuli, Fabrizio
Sebastiani. PageRanking WordNet Synsets: An
Application to Opinion Mining
- P07-1055: Ryan McDonald, Kerry Hannan,
Tyler Neylon, Mike Wells, Jeff Reynar. Structured
Models for Fine-to-Coarse Sentiment Analysis
- P07-1056: John Blitzer, Mark Dredze,
Fernando Pereira. Biographies, Bollywood,
Boom-boxes and Blenders: Domain Adaptation for
Sentiment Classification
542007 NLP papers: ACL 2
- P07-1123: Rada Mihalcea, Carmen Banea,
Janyce Wiebe. Learning Multilingual Subjective
Language via Cross-Lingual Projections
- P07-1124: Ann Devitt, Khurshid
Ahmad. Sentiment Polarity Identification in
Financial News: A Cohesion-based Approach
- P07-1125: Ben Medlock, Ted Briscoe. Weakly
Supervised Learning for Hedge Classification in
Scientific Literature
552007 NLP papers: EMNLP
- D07-1113: Soo-Min Kim, Eduard Hovy. Crystal:
Analyzing Predictive Opinions on the Web
- D07-1114: Nozomi Kobayashi, Kentaro Inui,
Yuji Matsumoto. Extracting Aspect-Evaluation and
Aspect-Of Relations in Opinion Mining
- D07-1115: Nobuhiro Kaji, Masaru
Kitsuregawa. Building Lexicon for Sentiment
Analysis from Massive Collection of HTML
Documents
56Bibliographies
- Bibliography of papers in this tutorial
- www.cs.pitt.edu/wiebe/eurolan07.bib
- www.cs.pitt.edu/wiebe/eurolan07.html
- Andrea Esuli's extensive "Sentiment
Classification" bibliography (not limited to
sentiment or classification)
- http://liinwww.ira.uka.de/bibliography/Misc/Sentiment.html
57Yahoo! Group
- SentimentAI
- http://tech.groups.yahoo.com/group/SentimentAI/