Title: Document-level Semantic Orientation and Argumentation
Document-level Semantic Orientation and Argumentation
- Presented by Marta Tatu
- CS7301
- March 15, 2005
Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
- Peter D. Turney
- ACL-2002
Overview
- Unsupervised learning algorithm for classifying reviews as recommended or not recommended
- The classification is based on the semantic orientation of the phrases in the review that contain adjectives and adverbs
Algorithm
- Input: review
- Identify phrases that contain adjectives or adverbs by using a part-of-speech tagger
- Estimate the semantic orientation of each phrase
- Assign a class to the given review based on the average semantic orientation of its phrases
- Output: classification (recommended or not recommended)
Step 1
- Apply Brill's part-of-speech tagger to the review
- Adjectives are good indicators of subjective sentences, but in isolation they are ambiguous: unpredictable steering (negative) vs. unpredictable plot (positive)
- Extract two consecutive words where one is an adjective or adverb and the other provides the context (see the sketch after the pattern table below)
First Word Second Word Third Word (not extracted)
1. JJ NN or NNS Anything
2. RB, RBR, or RBS JJ Not NN nor NNS
3. JJ JJ Not NN nor NNS
4. NN or NNS JJ Not NN nor NNS
5. RB, RBR, or RBS VB, VBD, VBN, or VBG Anything
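A minimal sketch of this extraction step in Python, using NLTK's default tagger as a stand-in for Brill's tagger (an assumption; the original work used Brill's tagger directly):

```python
import nltk  # assumes the NLTK tokenizer and tagger models are installed

# The five extraction patterns from the table above:
# (first-word tags, second-word tags, third-word tags that block extraction)
PATTERNS = [
    ({"JJ"}, {"NN", "NNS"}, set()),
    ({"RB", "RBR", "RBS"}, {"JJ"}, {"NN", "NNS"}),
    ({"JJ"}, {"JJ"}, {"NN", "NNS"}),
    ({"NN", "NNS"}, {"JJ"}, {"NN", "NNS"}),
    ({"RB", "RBR", "RBS"}, {"VB", "VBD", "VBN", "VBG"}, set()),
]

def extract_phrases(review):
    """Extract two-word phrases whose POS tags match one of the patterns."""
    tagged = nltk.pos_tag(nltk.word_tokenize(review))
    phrases = []
    for i in range(len(tagged) - 1):
        (w1, t1), (w2, t2) = tagged[i], tagged[i + 1]
        t3 = tagged[i + 2][1] if i + 2 < len(tagged) else ""
        for first, second, blocked_third in PATTERNS:
            if t1 in first and t2 in second and t3 not in blocked_third:
                phrases.append(f"{w1} {w2}")
                break
    return phrases

print(extract_phrases("The bank offers low fees and direct deposit."))
```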
Step 2
- Estimate the semantic orientation of the extracted phrases using PMI-IR (Turney, 2001)
- Pointwise Mutual Information (Church and Hanks, 1989): PMI(word1, word2) = log2 [ p(word1 ∧ word2) / (p(word1) · p(word2)) ]
- Semantic Orientation: SO(phrase) = PMI(phrase, "excellent") − PMI(phrase, "poor")
- PMI-IR estimates PMI by issuing queries to a search engine (AltaVista, 350 million pages), which simplifies to
SO(phrase) = log2 [ hits(phrase NEAR "excellent") · hits("poor") / (hits(phrase NEAR "poor") · hits("excellent")) ]
Step 2 continued
- Added 0.01 to the hits to avoid division by zero
- If hits(phrase NEAR excellent) and hits(phrase NEAR poor) are both ≤ 4, then eliminate the phrase
- Added AND (NOT host:epinions) to the queries so as not to include pages from the Epinions website itself
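A minimal sketch of this estimation step, assuming a hypothetical hits(query) function that returns a page count (AltaVista's NEAR operator no longer exists, so the query strings are illustrative):

```python
import math

def semantic_orientation(phrase, hits):
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor"),
    estimated from search-engine hit counts; hits(query) is assumed."""
    h_exc = hits(f'{phrase} NEAR excellent AND (NOT host:epinions)')
    h_poor = hits(f'{phrase} NEAR poor AND (NOT host:epinions)')
    if h_exc <= 4 and h_poor <= 4:
        return None  # eliminate phrases with too few co-occurrence hits
    # 0.01 is added to the hits to avoid division by zero
    return math.log2(((h_exc + 0.01) * hits('poor')) /
                     ((h_poor + 0.01) * hits('excellent')))
```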
Step 3
- Calculate the average semantic orientation of the phrases in the given review (example below)
- If the average is positive, then recommended (thumbs up)
- If the average is negative, then not recommended (thumbs down)
Phrase POS tags SO
direct deposit JJ NN 1.288
local branch JJ NN 0.421
small part JJ NN 0.053
online service JJ NN 2.780
well other RB JJ 0.237
low fees JJ NNS 0.333
true service JJ NN -0.732
other bank JJ NN -0.850
inconveniently located RB VBN -1.541
Average Semantic Orientation 0.322
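Putting the three steps together, a sketch of the final classification, reusing extract_phrases and semantic_orientation from the earlier sketches:

```python
def classify_review(review, hits):
    """Average the SO of extracted phrases; positive average -> recommended."""
    sos = [so for phrase in extract_phrases(review)
           if (so := semantic_orientation(phrase, hits)) is not None]
    average = sum(sos) / len(sos) if sos else 0.0
    return "recommended" if average > 0 else "not recommended"
```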
Experiments
- 410 reviews from Epinions
- 170 (41%) not recommended
- 240 (59%) recommended
- Average phrases per review: 26
- Baseline accuracy: 59% (always guessing the majority class)
Domain Accuracy (%) Correlation
Automobiles 84.00 0.4618
Banks 80.00 0.6167
Movies 65.83 0.3608
Travel Destinations 70.53 0.4155
All 74.39 0.5174
Discussion
- What makes movies hard to classify?
- The average SO tends to classify a recommended movie as not recommended
- Evil characters make good movies
- The whole is not necessarily the sum of its parts
- Good beaches do not necessarily add up to a good vacation
- But good automobile parts usually add up to a good automobile
Applications
- Summary statistics for search engines
- Summarization of reviews
- Pick out the sentence with the highest positive/negative semantic orientation given a positive/negative review
- Filtering flames for newsgroups
- When the semantic orientation drops below a threshold, the message might be a potential flame
Questions ?
- Comments ?
- Observations ?
Thumbs Up? Sentiment Classification using Machine Learning Techniques
- Bo Pang, Lillian Lee and Shivakumar Vaithyanathan
- EMNLP-2002
Overview
- Consider the problem of classifying documents by overall sentiment
- Three machine learning methods, compared against baselines built from human-generated lists of words:
- Naïve Bayes
- Maximum Entropy
- Support Vector Machines
Experimental Data
- Movie-review domain
- Source: Internet Movie Database (IMDb)
- Star or numerical ratings converted into positive, negative, or neutral, so there is no need to hand-label the data for training or testing
- Maximum of 20 reviews per author per sentiment category
- 752 negative reviews
- 1,301 positive reviews
- 144 reviewers
List of Words Baseline
- Maybe there are certain words that people tend to use to express strong sentiments
- Classification is done by counting the number of positive and negative words in the document
- Random-choice baseline: 50%
Machine Learning Methods
- Bag-of-features framework
- f1, ..., fm: predefined set of m features
- ni(d): number of times fi occurs in document d
- Naïve Bayes: PNB(c | d) = P(c) · ∏i P(fi | c)^ni(d) / P(d)
Machine Learning Methods continued
- Maximum Entropy: PME(c | d) = exp( Σi λi,c Fi,c(d, c) ) / Z(d)
- where Fi,c is a feature/class function, the λi,c are learned weights, and Z(d) is a normalizing factor
- Support vector machines: find the hyperplane w that maximizes the margin between the classes. The solution of the constrained optimization problem is w = Σj αj cj dj, with αj ≥ 0
- cj ∈ {1, −1} is the correct class of document dj (a scikit-learn sketch follows)
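A minimal sketch of the bag-of-features setup using scikit-learn as a stand-in (an assumption: the paper used its own implementations, and load_reviews below is a hypothetical loader; LogisticRegression stands in for Maximum Entropy):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression  # stand-in for MaxEnt
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

docs, labels = load_reviews()  # hypothetical loader: texts plus pos/neg labels

# Binary presence of unigrams occurring at least 4 times, as in the paper
X = CountVectorizer(binary=True, min_df=4).fit_transform(docs)

for clf in (MultinomialNB(), LogisticRegression(max_iter=1000), LinearSVC()):
    scores = cross_val_score(clf, X, labels, cv=3)  # 3 folds, as in the paper
    print(type(clf).__name__, round(scores.mean(), 3))
```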
Evaluation
- 700 positive-sentiment and 700 negative-sentiment documents
- 3 equal-sized folds
- The tag NOT_ was added to every word between a negation word (not, isn't, didn't) and the first punctuation mark, so good and the NOT_good of not very good become distinct features (see the sketch after this list)
- Features
- 16,165 unigrams appearing at least 4 times in the 1,400-document corpus
- 16,165 most frequently occurring bigrams in the same data
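A sketch of the NOT_ tagging in Python (the negation list and tokenizer here are illustrative; the slide names only not, isn't, didn't):

```python
import re

NEGATIONS = {"not", "isn't", "didn't", "no", "never"}  # illustrative list

def add_not_tags(text):
    """Prefix NOT_ to every token between a negation word and the
    first following punctuation mark."""
    tokens = re.findall(r"[\w']+|[.,!?;]", text.lower())
    out, negating = [], False
    for tok in tokens:
        if tok in ".,!?;":
            negating = False
            out.append(tok)
        elif negating:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
            if tok in NEGATIONS:
                negating = True
    return " ".join(out)

print(add_not_tags("This movie is not very good, but the cast shines."))
# -> this movie is not NOT_very NOT_good , but the cast shines .
```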
Results
- POS information was added to differentiate between I love this movie and This is a love story
Conclusion
- Results produced by the machine learning techniques are better than the human-generated baselines
- SVMs tend to do the best
- Unigram presence information is the most effective feature set
- Frequency vs. presence: feature presence works better than feature frequency
- Thwarted expectations: many words in a review can indicate the sentiment opposite to that of the entire review
- Some form of discourse analysis is necessary
Questions ?
- Comments ?
- Observations ?
Summarizing Scientific Articles: Experiments with Relevance and Rhetorical Status
- Simone Teufel and Marc Moens
- CL-2002
Overview
- Summarization of scientific articles: restore the discourse context of extracted material by adding the rhetorical status of each sentence in the document
- Gold-standard data for summaries, consisting of computational linguistics articles annotated with the rhetorical status and relevance of each sentence
- Supervised learning algorithm that classifies sentences into 7 rhetorical categories
Why?
- Knowledge about the rhetorical status of a sentence enables tailoring summaries to the user's expertise and task
- Nonexpert summary: background information and the general purpose of the paper
- Expert summary: no background; instead, the differences between this approach and similar ones
- Contrasts or complementarity among articles can be expressed
Rhetorical Status
- Generalizations about the nature of scientific texts provide information that enables the construction of better summaries
- Problem structure: problems (research goals), solutions (methods), and results
- Intellectual attribution: what the new contribution is, as opposed to previous work and background (generally accepted statements)
- Scientific argumentation
- Attitude toward other people's work: a rival approach, a prior approach with a fault, or an approach contributing parts of the authors' own solution
Metadiscourse and Agentivity
- Metadiscourse is an aspect of scientific argumentation and a way of expressing attitude toward previous work: "we argue that ...", "in contrast to common belief, we ..."
- Agent roles in argumentation: rivals, contributors of part of the solution (they), the entire research community, or the authors of the paper (we)
Citations and Relatedness
- Just knowing that an article cites another is often not enough
- One needs to read the context of the citation to understand the relation between the articles
- Article cited negatively or contrastively
- Article cited positively, or one where the citing authors state that their own work originates from the cited work
Rhetorical Annotation Scheme
- Only one category assigned to each full sentence
- Nonoverlapping, nonhierarchical scheme
- The rhetorical status is determined on the basis
of the global context of the paper
Relevance
- Select important content from the text
- Highly subjective; low human agreement
- A sentence is considered relevant if it describes the research goal or states a difference from a rival approach
- Other definitions: a sentence is relevant if it shows a high level of similarity with a sentence in the abstract
Corpus
- 80 conference articles
- Association for Computational Linguistics (ACL)
- European Chapter of the Association for Computational Linguistics (EACL)
- Applied Natural Language Processing (ANLP)
- International Joint Conference on Artificial Intelligence (IJCAI)
- International Conference on Computational Linguistics (COLING)
- XML markup added
The Gold Standard
- 3 task-trained annotators
- 17 pages of guidelines
- 20 hours of training
- No communication between the annotators
- Evaluation measures of the annotation:
- Stability
- Reproducibility
Results of Annotation
- Kappa coefficient K (Siegel and Castellan, 1988): K = (P(A) − P(E)) / (1 − P(E))
- where P(A) is the observed pairwise agreement and P(E) the agreement expected by chance
- Stability: K = .82, .81, .76 (N = 1,220 and k = 2)
- Reproducibility: K = .71
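A small sketch of the kappa computation for two annotators, following the Siegel and Castellan formulation (illustrative, not the authors' tooling; the category labels are examples):

```python
from collections import Counter

def kappa(labels_a, labels_b):
    """K = (P(A) - P(E)) / (1 - P(E)) for two annotators."""
    n = len(labels_a)
    p_agree = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # chance agreement P(E) from the pooled category distribution
    pooled = Counter(labels_a) + Counter(labels_b)
    p_chance = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (p_agree - p_chance) / (1 - p_chance)

print(kappa(["AIM", "OWN", "OTHER", "OWN"], ["AIM", "OWN", "OWN", "OWN"]))
```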
The System
- Supervised machine learning: Naïve Bayes
Features
- Absolute location of a sentence
- Limitations of the authors' own method can be expected to be found toward the end, while limitations of other researchers' work are discussed in the introduction
Features continued
- Section structure: relative and absolute position of the sentence within its section
- First, last, second or third, second-last or third-last, or somewhere in the first, second, or last third of the section
- Paragraph structure: relative position of the sentence within its paragraph
- Initial, medial, or final
Features continued
- Headlines: type of headline of the current section
- Introduction, Implementation, Example, Conclusion, Result, Evaluation, Solution, Experiment, Discussion, Method, Problems, Related Work, Data, Further Work, Problem Statement, or Non-Prototypical
- Sentence length
- Longer or shorter than 12 words (threshold)
Features continued
- Title word contents: does the sentence contain words that also occur in the title?
- TF*IDF word contents
- High values for words that occur frequently in one document but rarely in the overall collection of documents
- Do any of the 18 highest-scoring TF*IDF words occur in the sentence? (see the sketch after this list)
- Verb syntax: voice, tense, and modal linguistic features
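A sketch of three of these surface features in Python (the exact TF*IDF weighting and the data layout are assumptions; the paper does not fix an implementation here):

```python
import math
from collections import Counter

def surface_features(sentence, title, docs, doc_index):
    """Title-word, length, and TF*IDF features for one sentence.
    `docs` is a list of documents, each a list of lowercased words."""
    words = set(sentence.lower().split())
    doc = docs[doc_index]

    # TF*IDF over the collection; keep the document's 18 top-scoring words
    tf = Counter(doc)
    idf = {w: math.log(len(docs) / sum(w in d for d in docs)) for w in tf}
    top18 = {w for w, _ in sorted(tf.items(), key=lambda kv: kv[1] * idf[kv[0]],
                                  reverse=True)[:18]}
    return {
        "contains_title_word": bool(words & set(title.lower().split())),
        "long_sentence": len(sentence.split()) > 12,  # threshold from the slide
        "contains_top_tfidf_word": bool(words & top18),
    }
```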
Features continued
- Citation
- Citation (self), citation (other), author name, or none; plus the location of the citation in the sentence (beginning, middle, or end)
- History: the most probable previous category
- AIM tends to follow CONTRAST
- Calculated as a second pass during training
Features continued
- Formulaic expressions: a list of phrases described by regular expressions, divided into 18 classes and comprising a total of 644 patterns (an illustrative sketch follows)
- Clustering the patterns into classes prevents data sparseness
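An illustrative sketch of such regular-expression classes (the class names and patterns below are invented examples, not the authors' actual 644 patterns):

```python
import re

# Invented examples of two formulaic-expression classes
FORMULAIC_CLASSES = {
    "GAP": [
        re.compile(r"to (our|my) knowledge", re.I),
        re.compile(r"has not been (addressed|studied)", re.I),
    ],
    "CONTINUATION": [
        re.compile(r"following the (approach|method) of", re.I),
        re.compile(r"we have (already )?(shown|argued)", re.I),
    ],
}

def formulaic_class(sentence):
    """Return the first formulaic-expression class that matches, if any."""
    for cls, patterns in FORMULAIC_CLASSES.items():
        if any(p.search(sentence) for p in patterns):
            return cls
    return None

print(formulaic_class("To our knowledge, this problem has not been addressed."))
```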
Features continued
- Agent: 13 types, 167 patterns
- The placeholder WORK_NOUN can be replaced by any of a set of 37 nouns, including theory, method, prototype, and algorithm
- Agent classes whose distribution was very similar to the overall distribution of the target categories were excluded
Features continued
- Action: 365 verbs clustered into 20 classes based on semantic concepts such as similarity and contrast
- PRESENTATION_ACTIONs: present, report, state
- RESEARCH_ACTIONs: analyze, conduct, define, and observe
- Negation is taken into account
System Evaluation
Feature Impact
- The most distinctive single feature is Location,
followed by SegAgent, Citations, Headlines, Agent
and Formulaic
Questions ?
- Comments ?
- Observations ?
Thank You !