1
Sentiment Classification using Word Sub-Sequences and Dependency Sub-Trees
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), May 18th-20th, 2005
Shotaro Matsumoto, Hiroya Takamura and Manabu Okumura
Tokyo Institute of Technology
2
Table of Contents
  • 1. Motivation
  • 2. Our Approach
  • 3. Experiments
  • 4. Results and Discussion
  • 5. Conclusion and Future Work

3
Table of Contents
  • 1. Motivation
  • Background
  • Document Sentiment Classification
  • Early Studies
  • Issue
  • Objective
  • 2. Our Approach
  • 3. Experiments
  • 4. Results and Discussion
  • 5. Conclusion and Future Work

4
Background
  • Online grass-roots reviews are increasing rapidly
  • They contain useful reputation information
  • There are too many such documents for anyone to read them all
  • Mining reputation from these documents is therefore important

5
Document sentiment classification
  • The task of classifying a whole document according to the positive or negative polarity of its opinion (desirable or undesirable)

6
Two steps for the classification
  • 1. Feature extraction
  • Convert a document into a feature vector that preserves the features of the original document
  • 2. Binary classification
  • Classify the feature vector as having positive or negative sentiment polarity

7
Early Studies
  • [Pang 02]
  • Features: unigrams in the document
  • Classifiers: Naïve Bayes, maximum entropy model, Support Vector Machines (SVMs)
  • Showed that SVMs are superior to the others
  • [Pang 04]
  • Features: unigrams obtained from the summary
  • Classifier: SVMs
  • [Mullen 04]
  • Features: unigrams, unigrams of lemmatized words, prior knowledge from the Internet and a thesaurus
  • Classifier: SVMs
  • Obtained better results than [Pang 02]

8
Issue
  • Features in early studies
  • A document is represented as a bag of words, i.e., a text is regarded as a set of words
  • → Word order and syntactic relations between words in a sentence, which are intuitively important for the classification, are discarded

9
Objective
  • We propose a method for extracting word order and
    syntactic relations as features.
  • We use frequent sub-patterns in sentences as
    these features.

10
Table of Contents
  • 1. Motivation
  • 2. Our Approach
  • Overview
  • Word Sub-Sequence
  • Dependency Sub-Tree
  • Frequent Sub-pattern
  • 3. Experiments
  • 4. Results and Discussion
  • 5. Conclusion and Future Work

11
Overview
  • We use a word sequence and a dependency tree as
    structured representations of a sentence
  • We extract frequent sub-patterns from sentences
    as features for the classification

12
Word Sub-Sequence
  • A word sequence S
  • Simply the sequence of words that makes up a sentence
  • Preserves word order in the sentence
  • A word sub-sequence S' of a word sequence S
  • Obtained by removing zero or more words from the original sequence
  • Preserves the word order of the original sentence (see the sketch below)

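To make the sub-sequence relation concrete, here is a minimal sketch (ours, not from the original slides): a pattern is a word sub-sequence of a sentence exactly when its words can be matched left to right in the sentence.

```python
def is_subsequence(pattern, sentence):
    """True iff `pattern` is obtained from `sentence` by
    deleting zero or more words (word order preserved)."""
    remaining = iter(sentence)
    return all(word in remaining for word in pattern)

# "film good" is a sub-sequence of "this film is very good"
print(is_subsequence("film good".split(),
                     "this film is very good".split()))  # True
```
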
13
Dependency Sub-Tree
  • A dependency tree D
  • Expresses the dependencies between words in the sentence as child-parent relationships between nodes
  • Preserves syntactic relations between words in the sentence
  • A dependency sub-tree D' of a dependency tree D
  • Obtained by removing zero or more nodes from the original tree
  • Preserves syntactic relations between words in the original sentence (see the sketch below)

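As a hedged illustration of what a dependency tree looks like, the sketch below uses spaCy (our tooling choice, not the parser used in the original work) to print each word's head, which defines the child-parent edges:

```python
import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("This film is very good.")
# Each token's head defines a child-parent edge of the tree.
for token in doc:
    print(f"{token.text:6} --{token.dep_}--> {token.head.text}")
```
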
14
Frequent Sub-Pattern
  • The number of all sub-patterns (sub-sequences or sub-trees) is too large → use only frequent sub-patterns
  • Definition
  • A sentence contains a pattern if and only if the pattern is a sub-sequence or a sub-tree of the sentence
  • The support of a pattern is the number of sentences in a dataset that contain the pattern
  • If the support of a pattern is at least a given support threshold, the pattern is frequent (in our experiments, the support threshold was fixed at 10; see the sketch below)
  • As implementations for mining frequent sub-patterns, we use Kudo's PrefixSpan and FREQT

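The support definition can be restated in a few lines of code. This is only an illustration of the definition, not the mining itself (which the slides attribute to PrefixSpan and FREQT); it reuses the `is_subsequence` helper sketched earlier:

```python
SUPPORT_THRESHOLD = 10  # the value fixed in these experiments

def support(pattern, sentences):
    """Number of sentences in the dataset containing `pattern`."""
    return sum(is_subsequence(pattern, s) for s in sentences)

def is_frequent(pattern, sentences, threshold=SUPPORT_THRESHOLD):
    return support(pattern, sentences) >= threshold
```
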
15
Table of Contents
  • 1. Motivation
  • 2. Our Approach
  • 3. Experiments
  • Movie review dataset
  • Features
  • Classifiers and Tests
  • 4. Results and Discussion
  • 5. Conclusion and Future Work

16
Movie review dataset
  • Dataset 1: used in [Pang 02] and [Mullen 04]
  • 690 positive reviews and 690 negative reviews
  • Written in English
  • 3-fold cross-validation
  • Dataset 2: used in [Pang 04]
  • 1000 positive reviews and 1000 negative reviews
  • Written in English
  • 10-fold cross-validation

17
Features
  • We employ the following features and their combinations for the classification (see the sketch after this list)
  • Bag-of-words features
  • Unigrams (e.g. good, film): uni
  • Unigram patterns that appear in at least 2 distinct sentences
  • Bigrams (e.g. very good, film is): bi
  • Bigram patterns that appear in at least 2 distinct sentences
  • Frequent sub-pattern features
  • Word sub-sequences: seq
  • Dependency sub-trees: dep
  • Features of lemmatized words
  • As in the extraction of the features uni, bi, seq, and dep, we also extract uni_l, bi_l, seq_l, and dep_l from lemmatized text

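A minimal sketch (the function name and representation are ours) of how pattern presence becomes a binary feature vector for one document:

```python
def vectorize(document, vocabulary):
    """Binary feature vector for one document.

    document   -- list of sentences, each a list of words
    vocabulary -- list of frequent patterns (each a word list)
    """
    return [int(any(is_subsequence(p, s) for s in document))
            for p in vocabulary]
```
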
18
Classifiers and Tests (1/2)
  • Classifier
  • Method: SVMs, a binary classifier based on supervised learning
  • Kernel function: linear kernel
  • Performance depends closely on the learning parameter C (called the soft-margin parameter) → we carry out three kinds of experiments

19
Classifiers and Tests (2/2)
  • Test 1: fix C = 1
  • The result is used for comparison with the early studies
  • Test 2: best accuracy with C ∈ {e^-2.0, e^-1.5, ..., e^2.0} (see the grid-search sketch below)
  • Observes the potential performance of the features
  • The result is used for finding the most effective combination with bag-of-words features
  • Test 3: predict a proper value of C from the training data
  • Observes the practical performance of the features

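A hedged sketch of the Test 2 grid search using scikit-learn (our tooling choice, not the SVM implementation used in the paper):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# The C grid from the slide: e^-2.0, e^-1.5, ..., e^2.0
C_grid = np.exp(np.arange(-2.0, 2.01, 0.5))

# X: binary pattern-presence vectors, y: +1/-1 polarity labels
search = GridSearchCV(LinearSVC(), {"C": C_grid}, cv=3,
                      scoring="accuracy")
# search.fit(X, y); the best C is in search.best_params_["C"]
```
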
20
Table of Contents
  • 1. Motivation
  • 2. Our Approach
  • 3. Experiments
  • 4. Results and Discussion
  • Results
  • Discussion
  • 5. Conclusion and Future Work

21
Results (1/2)
  • Results for dataset 1
  • vs. [Pang 02]: 82.9% → 87.3% (error reduction: 26%)
  • vs. [Mullen 04]: 84.6% → 87.3% (error reduction: 18%)

22
Results (2/2)
  • Results for dataset 2
  • vs. [Pang 04]: 87.1% → 92.9% (error reduction: 45%)

23
Discussion
  • From the results of Test 1, our method proved to be effective
  • Accuracy by features:
  • bow + dep ≈ bow + dep + seq (93%) >> bow + seq (89%) > bow (87%)
  • Lemmatized features are not always more effective than the original ones

24
Table of Contents
  • 1. Motivation
  • 2. Our Approach
  • 3. Experiments
  • 4. Results and Discussion
  • 5. Conclusion and Future Work
  • Conclusion
  • Future Work

25
Conclusion
  • We proposed a method for incorporating word order and syntactic relations between words in a sentence into document sentiment classification, using frequent word sub-sequences and dependency sub-trees as features.
  • Experimental results on movie review datasets show that our classifiers achieved the best results yet published on these datasets.

26
Future Work (1/2)
  • Negative/interrogative sentences
  • Affirmative sentence: "This film is good." (1)
  • Negative sentence: "This film is not good." (2)
  • Interrogative sentence: "Is this film good?" (3)
  • All sub-patterns in sentence (1) are also contained in sentence (2) (see the sketch below).
  • Similarly, there is a large overlap of patterns between (1) and (3).
  • Distinguishing these sentence types would solve this problem.

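The overlap is easy to see with the `is_subsequence` sketch from earlier: every sub-pattern of the affirmative sentence also matches the negated one, so the polarity distinction is lost.

```python
affirmative = "this film is good".split()
negative = "this film is not good".split()

pattern = ["film", "good"]
print(is_subsequence(pattern, affirmative))  # True
print(is_subsequence(pattern, negative))     # True -- polarity lost
```
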
27
Future Work (2/2)
  • Incorporating discourse structure in a document
  • Example (positive movie review):
  • "The scenario is simplistic. But I love this film."
  • From the word "but", we would know that "I love this film" is a more important sentence than "The scenario is simplistic" for sentiment classification.

28
Thank you.
29
Examples of Weighted Patterns
  • A positive (+) weight indicates positive sentiment polarity
  • A negative (-) weight indicates negative sentiment polarity
  • The absolute value of each weight indicates how large the feature's contribution is (see the sketch below)

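A minimal sketch (assuming the scikit-learn classifier above; the helper name is ours) of reading signed pattern weights off a trained linear SVM:

```python
def extreme_patterns(clf, vocabulary, k=5):
    """Patterns with the most negative / most positive weights.

    clf        -- a fitted LinearSVC; clf.coef_[0] holds one
                  signed weight per pattern feature
    vocabulary -- the pattern list used to build the vectors
    """
    ranked = sorted(zip(clf.coef_[0], vocabulary))
    return ranked[:k], ranked[-k:]
```
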
30
A Word Sequence = A Clause
  • Sentences are too long to be used for mining frequent sub-sequences
  • Instead of whole sentences, we used the clauses of sentences as word sequences
  • We split a sentence into a main clause and subordinate clauses using information from the parse tree
  • In addition, we removed stopwords (see the sketch below)
  • Conjunctions, prepositions, numbers, etc.

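A hedged sketch of the stopword filtering step. The part-of-speech categories come from the slide; mapping them onto spaCy's tag set is our assumption:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Stopword categories named on the slide:
# conjunctions, prepositions, numbers
STOP_POS = {"CCONJ", "SCONJ", "ADP", "NUM"}

def content_words(clause_text):
    """Drop stopword categories; keep remaining words in order."""
    return [t.text for t in nlp(clause_text) if t.pos_ not in STOP_POS]
```
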
31
References
  • [Pang 02] B. Pang, L. Lee and S. Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP 2002.
  • [Pang 04] B. Pang and L. Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL 2004.
  • [Mullen 04] T. Mullen and N. Collier. Sentiment Analysis using Support Vector Machines with Diverse Information Sources. EMNLP 2004.