1
Naïve Bayes Classifier for Text Classification
  • ZhuoRan Chen
  • 2006-4-3

2
Table of Contents
  • Background
  • Methods
  • Evaluations
  • Conclusion

3
Naïve Bayes Assumption
  • The assumption: features are independent.
  • P(Fi | C, Fj) = P(Fi | C), where Fi, Fj stand
    for any two features and C is the class variable.
  • Theoretically oversimplified, but practically
    works well
  • Parameter estimation is easy: maximum likelihood

4
The Framework of Naïve Bayes Classifiers
  • P(Ci | F1, F2, ..., Fn) P(F1, F2, ..., Fn)
    = P(Ci) P(F1, F2, ..., Fn | Ci)
  • P(Ci | F1, F2, ..., Fn)
    = P(Ci) P(F1, F2, ..., Fn | Ci) × 1/P(F1, F2, ..., Fn)
  • P(Ci | F1, F2, ..., Fn)
    = P(Ci) P(F1, F2, ..., Fn | Ci) × lambda,
    where lambda = 1/P(F1, F2, ..., Fn)
  • P(Ci | F1, F2, ..., Fn)
    = lambda P(Ci) P(F1 | Ci) P(F2, ..., Fn | Ci, F1)
  • P(Ci | F1, F2, ..., Fn)
    ∝ P(Ci) P(F1 | Ci) P(F2, ..., Fn | Ci)
    (by the independence assumption)
  • P(Ci | F1, F2, ..., Fn)
    ∝ P(Ci) P(F1 | Ci) P(F2 | Ci) P(F3 | Ci) ... P(Fn | Ci)
  • P(Ci): the prior
  • E.g. in a test dataset of 1000 apples, 200
    originated from Japan, 300 from the US, 500 from
    China → P(Cjapan) = 0.2, P(Cus) = 0.3,
    P(Cchina) = 0.5
  • P(Fi | Ci): the class-conditional probability
    distribution (CPD)
  • P(Fi | Ci) = P(Fi, Ci) / P(Ci)
  • E.g. F1 ∈ {big, small}, F2 ∈ {red, yellow, green};
    out of the 200 Japan apples, 80 are big and 120
    are small; among the 300 US apples, 200 are red,
    50 are yellow, etc.
  • → P(F1=big | Cjapan) = (80/1000) / (200/1000) = 0.4,
    P(F1=small | Cjapan) = 0.6
  • P(F2=red | Cus) = 200/300 = 0.67,
    P(F2=yellow | Cus) = 50/300 = 0.17
    (see the counting sketch below)
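Both quantities above are plain relative frequencies over training counts. Below is a minimal Python sketch (not part of the original slides) that reproduces these estimates from the apple counts stated here; the variable names are illustrative.

# Estimate the prior P(Ci) and the CPD P(Fi | Ci) by relative frequency,
# using only the counts given on this slide.
from collections import Counter

class_counts = Counter({"japan": 200, "us": 300, "china": 500})  # 1000 apples
total = sum(class_counts.values())

# Prior: P(Ci) = count(Ci) / N
prior = {c: n / total for c, n in class_counts.items()}
# -> {'japan': 0.2, 'us': 0.3, 'china': 0.5}

# Joint counts stated on the slide (the remaining ones are omitted there).
feature_class_counts = {
    ("size=big", "japan"): 80,
    ("size=small", "japan"): 120,
    ("color=red", "us"): 200,
    ("color=yellow", "us"): 50,
}

# CPD: P(Fi | Ci) = P(Fi, Ci) / P(Ci) = count(Fi, Ci) / count(Ci)
cpd = {(f, c): n / class_counts[c] for (f, c), n in feature_class_counts.items()}

print(prior["japan"], cpd[("size=big", "japan")])   # 0.2 0.4
print(round(cpd[("color=red", "us")], 2))           # 0.67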

5
The Framework of Naïve Bayes Classifiers (cont)
  • The decision rule: 'maximum a posteriori'
    probability (MAP)
  • arg max_i P(Ci | F1, F2, ..., Fn)
  • E.g. given a big red apple, which country did it
    originate from? (a code sketch of this decision
    follows below)
  • P(Cjapan | F1=big, F2=red)
    ∝ P(Cjapan) P(F1=big | Cjapan) P(F2=red | Cjapan)
    = 0.2 × 0.4 × ... = 0.04
  • P(Cus | F1=big, F2=red)
    ∝ P(Cus) P(F1=big | Cus) P(F2=red | Cus)
    = 0.3 × ... × 0.67 = 0.12
  • P(Cchina | F1=big, F2=red) ∝ 0.08
  • → It most likely originated from the US.
  • Note:
  • Fi could be a continuous variable (e.g. degree of
    sweetness)
  • P(Fi | Ci) can be any distribution.
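A minimal Python sketch of this MAP decision (not from the original slides). The class-conditional values marked ASSUMED are not stated in the presentation; they are placeholders chosen only so that the example reproduces the scores 0.04, 0.12, and 0.08 quoted above.

# MAP decision rule: pick argmax_i P(Ci) * prod_j P(Fj | Ci)
prior = {"japan": 0.2, "us": 0.3, "china": 0.5}
cpd = {
    "japan": {"big": 0.4, "red": 0.5},    # big: from slide 4; red: ASSUMED
    "us":    {"big": 0.6, "red": 0.67},   # red: from slide 4; big: ASSUMED
    "china": {"big": 0.4, "red": 0.4},    # both ASSUMED
}

def map_class(features):
    """Score every class and return the maximum a posteriori class."""
    scores = {}
    for c in prior:
        score = prior[c]
        for f in features:
            score *= cpd[c][f]
        scores[c] = score
    return max(scores, key=scores.get), scores

best, scores = map_class(["big", "red"])
print(scores)   # {'japan': 0.04, 'us': ~0.12, 'china': 0.08}
print(best)     # 'us'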

6
The multi-variate Bernoulli model
  • Every feature is a binary variable: big/small,
    red/not-red, presence/absence
  • Does not capture the number of occurrences
  • The number of parameters to train is small
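A minimal sketch (assumed function and parameter names, not from the slides) of how a document would be scored under this Bernoulli event model: every vocabulary word contributes a presence or absence factor, and repeated occurrences add nothing.

import math

def bernoulli_log_score(doc_words, c, prior, p_word, vocabulary):
    """log P(c) + sum over the vocabulary of log P(word present/absent | c).

    p_word[c][w] is the (smoothed) probability that word w occurs at least
    once in a class-c document; vocabulary is a set of words. Working in
    log space avoids underflow."""
    present = set(doc_words) & vocabulary       # only presence/absence matters
    score = math.log(prior[c])
    for w in vocabulary:
        if w in present:
            score += math.log(p_word[c][w])
        else:
            score += math.log(1.0 - p_word[c][w])
    return score

One Bernoulli parameter per (word, class) pair plus the class priors is all this model needs, which is the sense in which the number of parameters to train stays small.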

7
The multinomial model
  • The frequency of each word matters
  • Bag-of-words assumption: the position of a word
    in the document is not utilized
  • P(Fi | Ci): a multinomial distribution
  • P(Ci): the same as in the Bernoulli model
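A matching sketch for the multinomial event model (again with assumed names): here the per-word count enters the score, so frequency matters, but the positions of words do not.

import math
from collections import Counter

def multinomial_log_score(doc_words, c, prior, p_word):
    """log P(c) + sum over word tokens of log P(word | c).

    p_word[c][w] is the (smoothed) probability of drawing word w at any
    position of a class-c document."""
    counts = Counter(doc_words)                 # word frequencies are used
    score = math.log(prior[c])
    for w, n in counts.items():
        if w in p_word[c]:                      # skip out-of-vocabulary words
            score += n * math.log(p_word[c][w])
    return score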

8
Feature Selection
  • Goal: reduce the vocabulary size
  • Method: mutual information (sketched below)
  • The optimal number of features depends on the
    dataset
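The slides do not spell out the selection procedure; a common reading (sketched below with assumed names, not from the presentation) is to score each vocabulary word by the mutual information between its presence and the class label, then keep the top-ranked words.

import math
from collections import Counter

def mutual_information(docs, labels, word):
    """I(W; C), where W = 1 if `word` occurs in a document and 0 otherwise.

    docs is a list of word collections (e.g. sets of tokens), labels the
    corresponding class labels."""
    n = len(docs)
    joint = Counter((word in d, c) for d, c in zip(docs, labels))
    p_w = Counter(word in d for d in docs)
    p_c = Counter(labels)
    mi = 0.0
    for (w, c), n_wc in joint.items():
        p_joint = n_wc / n
        mi += p_joint * math.log(p_joint / ((p_w[w] / n) * (p_c[c] / n)))
    return mi

def select_features(docs, labels, vocabulary, k):
    """Keep the k words with the highest mutual information."""
    ranked = sorted(vocabulary,
                    key=lambda w: mutual_information(docs, labels, w),
                    reverse=True)
    return set(ranked[:k])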

9
Empirical Evaluations
  • Task: text classification
  • Datasets: webpages, newsgroups, newswire articles
  • Criteria: recall/precision and the
    precision-recall breakeven point (sketched below)
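For reference, a small sketch (assumed names, not from the slides) of these criteria: precision and recall for one class, plus the precision-recall breakeven point found by sweeping a decision threshold over the classifier's scores.

def precision_recall(y_true, y_pred):
    """Precision and recall for one class, given boolean labels/predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def breakeven_point(y_true, scores):
    """Value on the precision-recall curve where precision and recall are
    (as nearly as possible) equal, found by sweeping the threshold."""
    best = None
    for threshold in sorted(set(scores)):
        y_pred = [s >= threshold for s in scores]
        p, r = precision_recall(y_true, y_pred)
        if best is None or abs(p - r) < abs(best[0] - best[1]):
            best = (p, r)
    return (best[0] + best[1]) / 2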

10
Results
  • Performance depends on the dataset and the
    number of features
  • In most cases, the multinomial model is better
    than the Bernoulli model
  • For the Bernoulli model, performance usually
    degrades as the number of features increases.
  • For some datasets, 100 features work best; for
    all datasets, more than 10,000 features won't
    help

11
Conclusions
  • Naïve Bayes Classifier works well
  • No single model or parameter setting is optimal
    for all situations
  • Feature selection can be critical

12
THE END
  • Discussion?