Data Mining Classification: Naïve Bayes Classifier

Slides: 27
Provided by: cha128
1
Data Mining Classification: Naïve Bayes Classifier
  • Lecture Notes for Chapter 4 & 5
  • Introduction to Data Mining
  • by
  • Tan, Steinbach, Kumar

2
Classification Definition
  • Given a collection of records (training set)
  • Each record contains a set of attributes, one of
    which is the class.
  • Find a model for the class attribute as a function
    of the values of the other attributes.
  • Goal: previously unseen records should be
    assigned a class as accurately as possible.
  • A test set is used to determine the accuracy of
    the model. Usually, the given data set is divided
    into training and test sets, with training set
    used to build the model and test set used to
    validate it.
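
The train/test procedure described above can be sketched roughly as follows. This is only an illustration: the records, the 30% test fraction, the helper names, and the majority-class "model" are all assumptions made for the example, not part of the lecture.

```python
import random
from collections import Counter


def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(model, test_set):
    """Fraction of test records whose predicted class equals the true class."""
    correct = sum(1 for attrs, label in test_set if model(attrs) == label)
    return correct / len(test_set)


# Hypothetical records: (attributes, class label).
records = [({"x": i}, "Yes" if i % 3 == 0 else "No") for i in range(10)]
train, test = train_test_split(records)

# Placeholder model (an assumption): always predict the majority training class.
majority = Counter(label for _, label in train).most_common(1)[0][0]


def model(attrs):
    return majority


acc = accuracy(model, test)
print(len(train), len(test), acc)
```

Any real classifier (decision tree, Naïve Bayes, and so on) would take the place of the placeholder model; the split and the accuracy computation stay the same.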

3
Illustrating Classification Task
4
Examples of Classification Task
  • Predicting tumor cells as benign or malignant
  • Classifying credit card transactions as
    legitimate or fraudulent
  • Classifying secondary structures of protein as
    alpha-helix, beta-sheet, or random coil
  • Categorizing news stories as finance, weather,
    entertainment, sports, etc.

5
Classification Techniques
  • Decision Tree based Methods
  • Rule-based Methods
  • Memory based reasoning
  • Neural Networks
  • Naïve Bayes and Bayesian Belief Networks
  • Support Vector Machines

6
Example of a Decision Tree
[Figure: training data and the decision tree model built from it. Splitting attributes, from the root down:]
  • Refund: Yes → NO; No → MarSt
  • MarSt: Married → NO; Single, Divorced → TaxInc
  • TaxInc: < 80K → NO; > 80K → YES
7
Another Example of Decision Tree
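[Figure: training data with columns marked categorical, categorical, continuous, and class, next to a second tree:]
  • MarSt: Married → NO; Single, Divorced → Refund
  • Refund: Yes → NO; No → TaxInc
  • TaxInc: < 80K → NO; > 80K → YES
There could be more than one tree that fits the same data!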
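
As a rough check of that remark, the two trees can be written as plain functions and compared. The branch-to-leaf mapping below is an assumption read off the flattened figure labels (Refund = Yes gives NO, Married gives NO, taxable income above 80K gives YES), and the handling of an income of exactly 80K is a guess, since the figure only shows "< 80K" and "> 80K".

```python
from itertools import product


def tree_refund_first(refund, marital_status, taxable_income_k):
    """Tree from slide 6: Refund at the root, then MarSt, then TaxInc."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    # Assumption: exactly 80K falls on the "No" side.
    return "Yes" if taxable_income_k > 80 else "No"


def tree_marital_first(refund, marital_status, taxable_income_k):
    """Tree from slide 7: MarSt at the root, then Refund, then TaxInc."""
    if marital_status == "Married":
        return "No"
    if refund == "Yes":
        return "No"
    return "Yes" if taxable_income_k > 80 else "No"


# Compare the two trees on a small grid of illustrative attribute values.
grid = product(["Yes", "No"], ["Single", "Married", "Divorced"], [60, 80, 95, 120])
disagreements = [r for r in grid if tree_refund_first(*r) != tree_marital_first(*r)]
print(disagreements)  # []
```

Under these assumptions both trees predict "Yes" only when Refund is No, the status is not Married, and income exceeds 80K, so they agree on every record even though their structure differs.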
8
Decision Tree Classification Task
Decision Tree
9
Apply Model to Test Data
Test Data
Start from the root of tree.
10
Apply Model to Test Data
Test Data
11
Apply Model to Test Data
Test Data
12
Apply Model to Test Data
Test Data
13
Apply Model to Test Data
Test Data
14
Apply Model to Test Data
Test Data
Assign Cheat to "No"
15
Bayes Classifier
  • A probabilistic framework for solving
    classification problems
  • Conditional probability:
    P(C | A) = P(A, C) / P(A)
    P(A | C) = P(A, C) / P(C)
  • Bayes theorem:
    P(C | A) = P(A | C) P(C) / P(A)

16
Example of Bayes Theorem
  • Given:
  • A doctor knows that meningitis causes stiff neck
    50% of the time
  • Prior probability of any patient having
    meningitis is 1/50,000
  • Prior probability of any patient having stiff
    neck is 1/20
  • If a patient has stiff neck, what's the
    probability he/she has meningitis?
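
The question can be answered by plugging the three given numbers into Bayes theorem, P(M | S) = P(S | M) P(M) / P(S). A minimal sketch, with variable names chosen only for this illustration:

```python
from fractions import Fraction

p_stiff_given_meningitis = Fraction(1, 2)  # meningitis causes stiff neck 50% of the time
p_meningitis = Fraction(1, 50000)          # prior probability of meningitis
p_stiff = Fraction(1, 20)                  # prior probability of stiff neck

# Bayes theorem: P(M | S) = P(S | M) * P(M) / P(S)
p_meningitis_given_stiff = p_stiff_given_meningitis * p_meningitis / p_stiff

print(p_meningitis_given_stiff, float(p_meningitis_given_stiff))  # 1/5000 0.0002
```

Even though stiff neck is common among meningitis patients, the very small prior keeps the posterior at 0.0002.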

17
Bayesian Classifiers
  • Consider each attribute and class label as random
    variables
  • Given a record with attributes (A1, A2, ..., An)
  • Goal is to predict class C
  • Specifically, we want to find the value of C that
    maximizes P(C | A1, A2, ..., An)
  • Can we estimate P(C | A1, A2, ..., An) directly from
    data?

18
Bayesian Classifiers
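  • Approach:
  • Compute the posterior probability P(C | A1, A2,
    ..., An) for all values of C using the Bayes
    theorem:
    P(C | A1, A2, ..., An) = P(A1, A2, ..., An | C) P(C) / P(A1, A2, ..., An)
  • Choose the value of C that maximizes P(C | A1, A2,
    ..., An)
  • Equivalent to choosing the value of C that maximizes
    P(A1, A2, ..., An | C) P(C), because the denominator
    P(A1, A2, ..., An) is the same for every C
  • How to estimate P(A1, A2, ..., An | C)?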

19
Naïve Bayes Classifier
  • Assume independence among attributes Ai when
    class is given:
  • P(A1, A2, ..., An | Cj) = P(A1 | Cj) P(A2 | Cj) ... P(An | Cj)
  • Can estimate P(Ai | Cj) for all Ai and Cj.
  • New point is classified to Cj if P(Cj) ∏ P(Ai | Cj)
    is maximal.

20
How to Estimate Probabilities from Data?
  • Class: P(C) = Nc / N
  • e.g., P(No) = 7/10, P(Yes) = 3/10
  • For discrete attributes: P(Ai | Ck) = |Aik| / Nc,
    with Nc counted for class Ck
  • where |Aik| is the number of instances that have
    attribute value Ai and belong to class Ck
  • Examples:
  • P(Status=Married | No) = 4/7, P(Refund=Yes | Yes) = 0
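
A small sketch of these count-based estimates. The training table itself is not present in this transcript, so the ten records below are an assumed reconstruction, chosen only so that they reproduce the counts quoted on the slide (7 of 10 records in class No, 4 of those 7 married, no class-Yes record with Refund = Yes). The function names are likewise illustrative.

```python
from collections import Counter
from fractions import Fraction

# Assumed records: (Refund, Marital Status, Class). Not taken from the transcript.
records = [
    ("Yes", "Single", "No"),
    ("No", "Married", "No"),
    ("No", "Single", "No"),
    ("Yes", "Married", "No"),
    ("No", "Divorced", "Yes"),
    ("No", "Married", "No"),
    ("Yes", "Divorced", "No"),
    ("No", "Single", "Yes"),
    ("No", "Married", "No"),
    ("No", "Single", "Yes"),
]

class_counts = Counter(record[-1] for record in records)


def prior(label):
    """P(C) = Nc / N."""
    return Fraction(class_counts[label], len(records))


def conditional(attr_index, value, label):
    """P(Ai | Ck) = |Aik| / Nc, estimated by counting matching records."""
    matches = sum(1 for r in records if r[attr_index] == value and r[-1] == label)
    return Fraction(matches, class_counts[label])


print(prior("No"), prior("Yes"))        # 7/10 3/10
print(conditional(1, "Married", "No"))  # 4/7
print(conditional(0, "Yes", "Yes"))     # 0
```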

21
How to Estimate Probabilities from Data?
  • For continuous attributes:
  • Discretize the range into bins
      • one ordinal attribute per bin
      • violates independence assumption
  • Two-way split: (A < v) or (A > v)
      • choose only one of the two splits as new
        attribute
  • Probability density estimation:
      • Assume attribute follows a normal distribution
      • Use data to estimate parameters of distribution
        (e.g., mean and standard deviation)
      • Once probability distribution is known, can use
        it to estimate the conditional probability P(Ai | c)
22
How to Estimate Probabilities from Data?
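  • Normal distribution:
    P(Ai | cj) = 1 / sqrt(2π σij²) × exp(−(Ai − μij)² / (2 σij²))
  • One for each (Ai, cj) pair, with mean μij and variance σij²
  • For (Income, Class=No):
  • If Class=No
  • sample mean = 110
  • sample variance = 2975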
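
To make the density estimate concrete, the sketch below evaluates a normal density with the quoted parameters (mean 110, variance 2975) at Income = 120K, the value used in the worked example that follows in the deck. The function name is illustrative.

```python
import math


def gaussian_density(x, mean, variance):
    """Normal density: exp(-(x - mean)^2 / (2 * variance)) / sqrt(2 * pi * variance)."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


# Parameters for (Income, Class=No) as quoted on the slide.
p_income_120_given_no = gaussian_density(120, mean=110, variance=2975)
print(round(p_income_120_given_no, 4))  # 0.0072
```

Note that this is a density value, not a probability; it is used in place of P(Ai | c) when comparing classes.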

23
Example of Naïve Bayes Classifier
Given a Test Record X = (Refund=No, Married, Income=120K)
  • P(X | Class=No) = P(Refund=No | Class=No)
    × P(Married | Class=No)
    × P(Income=120K | Class=No)
    = 4/7 × 4/7 × 0.0072 = 0.0024
  • P(X | Class=Yes) = P(Refund=No | Class=Yes)
    × P(Married | Class=Yes)
    × P(Income=120K | Class=Yes)
    = 1 × 0 × 1.2 × 10^-9 = 0
  • Since P(X | No) P(No) > P(X | Yes) P(Yes),
  • therefore P(No | X) > P(Yes | X) => Class = No
24
Naïve Bayes Classifier
  • If one of the conditional probabilities is zero,
    then the entire expression becomes zero
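  • Probability estimation:
    Original: P(Ai | C) = Nic / Nc
    Laplace: P(Ai | C) = (Nic + 1) / (Nc + c)
    m-estimate: P(Ai | C) = (Nic + m p) / (Nc + m)
    where Nic is the number of class-C instances with
    attribute value Ai and Nc is the size of class C

c: number of classes, p: prior probability, m: parameter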
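
A sketch of how smoothing avoids the zero-probability problem, assuming the usual Laplace form (Nic + 1) / (Nc + c) and m-estimate form (Nic + m p) / (Nc + m). The counts 0 and 3 correspond to P(Refund=Yes | Yes) = 0 with three class-Yes records; the values c = 2, m = 3, and p = 1/3 are hypothetical choices for illustration only.

```python
from fractions import Fraction


def unsmoothed(n_ic, n_c):
    """Original estimate: Nic / Nc."""
    return Fraction(n_ic, n_c)


def laplace(n_ic, n_c, c):
    """Laplace estimate: (Nic + 1) / (Nc + c)."""
    return Fraction(n_ic + 1, n_c + c)


def m_estimate(n_ic, n_c, m, p):
    """m-estimate: (Nic + m * p) / (Nc + m), with p a prior probability."""
    return (n_ic + m * Fraction(p)) / (n_c + m)


n_ic, n_c = 0, 3  # no class-Yes record has Refund=Yes; class Yes has 3 records

print(unsmoothed(n_ic, n_c))                          # 0
print(laplace(n_ic, n_c, c=2))                        # 1/5
print(m_estimate(n_ic, n_c, m=3, p=Fraction(1, 3)))   # 1/6
```

With either smoothed estimate the factor is small but nonzero, so a single unseen attribute value no longer forces the whole product to zero.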
25
Example of Naïve Bayes Classifier
A: attributes, M: mammals, N: non-mammals
P(A | M) P(M) > P(A | N) P(N) => Mammals
26
Naïve Bayes (Summary)
  • Robust to isolated noise points
  • Handle missing values by ignoring the instance
    during probability estimate calculations
  • Robust to irrelevant attributes
  • Independence assumption may not hold for some
    attributes
  • Use other techniques such as Bayesian Belief
    Networks (BBN)