LINEAR CLASSIFICATION METHODS - PowerPoint PPT Presentation

1
LINEAR CLASSIFICATION METHODS
  • STAT 597 E
  • Fengjuan Xuan
  • Caimiao Wei
  • Bogdan Ilie

2
Introduction
  • The observations in the dataset we will work on
    (BUPA liver disorders) were sampled by BUPA
    Medical Research Ltd and consist of 7 variables
    and 345 observed vectors, one per subject; all
    subjects are single male individuals. The first
    5 variables are measurements taken by blood
    tests that are thought to be sensitive to liver
    disorders that might arise from excessive
    alcohol consumption. The sixth variable records
    daily alcohol consumption. The seventh variable
    is a selector on the dataset, used to split it
    into two sets and indicating the class identity.
    Among all the observations, 145 people belong to
    the liver-disorder group (corresponding to
    selector value 2) and 200 people belong to the
    liver-normal group.

3
Description of variables
  • The description of each variable is below
  • 1. mcv      mean corpuscular volume
  • 2. alkphos  alkaline phosphatase
  • 3. sgpt     alanine aminotransferase
  • 4. sgot     aspartate aminotransferase
  • 5. gammagt  gamma-glutamyl transpeptidase
  • 6. drinks   number of half-pint equivalents of
    alcoholic beverages drunk per day
  • 7. selector field used to split the data into
    two sets. It is a binary categorical variable
    with values 1 and 2 (2 corresponding to liver
    disorder)

4
Matrix Plot of the variables
5
Logistic regression in full Space
  • Coefficients:
                   Value        Std. Error    t value
    (Intercept)   5.99024204   2.684250011    2.231626
    mcv          -0.06398345   0.029631551   -2.159301
    alk          -0.01952510   0.006756806   -2.889694
    sgpt         -0.06410562   0.012283808   -5.218709
    sgot          0.12319769   0.024254150    5.079448
    gammagt       0.01894688   0.005589619    3.389656
    drinks       -0.06807958   0.040358528   -1.686870
  • The classification rule G(x) thresholds the
    fitted probability at 1/2: an observation is
    assigned to the liver-disorder group when its
    modeled probability of disorder exceeds 1/2,
    equivalently when the linear predictor built
    from these coefficients is positive.
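This thresholding rule can be sketched in a few lines. The sketch below uses Python with scikit-learn on synthetic stand-in data (the deck's own fit was presumably done in S-PLUS, and variable values here are placeholders), so the fitted coefficients will not match the slide's table.

```python
# Sketch of the logistic classification rule G(x): assign class 2 (liver
# disorder) when the fitted probability exceeds 1/2, i.e. when the linear
# predictor b0 + b'x is positive. Synthetic data stand in for the BUPA set.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(345, 6))     # mcv, alk, sgpt, sgot, gammagt, drinks
y = rng.choice([1, 2], size=345)  # selector: 2 = liver disorder

model = LogisticRegression().fit(X, y)

linear_pred = model.intercept_ + X @ model.coef_.ravel()
G = np.where(linear_pred > 0, 2, 1)   # threshold the probability at 1/2

assert np.array_equal(G, model.predict(X))  # same rule sklearn applies
```

The equivalence checked by the final assertion is what makes the 0.5-probability rule and the sign of the linear predictor interchangeable descriptions of G(x).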

6
Classification error rate
  • The classification error on the whole training
    data set:
  • error rate = 0.2956
  • Sensitivity = 0.825
  • Specificity = 0.5379
  • The error rate and its standard error obtained
    by 10-fold cross-validation:
  • error rate (Standard Error) = 0.3075 (0.0271)
  • Sensitivity (Standard Error) = 0.8163 (0.0203)
  • Specificity (Standard Error) = 0.5311 (0.0699)
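The 10-fold figures above can be reproduced along these lines. This is a minimal Python sketch (the original analysis appears to have used S-PLUS); the data below are synthetic placeholders, so the printed numbers will not match the slide's values on the real BUPA file.

```python
# Sketch of 10-fold cross-validated error rate, sensitivity, and
# specificity (with standard errors across folds) for logistic regression.
# Synthetic stand-in data: class 2 = liver disorder, class 1 = normal.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(1)
X = rng.normal(size=(345, 6))
y = rng.choice([1, 2], size=345, p=[200 / 345, 145 / 345])

errs, sens, spec = [], [], []
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for tr, te in cv.split(X, y):
    pred = LogisticRegression().fit(X[tr], y[tr]).predict(X[te])
    errs.append(np.mean(pred != y[te]))
    sens.append(np.mean(pred[y[te] == 2] == 2))  # disorder correctly flagged
    spec.append(np.mean(pred[y[te] == 1] == 1))  # normal correctly cleared

for name, v in [("error rate", errs), ("Sensitivity", sens), ("Specificity", spec)]:
    print(f"{name} (SE) = {np.mean(v):.4f} ({np.std(v, ddof=1) / np.sqrt(10):.4f})")
```

Stratified folds keep both classes present in every test split, which is what makes the per-fold sensitivity and specificity well defined.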

7
Backward stepwise model selection based on AIC
  • Five variables are selected after stepwise
    model selection; the first variable, mcv, is
    dropped.
  • error rate (Standard Error) = 0.3295 (0.0305)
  • Sensitivity (Standard Error) = 0.7921 (0.0343)
  • Specificity (Standard Error) = 0.5073 (0.0386)
  • COMMENT
  • This method has a larger classification error
    rate than the full model, so stepwise selection
    does not improve the classification here.

8
Scree plot for the PCA
9
The performance of the Logistic regression on the
reduced space
  • The reduced space is obtained by selecting the
    first three principal components. The standard
    error is obtained by 10-fold cross-validation.
  • error rate (Standard Error) = 0.4563 (0.0234)
  • Sensitivity (Standard Error) = 0.3729 (0.0317)
  • Specificity (Standard Error) = 0.7830 (0.0308)
  • Comment
  • The classification error rate is around 50%,
    which is not much better than random guessing.
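The reduced-space fit is a two-stage pipeline: project the (scaled) predictors onto the first three principal components, then run logistic regression on the scores. A minimal Python sketch on synthetic data (the pipeline, not the numbers, is the point; on the real data the slide reports the first three components capturing most of the variance):

```python
# Sketch of logistic regression on a PCA-reduced space: scale the six
# predictors, keep the first three principal components, then fit the
# classifier on the component scores. Synthetic stand-in data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = rng.normal(size=(345, 6))
y = rng.choice([1, 2], size=345)

clf = make_pipeline(StandardScaler(), PCA(n_components=3), LogisticRegression())
clf.fit(X, y)

explained = clf.named_steps["pca"].explained_variance_ratio_.sum()
print(f"first 3 PCs explain {explained:.1%} of the variance")
```

Wrapping the three stages in one pipeline ensures the scaling and projection learned on training folds are applied unchanged to test folds during cross-validation.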

10
The classification plot on the first two
principal components plane
11
Linear Discriminant Analysis
  • LDA assumes a multivariate normal distribution,
    so we apply log transformations to the skewed
    variables
  • Y1 = mcv          Y2 = log(alk)
  • Y3 = log(sgpt)    Y4 = log(sgot)
  • Y5 = log(gammagt) Y6 = log(drinks + 1)
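The transformation step followed by LDA can be sketched as below, in Python on synthetic right-skewed data (the real measurements and error rates will of course differ). The +1 inside log(drinks + 1) guards against log(0) for subjects who report no drinks.

```python
# Sketch of the slide's variable transformations followed by LDA:
# mcv is kept as-is, the four skewed enzyme measures are log-transformed,
# and drinks becomes log(drinks + 1). Synthetic stand-in data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
n = 345
mcv = rng.normal(90, 5, n)                                # roughly symmetric
alk, sgpt, sgot, gammagt = rng.lognormal(3, 0.5, (4, n))  # right-skewed
drinks = rng.poisson(3, n).astype(float)                  # counts, may be 0
y = rng.choice([1, 2], size=n)

# Y1 = mcv, Y2..Y5 = logs of the skewed measures, Y6 = log(drinks + 1)
Z = np.column_stack([mcv, np.log(alk), np.log(sgpt), np.log(sgot),
                     np.log(gammagt), np.log(drinks + 1)])

lda = LinearDiscriminantAnalysis().fit(Z, y)
err = np.mean(lda.predict(Z) != y)
print(f"training error = {err:.4f}")
```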

12
The histogram of the sgpt variable and its log
transformation
13
The performance of the LDA based on Transformed
data
  • Comment: the classification error is the
    smallest among all methods, and the sensitivity
    is the largest.
  • error rate = 0.2638
  • Sensitivity = 0.865
  • Specificity = 0.5586
  • The log transformations make the assumption of
    multivariate normality reasonable, so the
    classification improves.

14
LDA after PCA
  • error rate = 0.4116
  • Sensitivity = 0.88
  • Specificity = 0.1862
  • Comment
  • The performance is not improved by PCA.

15
Conclusion
  • Four different methods were applied to the
    liver disorder data set. LDA based on the
    transformed variables works best, and logistic
    regression based on the original data set is
    second best.
  • The classification methods based on the
    principal components do not work well. Although
    the first three principal components contain
    more than 97% of the variation, we may still
    lose the information most important for
    classification.
  • Transformations can make the LDA method work
    better in some cases. LDA assumes multivariate
    normality, which is a very strong assumption
    for many data sets; in our data, all variables
    except the first are seriously skewed. That is
    why the log transform works.