1
On Heuristics for Learning Model Trees
  • Celine Vens and Hendrik Blockeel
  • Katholieke Universiteit Leuven
  • {celine, hendrik}@cs.kuleuven.ac.be

2
Overview
  • Introduction
  • Variance based approach
  • A new heuristic
  • Experimental results
  • Conclusions

3
Introduction
  • Model tree: a regression tree with linear models in the leaves
  • TDIDT instantiation (top-down induction of decision trees)
  • Split criterion: min p(L)·Imp(L) + p(R)·Imp(R) (sketch after this list)
  • Existing approaches
  • M5
  • Retis
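
A minimal sketch (our own illustration, not from the slides; helper names are hypothetical) of how this criterion drives TDIDT split selection, with the impurity measure Imp left abstract:

    # Sketch: pick the split minimizing p(L)*Imp(L) + p(R)*Imp(R).
    # "imp" is any impurity measure over target values (e.g. variance).
    def split_score(y_left, y_right, imp):
        n = len(y_left) + len(y_right)
        return len(y_left) / n * imp(y_left) + len(y_right) / n * imp(y_right)

    def best_split(x, y, imp):
        # Candidate thresholds: midpoints between consecutive attribute values.
        xs = sorted(set(x))
        cands = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
        def score(c):
            left = [yi for xi, yi in zip(x, y) if xi < c]
            right = [yi for xi, yi in zip(x, y) if xi >= c]
            return split_score(left, right, imp)
        return min(cands, key=score)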

4
Variance based approach
  • Examples: M5, M5'
  • Split criterion:
  • min p(L)·Var(L) + p(R)·Var(R)
  • But: the quality of a linear model is independent of the variance!

5
Variance based approach
  • Illustrative example
  • Simplest model:
  • Y = X if 0 < X < 1
  • Y = 2 − X if 1 < X < 2
  • Total variance = variance left + variance right

6
Variance based approach
  • What split is chosen?
  • The split X < c that minimizes the average variance
    in both subsets.
  • c is computed by minimizing
  • h(c) = c·Var(Y | X < c) + (2 − c)·Var(Y | X > c)
  • (assume c < 1)
  • c ≈ 0.4 (numerical check below)
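
The claimed minimizer can be checked numerically. A small sketch (our own check, assuming a dense uniform sample of the hat-shaped target as the data):

    import numpy as np

    def h(x, y, c):
        # Weighted subset variance, proportional to
        # c*Var(Y | X < c) + (2 - c)*Var(Y | X > c) for X uniform on (0, 2).
        left, right = y[x < c], y[x >= c]
        return len(left) / len(y) * left.var() + len(right) / len(y) * right.var()

    x = np.linspace(0, 2, 10001)
    y = np.where(x < 1, x, 2 - x)          # the hat-shaped target
    cands = np.linspace(0.05, 1.95, 381)
    best = min(cands, key=lambda c: h(x, y, c))
    print(best)   # lands near 0.4 (or its mirror image near 1.6), not at 1.0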

7
Variance based approach
  • M5' splits at c ≈ 0.4
  • A random split has a 60% chance of doing better!
  • Left branch is linear; the right branch is analogous
    (the same situation recurs)

8
Variance based approach
  • Final tree built by M5'
  • LM1: y = 0.00285 + 1.01x
  • LM2: y = 0.577
  • LM3: y = 0.713
  • LM4: y = 0.795
  • LM5: y = 0.871
  • LM6: y = 0.956
  • LM7: y = 0.859
  • LM8: y = 0.743
  • LM9: y = 1.53 − 0.675x
  • LM10: y = 1.85 − 0.92x

9
Variance based approach
  • Tends to split in the wrong places
  • Reduces explanatory power
  • Trees too large
  • Splits not informative
  • May reduce predictive performance
  • Superfluous partitioning → small areas with few
    examples → local models less accurate

10
A new heuristic
  • Variance doesn't work well as a heuristic
  • Constructing linear models for the subsets and
    evaluating their goodness should work better
  • cf. Retis: build a multiple linear model for each
    subset and compute its residual variance
    (complexity O(n³) in the number of attributes n; sketch below)
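
A sketch of the Retis-style score as we read it (function names are ours): fit a multiple linear model per subset via least squares and combine the residual variances; solving the least-squares system costs on the order of n³ in the number of attributes, matching the complexity noted above.

    import numpy as np

    def residual_variance(X, y):
        # Multiple linear model with intercept; mean squared residual.
        A = np.column_stack([np.ones(len(X)), X])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return np.mean((y - A @ coef) ** 2)

    def retis_style_score(X_left, y_left, X_right, y_right):
        # p(L)*ResVar(L) + p(R)*ResVar(R)
        n = len(y_left) + len(y_right)
        return (len(y_left) / n) * residual_variance(X_left, y_left) + \
               (len(y_right) / n) * residual_variance(X_right, y_right)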

11
A new heuristic
  • Four options to evaluate a split when considering
    multiple predictor attributes:
  • 1. No regression (cf. M5')
  • 2. Simple regression on the split attribute
  • 3. Simple regression on all attributes separately
  • 4. Multiple regression on all attributes together
    (cf. Retis)

12
A new heuristic
  • Our choice: option 2
  • Complexity close to option 1 (O(n))
  • Provides a solution for the undesirable behavior
  • Evaluates the predictive power of each attribute
    independently
  • MAUVE: M5' adapted to univariate regression (sketch below)
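
A sketch of option 2 as we understand it (names are ours): within each subset, fit a simple regression of the target on the candidate split attribute only, then score the split by weighted residual variance. Each fit involves a single attribute and has a closed form, keeping the total cost linear in the number of attributes, as stated above.

    import numpy as np

    def simple_regression_resvar(x, y):
        # Closed-form least-squares line y = a + b*x on one attribute
        # (assumes a non-empty subset with non-constant x).
        b = np.cov(x, y, bias=True)[0, 1] / x.var()
        a = y.mean() - b * x.mean()
        return np.mean((y - (a + b * x)) ** 2)

    def mauve_style_score(x, y, c):
        # p(L)*ResVar(L) + p(R)*ResVar(R) with univariate models.
        left, right = x < c, x >= c
        return left.mean() * simple_regression_resvar(x[left], y[left]) + \
               right.mean() * simple_regression_resvar(x[right], y[right])

On the hat-shaped example above, this score is minimized at c = 1, where both subsets are exactly linear.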

13
Experimental results: synthetic data set, 1 predictor variable
14
Experimental results: synthetic data set, 1 predictor variable
15
Experimental results: synthetic data set, 1 predictor variable
16
Experimental results: synthetic data set, 2 predictor variables
17
Experimental results: synthetic data set, 2 predictor variables
18
Experimental results: real-world data sets
19
Conclusions
  • Use of variance as a heuristic reduces explanatory power
  • New approach (MAUVE) with complexity of the same order as M5'
  • Experiments on synthetic data sets: simpler trees
    with equal predictive accuracy
  • Experiments on real-world data sets: more
    ambiguous