Title: New software facilities
1. New software facilities
- Developed between Dec 2004 and Feb 2005
2. Newly implemented facilities
- Automatic and manual splits for:
  - Gini index of impurity (used for classification).
  - Variance measure of impurity (used for regression).
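The two impurity measures above can be sketched as follows. This is an illustrative sketch, not the program's actual code; the function names are assumptions.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini index for a classification node: 1 - sum of squared
    class proportions. 0 means the node is pure."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def variance_impurity(values):
    """Variance of the target values for a regression node.
    0 means all cases in the node share the same target value."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n
```

An automatic split would evaluate candidate splits and keep the one that most reduces the weighted impurity of the child nodes; a manual split lets the user choose the split point instead.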
3. Manual Split
4. Manual Split
5. Installer
6. General Options
7. CRT options
Note: for large datasets, use higher numbers.
8. Newly implemented facilities
- Pruning using the Pessimistic Error Pruning (PEP) algorithm by Ross Quinlan (Quinlan, 1987); it can be used only for pruning classification trees.
- Data transfer from any node to another.
- Disposal of the sub-nodes of any node, except leaves.
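The two node operations above might look like this. The `TreeNode` class and its method names are hypothetical, chosen only to illustrate the behavior described on the slide:

```python
class TreeNode:
    """Minimal tree node holding data rows and child nodes."""

    def __init__(self, rows=None):
        self.rows = list(rows or [])   # data records held at this node
        self.children = []

    def is_leaf(self):
        return not self.children

    def transfer_data(self, target):
        """Move this node's data rows to another node."""
        target.rows.extend(self.rows)
        self.rows = []

    def dispose_subnodes(self):
        """Remove all sub-nodes; leaves have none, so this is disallowed."""
        if self.is_leaf():
            raise ValueError("a leaf has no sub-nodes to dispose of")
        self.children = []
```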
9. Newly implemented facilities
- Data filter and the selection of blocks of data:
  - Uses the set operations union and intersection.
  - The variables can be either categorical or numerical.
  - The comparison operators are =, <, <=, >, >= and <> (different).
  - The null keyword matches blank values.
  - Fool-proof functions.
  - The data block can be shown before transfer.
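One way to picture this filter is as conditions that select sets of row indices, which are then combined with union and intersection. This is an assumed design, not the program's actual implementation; the `select` function and the dict-based rows are illustrative only.

```python
# Map the slide's comparison operators to Python predicates.
OPS = {
    "=":  lambda a, b: a == b,
    "<":  lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">":  lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "<>": lambda a, b: a != b,   # "different"
}

def select(rows, column, op, value):
    """Return the set of row indices matching `column <op> value`.
    Passing value=None stands for the `null` keyword (blank cells)."""
    if value is None:
        return {i for i, r in enumerate(rows) if r.get(column) in (None, "")}
    return {i for i, r in enumerate(rows)
            if r.get(column) not in (None, "") and OPS[op](r[column], value)}

rows = [
    {"age": 70, "sex": "F"},
    {"age": 45, "sex": "M"},
    {"age": None, "sex": "F"},   # blank value, matched by null
]

# Union: elderly patients OR patients with a blank age.
block = select(rows, "age", ">=", 60) | select(rows, "age", "=", None)
# Intersection: restrict the block to women.
result = block & select(rows, "sex", "=", "F")
```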
10. Data filter and the selection of blocks of data
11. Spartacus, ICU database, fully grown tree (target: outcome; 49 nodes)
12. Answer Tree, ICU database, fully grown tree (target: outcome; 11 nodes)
13. Spartacus, ICU database, pruned tree (target: outcome; 11 nodes)
14. Answer Tree, ICU database, pruned tree (target: outcome; 1 node only)
15. Coming soon
- Tree misclassification tests.
- C4.5 implementation.
- More pruning methods.
- Globalized numeric formats.
- Recalculation of the gain if a custom split is applied.
- Full support for missing values.
16. Q&A
- Any specific facility for the next meeting?
- See you by the end of May?
- http://evandro.org/calendar
18. PEP
- PEP (Pessimistic Error Pruning) is a top-down post-pruning algorithm. It does not require a separate pruning set. PEP tries to compensate for the overly optimistic estimates based on the resubstitution error (the error obtained by testing the training sample on itself). These estimates are overly optimistic because error rates on the training data are typically lower than on test data. PEP attempts to get a more accurate error estimate by imposing a continuity correction for the binomial distribution, adding 0.5 to the number of errors associated with each node. Also, a subtree is kept (not pruned) only if its corrected error estimate is at least one standard error less than the corrected error estimate of its root node.
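The pruning test described above can be sketched numerically. This is a simplified sketch of PEP's corrected-error comparison, assuming error counts are taken over the training cases reaching the node; the function names and parameter layout are assumptions, not Quinlan's code.

```python
import math

def should_prune(node_errors, leaf_errors, n_cases):
    """PEP pruning test for one internal node.

    node_errors: training errors if the node were collapsed to a leaf.
    leaf_errors: list of training error counts, one per leaf of the subtree.
    n_cases: number of training cases reaching the node.
    """
    # Continuity correction: add 0.5 per node (0.5 per leaf for the subtree).
    e_node = node_errors + 0.5
    e_tree = sum(e + 0.5 for e in leaf_errors)
    # Standard error of the subtree's corrected error count (binomial).
    se = math.sqrt(e_tree * (n_cases - e_tree) / n_cases)
    # Keep the subtree only if its corrected error is at least one
    # standard error below the node's corrected error; otherwise prune.
    return e_node <= e_tree + se
```

Because the test compares corrected training-set errors, no separate pruning set is needed, which is the property the slide highlights.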