1
Data Stream Management Systems: Checkpoint
  • CS240B Notes
  • by
  • Carlo Zaniolo
  • UCLA CSD
  • With slides from a KDD04 tutorial by
  • Haixun Wang, Jian Pei, and Philip Yu

2
Mining Data Streams: Challenges
  • On-line response (NB), limited memory, only the
    most recent windows matter
  • Fast, light algorithms are needed:
  • Must minimize usage of memory and CPU
  • Require only one (or a few) passes through the data
  • Concept shift/drift: changes in the statistics of
    the mined data set
  • These render previously learned models inaccurate
    or invalid
  • Robustness and adaptability: quickly
    recover/adjust after concept changes
  • Popular machine learning algorithms are no longer
    effective:
  • Neural nets: slow learners requiring many passes
  • Support Vector Machines (SVM): computationally
    expensive
  • Apriori: many passes and expensive (association
    rule mining is difficult on data streams)

3
The Decision Tree Classifier
  • Learning (Training)
  • Input: a data set of pairs (a, b), where a is a
    feature vector and b a class label
  • Output: a model (decision tree)
  • Testing
  • Input: a test sample (x, ?)
  • Output: a class label prediction for x (see the
    sketch after this list)
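A minimal sketch of this train/test interface, using scikit-learn's DecisionTreeClassifier as the learner; the library choice and the toy data are illustrative assumptions, not part of the original slides:

```python
# Hypothetical illustration of the train/test interface sketched above.
from sklearn.tree import DecisionTreeClassifier

# Training: input is a data set of pairs (a, b) -- feature vector a, class label b
X_train = [[5.1, 3.5], [6.2, 2.9], [4.7, 3.2], [6.9, 3.1]]
y_train = ["setosa", "versicolor", "setosa", "versicolor"]
model = DecisionTreeClassifier().fit(X_train, y_train)

# Testing: input is a sample (x, ?); output is a predicted class label for x
x_test = [[6.0, 3.0]]
print(model.predict(x_test))   # e.g. ['versicolor']
```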

4
Decision Tree Classifiers
  • A divide-and-conquer approach
  • Simple algorithm, intuitive model
  • Typically a decision tree grows one level for
    each scan of the data
  • Multiple scans are therefore required
  • But if we can use small samples, these problems
    disappear
  • However, the tree structure is not stable:
  • subtle changes in the data can cause global
    changes in the tree structure

5
Stable Trees Using Samples
  • How many samples do we need to build, in constant
    time, a tree that is nearly identical to the one a
    batch learner (C4.5, SPRINT, ...) would produce?
  • Nearly identical?
  • Categorical attributes:
  • with high probability, the attribute we choose
    for the split is the same attribute that would be
    chosen by a batch learner
  • identical decision tree
  • Continuous attributes:
  • discretize them into categorical ones (see the
    sketch after this list)
  • ...Forget concept changes for now
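A small sketch of discretizing a continuous attribute into categorical bins; equal-width binning and the number of bins are illustrative choices, not prescribed by the slides:

```python
# Equal-width binning of one continuous attribute into categorical codes.
import numpy as np

values = np.array([0.3, 1.7, 2.2, 4.9, 5.1, 7.8, 9.4])
n_bins = 4

edges = np.linspace(values.min(), values.max(), n_bins + 1)
# np.digitize assigns each value the index of the bin it falls into
categories = np.digitize(values, edges[1:-1])
print(categories)   # [0 0 0 2 2 3 3]
```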

6
Hoeffding Trees
  • The Hoeffding bound is applied to the information
    gain
  • The error decreases as n (the number of samples)
    increases
  • At each node, we accumulate enough samples (n)
    before we make a split (see the split-test sketch
    after this list)
  • Scales better than traditional decision tree
    algorithms:
  • Incremental: nodes are created incrementally as
    new samples stream in
  • Sub-linear with sampling
  • Small memory requirement
  • Cons:
  • Only considers the top 2 attributes
  • Tie breaking takes time
  • Growing a deep tree takes time
  • Discrete attributes only
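A sketch of the split test behind Hoeffding trees: after n samples, the true mean of a quantity with range R is, with probability 1 - delta, within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the observed mean, so a node may split once the best attribute's observed gain advantage over the runner-up exceeds epsilon. The numeric values below are illustrative:

```python
# Hoeffding-bound split test (illustrative values, not from the slides).
import math

def hoeffding_bound(value_range, delta, n):
    """Deviation epsilon not exceeded with probability 1 - delta after n samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, n_classes, delta, n):
    # For information gain, the range R is log2(number of classes)
    eps = hoeffding_bound(math.log2(n_classes), delta, n)
    return (best_gain - second_gain) > eps

# After 500 samples, is a 0.30 vs 0.18 gain difference enough to split?
print(should_split(0.30, 0.18, n_classes=2, delta=1e-6, n=500))   # True
```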

7
VFDT
  • Very Fast Decision Tree [Domingos, Hulten 2000]
  • Several improvements: faster and uses less memory
  • Concept changes? A naïve approach:
  • Place a sliding window on the stream
  • Reapply C4.5 or VFDT whenever the window moves
    (see the sketch after this list)
  • Time consuming!
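A sketch of this naïve sliding-window approach: rebuild a classifier from scratch every time the window advances, which is exactly what makes it time consuming. The window size, base classifier, and function name are illustrative assumptions:

```python
# Naive sliding-window retraining: full rebuild whenever the window moves.
from collections import deque
from sklearn.tree import DecisionTreeClassifier

WINDOW = 1000
window = deque(maxlen=WINDOW)   # keeps only the most recent WINDOW tuples
model = None

def on_new_tuple(x, y):
    global model
    window.append((x, y))
    if len(window) == WINDOW:
        X = [xi for xi, _ in window]
        Y = [yi for _, yi in window]
        # Retrain from scratch on every slide of the window: O(window) work per tuple
        model = DecisionTreeClassifier().fit(X, Y)
```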

8
CVFDT
  • Concept-adapting VFDT
  • [Hulten, Spencer, Domingos, 2001]
  • Goal:
  • Classifying concept-drifting data streams
  • Approach:
  • Make use of the Hoeffding bound
  • Incorporate windowing
  • Monitor changes of information gain for the
    attributes
  • If the change reaches a threshold, generate an
    alternate subtree with the new best attribute, but
    keep it in the background
  • Replace the old subtree if the new one becomes
    more accurate

9
Classifiers for Data Streams
  • Fast and light classifiers:
  • Naïve Bayesian: one pass to count occurrences
    (see the sketch after this list)
  • Sliding windows, tumbles and slides
  • Adaptive Nearest Neighbor Classification
    Algorithm for Data Streams (ANNCAD)
  • Ensembles of classifiers (decision trees or
    others):
  • Bagging ensembles, and
  • Boosting ensembles
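A sketch of the "one pass to count occurrences" idea behind a Naïve Bayesian stream classifier for categorical attributes; the counter layout and the Laplace smoothing constant are illustrative assumptions:

```python
# One-pass counting for a categorical Naive Bayes classifier over a stream.
from collections import defaultdict
import math

class_counts = defaultdict(int)                        # label -> count
value_counts = defaultdict(lambda: defaultdict(int))   # (label, attr index) -> {value: count}

def update(x, y):
    """Consume one (feature vector, label) pair from the stream."""
    class_counts[y] += 1
    for i, v in enumerate(x):
        value_counts[(y, i)][v] += 1

def predict(x, alpha=1.0):
    """Pick the class maximizing log P(class) + sum_i log P(x_i | class)."""
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, nc in class_counts.items():
        score = math.log(nc / total)
        for i, v in enumerate(x):
            counts = value_counts[(c, i)]
            # Laplace-smoothed conditional probability of attribute value v given class c
            score += math.log((counts.get(v, 0) + alpha) / (nc + alpha * (len(counts) + 1)))
        if score > best_score:
            best, best_score = c, score
    return best
```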

10
Basic Ideas
  • The stream is partitioned into sequential chunks
  • Train a classifier from each chunk
  • The accuracy of voting ensembles is normally
    better than that of a single classifier
  • Method 1: Bagging
  • Weighted voting: weights are assigned to
    classifiers based on their recent performance on
    the current test examples
  • Only the top K classifiers are used (see the
    sketch after this list)
  • Method 2: Boosting
  • Majority voting
  • Classifiers retired by age
  • Boosting used in training
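A sketch of the chunk-based ensemble of Method 1: train one classifier per chunk, re-weight every member by its accuracy on the most recent chunk, keep only the top K, and classify by weighted vote. K, the base classifier, and the scoring-on-the-latest-chunk choice are illustrative assumptions:

```python
# Chunk-based ensemble with accuracy-based weights and top-K pruning.
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

K = 8
ensemble = []     # list of (classifier, weight) pairs

def new_chunk(X_chunk, y_chunk):
    """Train a classifier on the newest chunk and re-weight the ensemble."""
    global ensemble
    new_clf = DecisionTreeClassifier(max_depth=3).fit(X_chunk, y_chunk)
    candidates = [clf for clf, _ in ensemble] + [new_clf]
    # Weight every member by its accuracy on the most recent chunk
    rescored = [(clf, accuracy_score(y_chunk, clf.predict(X_chunk))) for clf in candidates]
    # Keep only the K best-performing classifiers
    ensemble = sorted(rescored, key=lambda cw: cw[1], reverse=True)[:K]

def classify(x):
    """Weighted vote over the current ensemble."""
    votes = {}
    for clf, w in ensemble:
        label = clf.predict([x])[0]
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)
```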

11
Bagging Ensemble Method
12
Mining Streams with Concept Changes
  • Changes are detected by a drop in accuracy or by
    other methods
  • Build new classifiers on new windows
  • Search among the old classifiers for those that
    have now become accurate

13
Boosting Ensembles for Adaptive Mining of Data
Streams
  • Andrea Fang Chu, Carlo Zaniolo
  • PAKDD 2004

14
Mining Data Streams: Desiderata
  • Fast learning (preferably in one pass over the
    data)
  • Light requirements (low time complexity, low
    memory usage)
  • Adaptation (the model always reflects the
    time-changing concept)

15
Adaptive Boosting Ensembles
  • The training stream is split into blocks (i.e.,
    windows)
  • Each individual classifier is learned from one
    block
  • A boosting ensemble (7-19 members) is maintained
    over time
  • Decisions are taken by simple majority
  • As the (N+1)-th classifier is built, the weights
    of the tuples misclassified by the first N
    classifiers are boosted (see the sketch after
    this list)
  • Change detection is explored to achieve
    adaptation
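A sketch of the boosting step described above: before the (N+1)-th classifier is trained on a new block, the tuples that the current N-member ensemble misclassifies get their training weight boosted, and prediction is by simple majority. The boost factor, tree depth, and integer-label assumption are illustrative:

```python
# Block-by-block boosting: emphasize tuples the current ensemble gets wrong.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def majority_vote(ensemble, X):
    """Simple majority over ensemble members (assumes integer class labels)."""
    preds = np.array([clf.predict(X) for clf in ensemble])
    return np.array([np.bincount(col).argmax() for col in preds.T])

def train_next_member(ensemble, X_block, y_block, boost=2.0):
    weights = np.ones(len(y_block))
    if ensemble:
        wrong = majority_vote(ensemble, X_block) != y_block
        weights[wrong] *= boost                    # boost misclassified tuples
    weak = DecisionTreeClassifier(max_depth=2)     # weak (shallow) learner
    weak.fit(X_block, y_block, sample_weight=weights)
    ensemble.append(weak)
    return ensemble
```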

16
Fast and Light
  • Experiments show that boosting ensembles of weak
    learners provide accurate predictions
  • Weak learners:
  • an aggressively pruned decision tree, e.g., a
    shallow tree (this means fast!)
  • trained on a small set of examples (this means
    light in memory requirements!)

17
Adaptation
  • Detect changes that cause significant drops in
    ensemble performance:
  • gradual changes: concept drift
  • abrupt changes: concept shift
18
Adaptability
  • The error rate is viewed as a random variable
  • When it deviates significantly from its recent
    average, the whole ensemble is dropped
  • and a new one is quickly re-learned (see the
    sketch after this list)
  • The cost/performance of boosting ensembles is
    better than that of bagging ensembles [KDD04]
  • BUT ???
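A sketch of the change-detection idea: the per-block error rate is tracked as a random variable, and when it jumps well above its recent average the whole ensemble is discarded and relearned. The window length and the 3-sigma threshold are illustrative assumptions:

```python
# Drop-and-relearn trigger based on the recent distribution of error rates.
from collections import deque
import statistics

recent_errors = deque(maxlen=20)   # error rates of the most recent blocks

def concept_change_detected(block_error_rate, n_sigma=3.0):
    """Return True when the new block's error rate deviates sharply upward."""
    if len(recent_errors) >= 5:
        mean = statistics.mean(recent_errors)
        std = statistics.pstdev(recent_errors) or 1e-9
        if block_error_rate - mean > n_sigma * std:
            recent_errors.clear()    # change detected: relearn the ensemble from scratch
            return True
    recent_errors.append(block_error_rate)
    return False
```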

19
References
  • Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han.
    Mining Concept-Drifting Data Streams using Ensemble
    Classifiers. ACM International Conference on
    Knowledge Discovery and Data Mining (SIGKDD), 2003.
  • Pedro Domingos, Geoff Hulten. Mining High-Speed
    Data Streams. ACM International Conference on
    Knowledge Discovery and Data Mining (SIGKDD), 2000.
  • Geoff Hulten, Laurie Spencer, Pedro Domingos.
    Mining Time-Changing Data Streams. ACM International
    Conference on Knowledge Discovery and Data Mining
    (SIGKDD), 2001.
  • Wei Fan, Yi-an Huang, Haixun Wang, Philip S. Yu.
    Active Mining of Data Streams. SIAM International
    Conference on Data Mining (SIAM DM), 2004.
  • Fang Chu, Yizhou Wang, Carlo Zaniolo. An Adaptive
    Learning Approach for Noisy Data Streams. 4th IEEE
    International Conference on Data Mining (ICDM), 2004.
  • Fang Chu, Carlo Zaniolo. Fast and Light Boosting
    for Adaptive Mining of Data Streams. PAKDD 2004,
    282-292.
  • Yan-Nei Law, Carlo Zaniolo. An Adaptive Nearest
    Neighbor Classification Algorithm for Data Streams.
    ECML/PKDD 2005, Porto, Portugal, October 3-7, 2005.