Sachin Singh - PowerPoint PPT Presentation

Slides: 17
Provided by: sachin77

Transcript and Presenter's Notes

Title: Sachin Singh


1
CS 551 Research Track
Filtering and Comparing Classification Trees
using XML
  • Sachin Singh

2
Data Mining - Concepts
  • Extracting meaningful knowledge from large volumes
    of raw data.
  • Types
    • Association
    • Classification
    • Temporal

3
Classification Method
  • Prediction model
  • The C4.5 Tree algorithm
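
As background for the C4.5 bullet: C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information. The following is only a minimal sketch of that selection step for categorical attributes. It leaves out continuous attributes, missing values, and pruning, and the toy rows and attribute names are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by split information."""
    total = len(rows)
    # Group the target labels by the value the attribute takes in each row.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data, not taken from the presentation.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]
best = max(["outlook", "windy"], key=lambda a: gain_ratio(rows, a, "play"))
```

Here `outlook` separates the two classes perfectly while `windy` carries no information, so `outlook` is the attribute selected for the root test.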

4
Classification Tree

5
Analysis of Trees
  • Current work focuses largely on generation of
    trees
    • Efficient algorithms
    • Disk-resident gigantic data sources
    • Improving accuracy of the generated models
  • Motivation
    • Current research gap: the need for analysis of
      the generated trees

6
Areas of Analysis
  • Two Sub Problems
    • Filtering Sub Problem
    • Comparison Sub Problem

7
Filtering Sub Problem
  • Typical data warehouses are huge
  • They lead to the generation of bushy trees
  • Not all outcomes are significant
  • Need to filter trees based on the required
    outcomes
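
One way to picture filtering by required outcomes is to prune every branch that cannot reach a wanted leaf. The sketch below is an illustration under assumptions, not the algorithm from the presentation: the `Node` class, the `filter_tree` and `count_leaves` helpers, and the sample tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    test: Optional[str] = None       # attribute tested at an internal node
    outcome: Optional[str] = None    # class label stored at a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)  # partition value -> subtree

def filter_tree(node, wanted):
    """Return a copy that keeps only the paths ending in a wanted outcome."""
    if not node.children:            # leaf: keep it only if its outcome is wanted
        return Node(outcome=node.outcome) if node.outcome in wanted else None
    kept = {}
    for value, child in node.children.items():
        sub = filter_tree(child, wanted)
        if sub is not None:
            kept[value] = sub
    # Drop an internal node entirely when none of its branches survive.
    return Node(test=node.test, children=kept) if kept else None

def count_leaves(node):
    """Number of leaves, used here as a rough measure of bushiness."""
    if node is None:
        return 0
    if not node.children:
        return 1
    return sum(count_leaves(c) for c in node.children.values())

# Hypothetical full tree with three leaves.
full = Node(test="income", children={
    "high": Node(test="age", children={
        "young": Node(outcome="buy"),
        "old": Node(outcome="no-buy"),
    }),
    "low": Node(outcome="no-buy"),
})
filtered = filter_tree(full, {"buy"})
```

With `{"buy"}` as the required outcome, the filtered tree keeps only the path income = high, age = young, so it has one leaf instead of three.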

8
Filtering Sub Problem

[Figure panels: Full Classification Tree; Filtered Classification Tree]

9
Filtering Sub Problem
  • Advantages
    • Efficient querying and faster results
    • Easily managed
    • Useful for the comparison sub problem

10
Comparison Sub Problem
  • Need to monitor changes in data trends by
    comparing the classification trees
  • Levels of changes identified
    • Change in test (partition) value
    • Change in the partitions
    • Change in node levels
    • Change in outcome (leaves)
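
The levels of change listed above can be illustrated by walking two trees in parallel and recording where they differ. This is a rough sketch under assumptions: the nested-dict tree representation, the `compare` function, and the way each reported kind maps onto the four levels are my own reading, not the presentation's algorithm.

```python
def compare(old, new, path="root"):
    """Walk two trees in parallel and list (path, kind, old, new) differences."""
    changes = []
    old_leaf, new_leaf = "outcome" in old, "outcome" in new
    if old_leaf and new_leaf:
        if old["outcome"] != new["outcome"]:
            changes.append((path, "outcome", old["outcome"], new["outcome"]))
        return changes
    if old_leaf != new_leaf:
        # A leaf became a subtree (or the reverse): the node levels changed here.
        changes.append((path, "level",
                        "leaf" if old_leaf else "subtree",
                        "leaf" if new_leaf else "subtree"))
        return changes
    if old["test"] != new["test"]:
        changes.append((path, "test", old["test"], new["test"]))
    old_b, new_b = old["branches"], new["branches"]
    if set(old_b) != set(new_b):
        changes.append((path, "partitions", sorted(old_b), sorted(new_b)))
    # Recurse only into the partitions both trees share.
    for value in sorted(set(old_b) & set(new_b)):
        changes += compare(old_b[value], new_b[value], f"{path}/{value}")
    return changes

# Hypothetical trees built from two snapshots of the same data source.
old_tree = {"test": "income", "branches": {
    "high": {"outcome": "buy"},
    "low": {"outcome": "no-buy"},
}}
new_tree = {"test": "income", "branches": {
    "high": {"test": "age", "branches": {
        "young": {"outcome": "buy"},
        "old": {"outcome": "no-buy"},
    }},
    "low": {"outcome": "buy"},
}}
report = compare(old_tree, new_tree)
```

For these two trees the report contains a level change at `root/high` (a leaf grew into a subtree) and an outcome change at `root/low`.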

11
Comparison Sub Problem
  • Issues
    • Structure of trees is unpredictable
    • Comparing two trees with no standard structure

12
Solution
  • XML Trees
    • Convert the tree structure into XML files
    • XML is inherently tree structured
    • Take advantage of existing XML-related
      technologies
    • Standard specs
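
To make the XML idea concrete, a classification tree can be serialized with Python's standard `xml.etree.ElementTree` module and then queried with its limited XPath support. The element and attribute names used here (`node`, `branch`, `leaf`, `test`, `value`, `outcome`) are hypothetical and are not the file format proposed in the presentation. The nested-dict input is also an assumption.

```python
import xml.etree.ElementTree as ET

def to_xml(node):
    """Convert a nested-dict tree into an ElementTree element (hypothetical schema)."""
    if "outcome" in node:
        return ET.Element("leaf", {"outcome": node["outcome"]})
    elem = ET.Element("node", {"test": node["test"]})
    for value, child in node["branches"].items():
        branch = ET.SubElement(elem, "branch", {"value": value})
        branch.append(to_xml(child))
    return elem

# Hypothetical tree, not taken from the presentation.
tree = {"test": "income", "branches": {
    "high": {"test": "age", "branches": {
        "young": {"outcome": "buy"},
        "old": {"outcome": "no-buy"},
    }},
    "low": {"outcome": "no-buy"},
}}

root = to_xml(tree)
xml_text = ET.tostring(root, encoding="unicode")

# Existing XML tooling can then select leaves by outcome.
buy_leaves = root.findall(".//leaf[@outcome='buy']")
```

Once the tree is in XML, a query such as `.//leaf[@outcome='buy']` plays the role of a simple outcome filter, which is one way existing XML technologies could be reused.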

13
Solution: Proposed File Format

14
Approach
  • Devise algorithms to solve the filtering and
    comparison problems
  • Analyze the results of comparison in logical terms
  • Measure the efficiency of the algorithms through
    time and space complexities

15
Progress

16
Suggestions preferred over questions!
Thank you!