Issues in Parallelizing Pattern Mining Workloads

1
Issues in Parallelizing Pattern Mining Workloads
  • Shirish Tatikonda
  • CSE 788Z11

05th December, 2006
2
Outline
  • Pattern mining algorithms
  • Parallelization
  • Issues
  • Possible solutions

3
Pattern Mining Algorithms
  • Patterns
  • Transactions
  • Sequences
  • Trees
  • Graphs
  • Goal
  • Given a large database of transactions and a threshold s
  • Find the set of all frequent patterns
  • Frequent: number of occurrences ≥ s

Walmart Data
T1: Milk, Bread, Beer, Diaper
T2: Milk, Beer
T3: Milk, Bread, Ice cream
T4: Bread, Beer, Diaper
s = 3
Frequent: {Milk}, {Bread}, {Beer}, {Bread, Beer}
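The definition above can be checked directly on the four example transactions. The following is a minimal brute-force sketch, assuming "frequent" means a support count of at least s; the function names `support` and `frequent_patterns` are illustrative and do not come from the slides. With the transactions as transcribed, {Bread, Beer} occurs only in T1 and T4, so it meets a threshold of 2 but not 3.

```python
from itertools import combinations

# Transactions as transcribed on the slide.
transactions = [
    {"Milk", "Bread", "Beer", "Diaper"},   # T1
    {"Milk", "Beer"},                      # T2
    {"Milk", "Bread", "Ice cream"},        # T3
    {"Bread", "Beer", "Diaper"},           # T4
]

def support(pattern, db):
    """Number of transactions that contain every item of `pattern`."""
    return sum(1 for t in db if pattern <= t)

def frequent_patterns(db, s):
    """Brute force: test every non-empty itemset against the threshold s."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            count = support(set(combo), db)
            if count >= s:          # assumed reading: frequent means count >= s
                result[frozenset(combo)] = count
    return result

freq3 = frequent_patterns(transactions, 3)
freq2 = frequent_patterns(transactions, 2)
```

Enumerating every itemset is exponential in the number of items, which is why the approaches on the following slides generate candidates selectively.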
4
Generic Approach
  • Candidate Generation
  • Level-wise approach
  • Pattern growth approach
  • Support Counting
  • Evaluate each candidate
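The two steps above, candidate generation and support counting, can be sketched for the level-wise case as follows. This is an Apriori-style illustration for itemsets only, not necessarily the formulation used in the talk; the function name `apriori` and the sample `db` are assumptions made for the example.

```python
from itertools import combinations

def apriori(db, s):
    """Level-wise mining: generate size-k candidates, then count their support."""
    # Level 1: count single items.
    counts = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {p: c for p, c in counts.items() if c >= s}
    all_frequent = dict(frequent)
    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-patterns into k-patterns.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Keep a candidate only if all of its (k-1)-subsets are frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Support counting: evaluate each candidate against the database.
        frequent = {}
        for c in candidates:
            count = sum(1 for t in db if c <= t)
            if count >= s:
                frequent[c] = count
        all_frequent.update(frequent)
        k += 1
    return all_frequent

db = [
    {"Milk", "Bread", "Beer", "Diaper"},
    {"Milk", "Beer"},
    {"Milk", "Bread", "Ice cream"},
    {"Bread", "Beer", "Diaper"},
]
patterns = apriori(db, 2)
```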

5
Pattern growth approach
  • Depth-first traversal
  • Equivalence classes
  • Seed pattern
  • Growing method
  • Apriori Principle
  • For pruning

[Figure: search space traversal; node legend: seed, frequent, infrequent, pruned]
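A depth-first traversal over equivalence classes can be sketched as below. This is one common realization for itemsets (a vertical, transaction-id-set layout with prefix-based classes), chosen here for brevity; the slides do not say which growing method the talk assumes, and `mine_depth_first` is a hypothetical name.

```python
def mine_depth_first(db, s):
    """Depth-first pattern growth over prefix-based equivalence classes."""
    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets = {}
    for tid, t in enumerate(db):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    # Seed patterns: the frequent single items, in a fixed order.
    seeds = sorted(((i, tids) for i, tids in tidsets.items() if len(tids) >= s),
                   key=lambda pair: pair[0])
    result = {}

    def grow(prefix, members):
        # `members` holds the frequent one-item extensions of `prefix`;
        # together they form one equivalence class.
        for idx, (item, tids) in enumerate(members):
            pattern = prefix + (item,)
            result[pattern] = len(tids)
            children = []
            for other, other_tids in members[idx + 1:]:
                shared = tids & other_tids
                # Apriori principle: an infrequent pattern cannot have a
                # frequent superset, so the branch is pruned here.
                if len(shared) >= s:
                    children.append((other, shared))
            grow(pattern, children)

    grow((), seeds)
    return result

db = [
    {"Milk", "Bread", "Beer", "Diaper"},
    {"Milk", "Beer"},
    {"Milk", "Bread", "Ice cream"},
    {"Bread", "Beer", "Diaper"},
]
patterns = mine_depth_first(db, 2)
```

Each recursive call works only on the members of its own class, which is what makes the classes natural units of parallel work.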
6
Parallelization
  • Coarse-grain: equivalence class level
  • Load balancing
  • Fine-grain: each pattern level
  • High overhead

[Figure: parallelizing equivalence classes across processors P1 to P6]
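Coarse-grain parallelization can be sketched by treating each equivalence class as one independent task. The example below is an assumption-laden illustration: it uses Python threads only to show the task decomposition (CPython threads will not give a real speedup for this CPU-bound work), and the helper names `frequent_seeds`, `mine_class`, and `mine_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def frequent_seeds(db, s):
    """Frequent single items with their transaction-id sets, in a fixed order."""
    tidsets = {}
    for tid, t in enumerate(db):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    return sorted(((i, t) for i, t in tidsets.items() if len(t) >= s),
                  key=lambda pair: pair[0])

def mine_class(seed_index, seeds, s):
    """Mine the class rooted at one seed; it shares no state with other classes."""
    found = {}

    def grow(prefix, tids, siblings):
        found[prefix] = len(tids)
        extensions = [(o, tids & o_tids) for o, o_tids in siblings]
        extensions = [(o, shared) for o, shared in extensions if len(shared) >= s]
        for i, (o, shared) in enumerate(extensions):
            grow(prefix + (o,), shared, extensions[i + 1:])

    item, tids = seeds[seed_index]
    grow((item,), tids, seeds[seed_index + 1:])
    return found

def mine_parallel(db, s, workers=3):
    seeds = frequent_seeds(db, s)
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per equivalence class; an idle worker takes the next class.
        for found in pool.map(lambda i: mine_class(i, seeds, s),
                              range(len(seeds))):
            result.update(found)
    return result

db = [
    {"Milk", "Bread", "Beer", "Diaper"},
    {"Milk", "Beer"},
    {"Milk", "Bread", "Ice cream"},
    {"Bread", "Beer", "Diaper"},
]
patterns = mine_parallel(db, 2)
```

On this toy data with s = 2 the four classes contain 5, 3, 1, and 1 patterns, a small instance of the uneven task sizes behind the load-balancing concern.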
7
Issues
  • Considering only shared-memory systems
  • Load imbalance
  • In case of coarse-grain parallelization
  • Speculative Execution
  • Large working sets
  • In case of CMPs
  • Algorithmic rather than architectural solutions

8
Related Work
  • Thread-Level Speculation (TLS)
  • Mitosis compiler (PLDI 2005)
  • Speculative thread level parallelism (SpMT)
  • Speculative threads
  • Executes a part of the original application
  • When finished, speculation is verified
  • Determines the spawning points
  • Loops, basic blocks, subroutines etc.
  • Inter-thread dependencies
  • Synchronization
  • Value prediction using pre-computation slices (p-slices)
  • Compute p-slices using speculative optimizations
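The speculate-then-verify idea can be illustrated at the software level with a toy sketch. This is only an analogy: in the TLS systems listed above, spawning and verification are handled by the compiler and hardware, not by library calls. The function `run_speculatively` and its parameters `producer`, `consumer`, and `predict` are hypothetical names introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def run_speculatively(producer, consumer, predict):
    """Run `consumer` on a predicted input while `producer` computes the real one.

    Returns (result, speculation_succeeded).
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        guess = predict()                            # value prediction
        real_future = pool.submit(producer)          # non-speculative work
        spec_future = pool.submit(consumer, guess)   # speculative thread
        real = real_future.result()
        if real == guess:
            # Speculation verified: the speculative result is kept.
            return spec_future.result(), True
        # Misprediction: the speculative result is discarded and the
        # dependent work is re-executed with the real value.
        return consumer(real), False

hit = run_speculatively(lambda: 4, lambda x: x * x, lambda: 4)
miss = run_speculatively(lambda: 4, lambda x: x * x, lambda: 5)
```

A correct prediction overlaps the two pieces of work; a wrong one costs the wasted speculative execution plus the re-execution.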

9
Related Work
  • POSH compiler (PPoPP 2006)
  • Fully automated TLS compiler
  • Three phases
  • Task selection
  • based on program substructures
  • value prediction
  • Spawn hoisting
  • Task refinement
  • to improve the quality of chosen tasks
  • through profiling

Parallelism
Pre-fetching
10
Speculative Execution (1)
[Figure: processors P1, P2, P3; legend: task to be executed, executed task]
11
Speculative Execution (2)
  • Performance improvement: high, marginal, none, or negative
  • Depends on dataset characteristics
  • Characterization
  • Application in data mining
  • Speculatively execute future iterations
  • Tracking followed by classification

12
Large Working Sets
  • Support Counting
  • Evaluate a candidate pattern
  • Scan a subset of database transactions
  • Working set can potentially be large
  • CMP (vs SMP)
  • Less cache memory available to each processor
  • Less main memory shared by all processors

The working set has to be reduced to limit bandwidth utilization
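One algorithmic way to shrink the working set during support counting is to process the database in blocks and evaluate all candidates against a block before moving on. The sketch below is an assumption, not a technique named on the slides; Python gives no control over caches, so it illustrates only the access order. The names `count_support_naive`, `count_support_blocked`, and the parameter `block_size` are hypothetical.

```python
def count_support_naive(db, candidates):
    """One pass over the whole database per candidate."""
    return {c: sum(1 for t in db if c <= t) for c in candidates}

def count_support_blocked(db, candidates, block_size=2):
    """Single pass in blocks: each block is reused for every candidate
    before the next block is touched."""
    counts = {c: 0 for c in candidates}
    for start in range(0, len(db), block_size):
        block = db[start:start + block_size]   # the current working set
        for c in candidates:
            counts[c] += sum(1 for t in block if c <= t)
    return counts

db = [
    {"Milk", "Bread", "Beer", "Diaper"},
    {"Milk", "Beer"},
    {"Milk", "Bread", "Ice cream"},
    {"Bread", "Beer", "Diaper"},
]
candidates = [frozenset({"Milk"}), frozenset({"Bread", "Beer"}),
              frozenset({"Beer", "Diaper"})]
naive = count_support_naive(db, candidates)
blocked = count_support_blocked(db, candidates)
```

Both functions return the same counts; the blocked version touches each transaction block once, so the data that must stay resident at any moment is one block rather than the full set of transactions scanned per candidate.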