Title: Parallelizing FP-growth Frequent Patterns Mining Algorithm Using OpenMP
Parallel Computing and Data Mining
- Performance issue in data mining: scalability
- Parallel computing: what is it?
- With parallel computing, a parallel program can either
  - decrease the runtime needed to solve a problem, or
  - increase the size of the problem that can be solved.
- Parallel computing gives you more performance to throw at your problems.
- Data mining problems can benefit from parallel computing:
  - intensive computation
  - mining under time constraints
FP-Growth algorithm: brief review
- Mining frequent patterns without candidate generation
- Frequent pattern tree (FP-tree)
- FP-tree based pattern growth mining method

    main()
      (1) find frequent 1-items
      (2) build a global FP-tree from the transaction database
      (3) fptree_mining()        // recursively mine the FP-tree
FP-Growth sequential algorithm
- Procedure fptree_mining()
- Input: FP-tree, item_table (frequent 1-item set), cond

    (1)  for each item a in the frequent item_table do
    (2)    if the tree at this level contains only a single path
    (3)    then branch_mine()    // generate all patterns along the path
    (4)    else
    (5)      cond_table = conditional_table_construct()
             // construct a conditional transaction table for the condition a
             // and get the frequent 1-item set in cond_item_table
    (6)      cond_fptree = fptree_conditional_build(cond_table)
             // build a conditional FP-tree from the conditional table
    (7)      cond = cond + a     // pattern growth by appending a
    (8)      fptree_mining(cond_fptree, cond_item_table, cond, condpt1)
             // recursively mine the conditional FP-tree
    (9)    end if
    (10) end for
OpenMP programming model
- Multi-threaded programming model
- Fork-join parallelism (see the sketch below)
  - The master thread spawns a team of threads.
- Parallelism is added incrementally, i.e. the sequential program evolves into a parallel program.
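As an illustration of the fork-join model, a minimal OpenMP "hello" program in C might look like the following (this is a sketch for illustration, not code from the slides):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        printf("master thread, before the fork\n");

        /* Fork: the master thread spawns a team of threads here. */
        #pragma omp parallel
        {
            printf("hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }   /* Join: the team synchronizes and only the master continues. */

        printf("master thread, after the join\n");
        return 0;
    }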
OpenMP: how to parallelize?
- OpenMP is usually used to parallelize loops:
  - Find the most time-consuming loops.
  - Split them up between threads.
- Simple example: split up this loop between multiple threads.

  Sequential program:

    int main() {
        double Res[1000];
        for (int i = 0; i < 1000; i++)
            compute(Res[i]);
    }

  Parallel program:

    int main() {
        double Res[1000];
        #pragma omp parallel for
        for (int i = 0; i < 1000; i++)
            compute(Res[i]);
    }
OpenMP synchronization
- OpenMP is a shared memory model.
- Threads communicate by sharing variables.
- Unintended sharing of data can lead to race conditions:
  - the program's outcome changes as the threads are scheduled differently (nondeterminism).
- To control race conditions, use synchronization to protect data conflicts (see the sketch below).
- Synchronization is expensive, so change how data is stored to minimize the need for it.
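A minimal sketch of a race condition and its fix, assuming a shared counter updated by every iteration (the counter and loop bound are illustrative):

    #include <stdio.h>

    int main(void) {
        long count = 0;

        #pragma omp parallel for
        for (long i = 0; i < 1000000; i++) {
            /* Without protection, count++ is a race: the read-modify-write
               of the shared variable can interleave between threads, and
               the final value becomes nondeterministic. */
            #pragma omp atomic
            count++;
        }

        printf("count = %ld\n", count);  /* deterministically 1000000 */
        return 0;
    }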
Parallelization of FP-Growth
- Task: parallelize the for loop -> parallel for
- Parallelizable: conceptually, each iteration is a subtask and can execute in any order.
- But it is not ready yet, due to race conditions on global variables.
- Two strategies to handle the race conditions (a sketch follows this list):
  - Privatize
    - The condition pattern variable is passed as an argument to the recursive function but used only within one iteration.
  - Critical section / atomic
    - The global frequent counter array is updated concurrently by all threads; use synchronization to prevent conflicts.
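A hedged sketch of how the two strategies might look in the mining loop; the names (freq_count, mine_item, NUM_ITEMS) are illustrative placeholders, not the program's actual identifiers:

    #include <omp.h>

    #define NUM_ITEMS 1024

    int freq_count[NUM_ITEMS];    /* global frequent counter array, shared */

    /* Stand-in for the recursive mining of one frequent item; it writes
       only to the caller's private pattern buffer and reports one item id
       whose global counter should be bumped. */
    static int mine_item(int item, int *cond_pattern) {
        cond_pattern[0] = item;   /* pattern growth on private data */
        return item;
    }

    int main(void) {
        #pragma omp parallel for
        for (int item = 0; item < NUM_ITEMS; item++) {
            /* Strategy 1, privatize: the condition pattern is declared
               inside the loop body, so each iteration gets its own copy
               and no other thread can touch it. */
            int cond_pattern[64];
            int hit = mine_item(item, cond_pattern);

            /* Strategy 2, synchronize: the global counter array is updated
               concurrently by all threads, so each update is protected. */
            #pragma omp atomic
            freq_count[hit]++;
        }
        return 0;
    }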
Load balance
- Load imbalance problem:
  - the work load in each iteration can vary dramatically.
- Dynamic thread scheduling in OpenMP (see the sketch below):
  - a thread executes a chunk of iterations and waits for another assignment from the work pool.
  - Compiler support!
- Better load balancing if the iterations are sorted by work load (as a hint).
[Figure: a shared work pool feeding iteration chunks to threads 0-3]
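A minimal sketch of dynamic scheduling in OpenMP; the chunk size of 1 and the dummy workload are illustrative choices:

    #include <stdio.h>

    /* Dummy per-iteration work whose cost grows with i, mimicking the
       dramatic variation in per-item mining cost. */
    static void process(int i) {
        volatile long x = 0;
        for (long k = 0; k < (long)i * 100000; k++)
            x++;
    }

    int main(void) {
        /* schedule(dynamic, 1): each thread takes one iteration from the
           shared work pool and comes back for another as soon as it is
           done, so one long iteration does not stall the whole team. */
        #pragma omp parallel for schedule(dynamic, 1)
        for (int i = 0; i < 100; i++)
            process(i);

        printf("all iterations done\n");
        return 0;
    }

Dynamic scheduling pairs well with the sorting hint above: if the heaviest iterations are handed out first, no thread is left holding a big chunk at the end.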
GuideView screenshots
Optimization: reducing synchronization overhead
- Remove synchronization with a reduction on an array:
  - a reduction with the SUM operation over an array.
  - However, OpenMP's reduction clause supports only scalar variables.
- Manually implement the array-based reduction (see the sketch below):
  - manually allocate a counter array for each thread;
  - each thread works on its own copy of the array;
  - sum the copies up when the loop finishes.
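A minimal sketch of this manual array reduction, with illustrative names (global_count, count_items) and a fixed item range. Note that later OpenMP versions (4.5 and up) do allow reductions over C array sections, but the manual scheme below matches what the slides describe:

    #include <string.h>

    #define NUM_ITEMS 1024

    int global_count[NUM_ITEMS];   /* final frequent counter array */

    static void count_items(const int *items, int n) {
        #pragma omp parallel
        {
            /* Each thread allocates its own private copy of the counter
               array, so the counting loop needs no synchronization. */
            int local_count[NUM_ITEMS];
            memset(local_count, 0, sizeof local_count);

            #pragma omp for nowait
            for (int i = 0; i < n; i++)
                local_count[items[i]]++;

            /* Sum the private copies into the global array once the loop
               finishes; only this short merge step needs protection. */
            #pragma omp critical
            for (int j = 0; j < NUM_ITEMS; j++)
                global_count[j] += local_count[j];
        }
    }

    int main(void) {
        int items[] = {1, 2, 2, 3, 5, 5, 5, 7};
        count_items(items, 8);
        return 0;
    }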
GuideView screenshots after optimization
Speedup
Future work
- The mining results (frequent patterns) can be stored in a compressed tree structure.
- Each thread maintains its own private compressed tree to store result patterns.
- Combine these trees (sum them up) when mining finishes.
Conclusion
- A new parallelized FP-Growth program:
  - locks/synchronization removed from the parallel region;
  - good parallel speedup and parallel efficiency.
- A good data mining algorithm designed with a divide-and-conquer strategy is easier to parallelize.
- OpenMP is efficient for parallelizing data mining algorithms on SMP machines; its syntax for array privatization and reduction could be improved.
Acknowledgements
- Intel Threading Tools (Guide and GuideView), used under license from Intel.
- Help from the Intel Parallel Applications Center in performance tuning.