Frequent Itemset Mining on Graphics Processors - PowerPoint PPT Presentation

1
Frequent Itemset Mining on Graphics Processors
  • Wenbin Fang, Mian Lu, Xiangye Xiao, Bingsheng He¹, Qiong Luo
  • Hong Kong Univ. of Sci. and Tech.
  • ¹Microsoft Research Asia

Presenter: Wenbin Fang
2
Outline
  • Contribution
  • Introduction
  • Design
  • Evaluation
  • Conclusion

2/33
3
Contribution
  • Accelerate the Apriori algorithm for Frequent
    Itemset Mining using Graphics Processors (GPUs).
  • Two GPU implementations
  • Pure Bitmap-based implementation (PBI)
    processing entirely on the GPU.
  • Trie-based implementation (TBI) GPU/CPU
    co-processing.

3/33
4
Frequent Itemset Mining (FIM)
Finding groups of items, or itemsets, that
co-occur frequently in a transaction database.
Minimum support = 2
1-itemsets (frequent items): A: 3, B: 2, C: 3, D: 4
4/33
5
Frequent Itemset Mining (FIM)
Aims at finding groups of items, or itemsets, that
co-occur frequently in a transaction database.
Minimum support = 2
1-itemsets (frequent items): A, B, C, D
2-itemsets: AB: 2, AC: 2, AD: 3, BD: 2, CD: 3
5/33
6
Frequent Itemset Mining (FIM)
Aims at finding groups of items, or itemsets, that
co-occur frequently in a transaction database.
Minimum support = 2
1-itemsets (frequent items): A, B, C, D
2-itemsets: AB, AC, AD, BD, CD
3-itemsets: ABD, ACD
6/33
7
Graphics Processors (GPUs)
  • Exist in commodity machines, mainly for graphics
    rendering.
  • Specialized for compute-intensive, highly data
    parallel apps.
  • Compared with CPUs, GPUs provide 10x the
    computational horsepower and 10x the memory
    bandwidth.

[Figure: CPU vs. GPU architecture, from the NVIDIA CUDA Programming Guide.]
7/33
8
Programming on GPUs
  • OpenGL/DirectX
  • AMD CTM
  • NVIDIA CUDA

SIMD parallelism (Single Instruction, Multiple Data).
[Figure: the many-core architecture model of the GPU.]
8/33
9
Hierarchical multi-threading in NVIDIA CUDA
[Figure: a grid of thread blocks, each containing multiple warps.]
A warp = 32 GPU threads, the SIMD scheduling unit.
Configurable: # of threads in a thread block, # of thread blocks.
9/33
10
General Purpose GPU Computing (GPGPU)
  • Applications utilizing GPUs
  • Scientific computing
  • Molecular Dynamics Simulation
  • Weather forecasting
  • Linear algebra
  • Computational finance
  • Folding@home, SETI@home
  • Database applications
  • Basic DB operators [SIGMOD'04]
  • Sorting [SIGMOD'06]
  • Join [SIGMOD'08]

10/33
11
Our work
  • As a first step, we consider GPU-based Apriori,
    with the intention of extending to another
    efficient FIM algorithm, FP-growth.
  • Why Apriori?
  • A classic algorithm for mining frequent itemsets.
  • Also applied in other data mining tasks, e.g.,
    clustering and functional dependency discovery.

11/33
12
The Apriori Algorithm
Input: 1) Transaction database 2) Minimum support
Output: All frequent itemsets

L1 = all frequent 1-itemsets
k = 2
while (Lk-1 != empty):
    // Generate candidate k-itemsets
    Ck <- self-join on Lk-1
    Ck <- (k-1)-subset test on Ck
    // Generate frequent k-itemsets
    Lk <- support counting on Ck
    k += 1
Frequent 1-itemsets
Candidate 2-itemsets
Frequent 2-itemsets
Candidate 3-itemsets
Frequent 3-itemsets

Candidate (K-1)-itemsets
Frequent (K-1)-itemsets
Candidate K-itemsets
Frequent K-itemsets
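The loop above can be sketched in plain Python (a CPU sketch for illustration, not the paper's GPU implementation; itemsets are frozensets and `minsup` is an absolute count):

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Plain-CPU sketch of the Apriori loop on this slide.

    transactions: list of sets of items; minsup: absolute support threshold.
    """
    # L1: frequent 1-itemsets
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    frequent = {frozenset([i]) for i, c in counts.items() if c >= minsup}
    all_frequent = set(frequent)
    k = 2
    while frequent:
        # Self-join: unions of (k-1)-itemsets that form a k-itemset
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        # (k-1)-subset test: prune candidates with an infrequent subset
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Support counting over the transaction database
        frequent = {c for c in candidates
                    if sum(c <= t for t in transactions) >= minsup}
        all_frequent |= frequent
        k += 1
    return all_frequent
```

With minimum support 2 on a database matching the earlier example slides, this yields the frequent 1-, 2-, and 3-itemsets shown there (A..D, AB..CD, ABD, ACD).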
12/33
13
Outline
  • Contribution
  • Introduction
  • Design
  • Evaluation
  • Conclusion

13/33
14
GPU-based Apriori
Input: 1) Transaction database 2) Minimum support
Output: All frequent itemsets

Both implementations run the same Apriori loop:

L1 = all frequent 1-itemsets
k = 2
while (Lk-1 != empty):
    // Generate candidate k-itemsets
    Ck <- self-join on Lk-1
    Ck <- (k-1)-subset test on Ck
    // Generate frequent k-itemsets
    Lk <- support counting on Ck
    k += 1

Pure Bitmap-based Impl. (PBI): itemsets as a bitmap, candidate generation on the GPU; transactions as a bitmap, support counting on the GPU.
Trie-based Impl. (TBI): itemsets as a trie, candidate generation on the CPU; transactions as a bitmap, support counting on the GPU.
14/33
15
Horizontal and Vertical data layout
Horizontal data layout: support counting scans all transactions.
Vertical data layout: support counting is done on specific itemsets:
  • Intersect two transaction lists.
  • Count the number of transactions in the intersection result.
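A minimal sketch of the vertical layout, using a hypothetical four-transaction database chosen to be consistent with the counts on the earlier example slides (T1 = ABD, T2 = ABCD, T3 = ACD, T4 = CD):

```python
# Vertical layout: each item maps to the set of transaction ids (tids)
# that contain it. The support of an itemset is the size of the
# intersection of its items' tid lists.
tidlists = {
    "A": {1, 2, 3},
    "B": {1, 2},
    "C": {2, 3, 4},
    "D": {1, 2, 3, 4},
}

def support(itemset):
    # Intersect the tid lists, then count the surviving transactions.
    common = set.intersection(*(tidlists[i] for i in itemset))
    return len(common)
```

For example, support(["A", "D"]) intersects {1, 2, 3} with {1, 2, 3, 4}, giving support 3 for AD, matching the example slides.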

15/33
16
Bitmap representation for transactions
[Figure: bitmap with one row per itemset (# of itemsets) and one column per transaction (# of transactions).]
Intersection = bitwise AND operation.
Counting of 1s in a string of bits.
16/33
17
Lookup table
[Figure: counts of 1s in the transaction bitmap are read from the lookup table.]
# of 1s: TABLE[12] = 2 // decimal 12 = binary 1100 (a string of bits)
Each entry is 1 byte; 2^16 = 65536 entries.
  • Constant memory
  • Cacheable
  • 64 KB
  • Shared by all GPU threads
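A small sketch of the 16-bit popcount lookup table described above, with plain Python standing in for GPU constant memory:

```python
# 16-bit popcount lookup table: 2**16 = 65536 one-byte entries (64 KB),
# matching the size of the GPU's cacheable constant memory.
TABLE = bytes(bin(v).count("1") for v in range(1 << 16))

# Example from the slide: TABLE[12] = 2, since decimal 12 is binary 1100,
# which contains two 1 bits.

def popcount32(x):
    # Count the 1s in a 32-bit word with two 16-bit table lookups.
    return TABLE[x & 0xFFFF] + TABLE[(x >> 16) & 0xFFFF]
```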
17/33
18
Support Counting on the GPU (Cont.)
[Figure: thread blocks 1 and 2 each process one candidate itemset, counting 1s through the LOOKUP TABLE; count = 2.]
  • Intersect two transaction lists.
  • Count the number of transactions in the intersection result.
18/33
19
Support Counting on the GPU (Cont.)
19/33
[Figure: one thread block computes the support of candidate itemset ABD.
Threads 1 and 2 AND the transaction bitmaps of AB and AD word by word,
using the vector type int4 to access four ints in one instruction. The
count of 1s for every 16-bit integer comes from the LOOKUP TABLE, and a
parallel reduce sums the per-thread counts into the support of the
itemset: support = 2.]
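The per-itemset work in the figure (bitwise AND of two bitmaps, table-based counting of 1s, then a sum) can be sketched in plain Python; a sequential loop stands in for the GPU's parallel threads and reduce:

```python
# Support counting for a candidate itemset: AND the transaction bitmaps of
# two of its (k-1)-subsets (e.g., AB and AD), then count the 1 bits in the
# result with the 16-bit lookup table. On the GPU, threads AND chunks of
# words in parallel and a parallel reduce sums the per-thread counts.
TABLE = bytes(bin(v).count("1") for v in range(1 << 16))

def support_count(bitmap_a, bitmap_b):
    """bitmap_a, bitmap_b: lists of 32-bit words, one bit per transaction."""
    total = 0
    for wa, wb in zip(bitmap_a, bitmap_b):
        w = wa & wb  # bitwise AND = intersection of transaction lists
        total += TABLE[w & 0xFFFF] + TABLE[(w >> 16) & 0xFFFF]
    return total
```

For instance, with AB present in transactions {1, 2} (word 0b0011) and AD in {1, 2, 3} (word 0b0111), the AND is 0b0011 and the support of ABD is 2, as on the slide.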
20
GPU-based Apriori
Input: 1) Transaction database 2) Minimum support
Output: All frequent itemsets

L1 = all frequent 1-itemsets
k = 2
while (Lk-1 != empty):
    // Generate candidate k-itemsets: join + subset test
    // Generate frequent k-itemsets: support counting
    k += 1

  • Candidate Generation
  • Join
  • e.g., join two 2-itemsets to obtain a candidate 3-itemset:
    AC JOIN AD => ACD
  • Subset test
  • e.g., test all 2-subsets of ACD: AC, AD, CD
Support Counting on the GPU
20/33
21
GPU-based Apriori
Input: 1) Transaction database 2) Minimum support
Output: All frequent itemsets

Both implementations run the same Apriori loop; this slide highlights candidate generation in PBI.

Pure Bitmap-based Impl. (PBI): itemsets as a bitmap, candidate generation on the GPU; transactions as a bitmap, support counting on the GPU.
Trie-based Impl. (TBI): itemsets as a trie, candidate generation on the CPU; transactions as a bitmap, support counting on the GPU.
21/33
22
Pure Bitmap-based Impl. (PBI)
[Figure: itemset bitmap, one row per itemset (# of itemsets), one column per item (# of items).]
  • Bitwise OR in join (e.g., AB JOIN AD = ABD).
  • Binary search in subset test (e.g., 2-subsets of ABD: AB, AD, BD).
One GPU thread generates one candidate itemset.
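A CPU sketch of PBI-style candidate generation, with Python ints standing in for itemset bitmaps (item A = bit 0, B = bit 1, and so on; `generate_candidates` is a hypothetical helper name, not from the paper):

```python
from bisect import bisect_left

# PBI candidate generation sketch: each itemset is a bitmap over the items;
# joining two (k-1)-itemsets is a bitwise OR, and the (k-1)-subset test is
# a binary search in the sorted list of frequent (k-1)-itemset bitmaps.
def generate_candidates(frequent, k):
    """frequent: sorted list of (k-1)-itemset bitmasks (ints)."""
    def is_frequent(mask):
        i = bisect_left(frequent, mask)  # binary search
        return i < len(frequent) and frequent[i] == mask

    candidates = set()
    for i, a in enumerate(frequent):
        for b in frequent[i + 1:]:
            c = a | b                     # join = bitwise OR
            if bin(c).count("1") != k:    # must be exactly a k-itemset
                continue
            # Subset test: every (k-1)-subset (drop one set bit) is frequent
            if all(is_frequent(c & ~(1 << j))
                   for j in range(c.bit_length()) if c >> j & 1):
                candidates.add(c)
    return sorted(candidates)
```

On the running example (frequent 2-itemsets AB, AC, AD, BD, CD) this produces the candidate 3-itemsets ABD and ACD, and prunes ABC and BCD because BC is infrequent.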
22/33
23
GPU-based Apriori
Input: 1) Transaction database 2) Minimum support
Output: All frequent itemsets

Both implementations run the same Apriori loop; this slide highlights candidate generation in TBI.

Pure Bitmap-based Impl. (PBI): itemsets as a bitmap, candidate generation on the GPU; transactions as a bitmap, support counting on the GPU.
Trie-based Impl. (TBI): itemsets as a trie, candidate generation on the CPU; transactions as a bitmap, support counting on the GPU.
23/33
24
Trie-based Impl. (TBI)
[Figure: itemset trie. Depth 0: root. Depth 1: 1-itemsets A, B, C, D.
Depth 2: 2-itemsets AB, AC, AD, BD, CD.]
Joins of sibling 2-itemsets:
  • AB JOIN AC = ABC, pruned: its 2-subsets are AB, AC, BC, and BC is infrequent.
  • AB JOIN AD = ABD: its 2-subsets AB, AD, BD are all frequent.
  • AC JOIN AD = ACD: its 2-subsets AC, AD, CD are all frequent.
Candidate 3-itemsets: ABD, ACD.
Done on the CPU because of 1) irregular memory access and 2) branch divergence.
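The sibling joins above can be sketched on the CPU with sorted tuples standing in for trie paths (a simplification of the actual trie; `trie_join` is a hypothetical helper name):

```python
# TBI candidate generation sketch: frequent (k-1)-itemsets as sorted
# tuples. Siblings share all items but the last, mirroring joins under a
# common parent node in the trie.
def trie_join(frequent_k1):
    """frequent_k1: sorted list of (k-1)-tuples of items."""
    fset = set(frequent_k1)
    candidates = []
    for i, a in enumerate(frequent_k1):
        for b in frequent_k1[i + 1:]:
            if a[:-1] != b[:-1]:   # siblings share all but the last item
                break
            c = a + (b[-1],)       # join: extend a with b's last item
            # Prune unless every (k-1)-subset is frequent
            if all(c[:j] + c[j + 1:] in fset for j in range(len(c))):
                candidates.append(c)
    return candidates
```

On the frequent 2-itemsets AB, AC, AD, BD, CD this yields ABD and ACD, pruning ABC because BC is infrequent.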
24/33
25
Outline
  • Contribution
  • Introduction
  • Design
  • Evaluation
  • Conclusion

25/33
26
Experimental setup
26/33
[Table: platform configuration.]
[Table: experimental datasets: density, average transaction length, # of items.]
27
Apriori Implementations
Best Apriori implementation in the FIMI repository
(Frequent Itemset Mining Implementations Repository).
27/33
28
TBI-CPU vs GOETHALS
Dense dataset: Chess. Sparse dataset: Retail.
The impact of using a bitmap representation for
transactions in support counting.
Speedup: 1.2x to 25.7x.
28/33
29
TBI-GPU vs TBI-CPU
Dense dataset: Chess. Sparse dataset: Retail.
The impact of GPU acceleration in support counting.
Speedup: 1.1x to 7.8x.
29/33
30
PBI-GPU vs TBI-GPU
Dense dataset: Chess. Sparse dataset: Retail.
The impact of bitmap-based vs. trie-based itemsets
in candidate generation.
PBI-GPU is faster on the dense dataset; TBI-GPU is
better on the sparse dataset.
30/33
31
PBI-GPU/TBI-CPU vs BORGELT
Dense dataset: Chess. Sparse dataset: Retail.
Comparison to the best Apriori implementation in FIMI.
Speedup: 1.2x to 24.2x.
31/33
32
Comparison to FP-growth
With minsup = 1, 60, and 0.01, respectively.
FP-growth implementation from the PARSEC benchmark.
32/33
33
Conclusion
  • GPU-based Apriori
  • Pure Bitmap-based impl. (PBI)
    • Bitmap representation for itemsets.
    • Bitmap representation for transactions.
    • GPU processing.
  • Trie-based impl. (TBI)
    • Trie representation for itemsets.
    • Bitmap representation for transactions.
    • GPU/CPU co-processing.
  • Better than CPU-based Apriori.
  • Still worse than CPU-based FP-growth.

33/33
34
Backup Slide
Time Breakdown
Time breakdown on dense dataset Chess
Time breakdown on sparse dataset Retail