Title: Ex-MATE: Data-Intensive Computing with Large Reduction Objects and Its Application to Graph Mining
1. Ex-MATE: Data-Intensive Computing with Large Reduction Objects and Its Application to Graph Mining
- Wei Jiang and Gagan Agrawal
2. Outline
- Background
- System Design of Ex-MATE
- Parallel Graph Mining with Ex-MATE
- Experiments
- Related Work
- Conclusion
4. Background (I)
- Map-Reduce
- Simple API: map and reduce
- Easy to write parallel programs
- Fault-tolerant for large-scale data centers
- Performance? Always a concern for the HPC community
- Generalized Reduction
- First proposed in FREERIDE, developed at Ohio State (2001-2003)
- Shares a similar processing structure
- The key difference lies in a programmer-managed reduction object
- Better performance?
5. Map-Reduce Execution
6. Comparing Processing Structures
- The reduction object represents the intermediate state of the execution
- The reduce function is commutative and associative
- Sorting, grouping, and similar overheads are eliminated with the reduction function/object
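The processing structure above can be sketched in Python. This is an illustrative stand-in, not the FREERIDE/MATE API: the function names (`generalized_reduction`, `reduce_fn`, `combine_fn`) are ours, and word count is used only as a familiar example reduction.

```python
# Sketch of the generalized-reduction structure: each data split is
# folded directly into a local reduction object; local objects are then
# merged with a commutative/associative combine step, so no intermediate
# (key, value) pairs are ever sorted or grouped.

def generalized_reduction(data_splits, reduce_fn, combine_fn):
    local_objects = []
    for split in data_splits:
        ro = {}  # local reduction object
        for record in split:
            reduce_fn(ro, record)  # accumulate in place
        local_objects.append(ro)
    # Global combination: order does not matter because the
    # reduction is commutative and associative.
    result = {}
    for ro in local_objects:
        combine_fn(result, ro)
    return result

# Example reduction: word count, with a dict as the reduction object.
def reduce_fn(ro, word):
    ro[word] = ro.get(word, 0) + 1

def combine_fn(dst, src):
    for k, v in src.items():
        dst[k] = dst.get(k, 0) + v

splits = [["a", "b", "a"], ["b", "c"]]
counts = generalized_reduction(splits, reduce_fn, combine_fn)
```

Because the reduction object is updated in place, the memory footprint stays proportional to the state, not to the number of input records.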
7. Our Previous Work
- A comparative study between FREERIDE and Hadoop
- FREERIDE outperformed Hadoop by factors of 5 to 10
- Possible reasons: Java vs. C? HDFS overheads? Inefficiency of Hadoop? API differences?
- Developed MATE (Map-reduce system with an AlternaTE API) on top of Phoenix from Stanford
- Adopted Generalized Reduction
- Focused on API differences
- MATE improved on Phoenix by an average of 50%
- Avoids the large set of intermediate pairs between Map and Reduce
- Reduces memory requirements
8. Extending MATE
- Main issues of the original MATE:
- Only works on a single multi-core machine
- Datasets must reside in memory
- Assumes the reduction object MUST fit in memory
- This paper extends MATE to address these limitations
- Focus on graph mining, an emerging class of applications
- Requires large reduction objects as well as large-scale datasets
- E.g., PageRank could have an 8GB reduction object!
- Supports managing arbitrary-sized reduction objects
- Also reads disk-resident input data
- Evaluated Ex-MATE against PEGASUS
- PEGASUS: a Hadoop-based graph mining system
10. System Design and Implementation
- System design of Ex-MATE
- Execution overview
- Support for distributed environments
- System APIs in Ex-MATE
- One set provided by the runtime: operations on reduction objects
- Another set defined or customized by the users: reduction, combination, etc.
- Runtime in Ex-MATE
- Data partitioning
- Task scheduling
- Other low-level details
11. Ex-MATE Runtime Overview
- Basic one-stage execution
12. Implementation Considerations
- Support for processing very large datasets
- Partitioning function: partition and distribute data to a number of nodes
- Splitting function: use the multi-core CPU on each node
- Management of a large reduction object (R.O.)
- Reduce disk I/O!
- Outputs (the R.O.) are updated in a demand-driven way
- Partition the reduction object into splits
- Inputs are reorganized based on data access patterns
- Reuse an R.O. split as much as possible while it is in memory
- Example: Matrix-Vector Multiplication
13. A MV-Multiplication Example
[Figure: blocked matrix-vector multiplication; input matrix blocks (1,1), (1,2), (2,1), ... are combined with the corresponding input-vector splits to update the output-vector splits]
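The R.O.-split strategy from slide 12 can be sketched for matrix-vector multiplication. This is an illustrative assumption of the technique, not Ex-MATE's actual code: the input matrix is pre-grouped by output split ("reorganized by data access pattern"), so each output-vector split is loaded once, fully updated in memory, and written back once.

```python
# Sketch: the output vector is the reduction object, partitioned into
# splits; matrix entries are grouped by the output split they update.

def blocked_matvec(matrix_blocks, vector, n, split_size):
    """matrix_blocks: dict mapping output-split id -> list of
    (row, col, value) entries whose rows fall in that split."""
    output = [0.0] * n
    for split_id, entries in matrix_blocks.items():
        lo = split_id * split_size
        ro_split = output[lo:lo + split_size]   # load one R.O. split
        for row, col, val in entries:
            ro_split[row - lo] += val * vector[col]  # reuse while in memory
        output[lo:lo + split_size] = ro_split   # write back once
    return output

# 4x4 matrix (identity scaled by 2), two output splits of size 2.
entries = [(i, i, 2.0) for i in range(4)]
blocks = {0: [e for e in entries if e[0] < 2],
          1: [e for e in entries if e[0] >= 2]}
result = blocked_matvec(blocks, [1.0, 2.0, 3.0, 4.0], 4, 2)
```

In the real system the splits would be disk-resident; the point of the grouping is that each split incurs at most one read and one write per pass.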
15. GIM-V for Graph Mining (I)
- Generalized Iterative Matrix-Vector Multiplication (GIM-V)
- First proposed at CMU
- Similar to the common MV multiplication
- Three operations in GIM-V:
- combine2: combine m(i,j) and v(j); does not have to be a multiplication
- combineAll: combine the n partial results for element i; does not have to be the sum
- assign: the previous value of v(i) is updated by the new value v(new)
- In standard MV multiplication, these are multiplication, sum, and assignment, respectively
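One GIM-V iteration with the three pluggable operations can be sketched as follows. The operation names `combine2`, `combine_all`, and `assign` follow the slide; the driver code around them is our illustrative assumption.

```python
# Minimal GIM-V sketch:
#   v'(i) = assign(v(i), combineAll_j(combine2(m(i,j), v(j))))

def gim_v(matrix, v, combine2, combine_all, assign):
    """matrix: dict mapping (i, j) -> m(i,j); v: dense list."""
    n = len(v)
    partials = {i: [] for i in range(n)}
    for (i, j), m_ij in matrix.items():
        partials[i].append(combine2(m_ij, v[j]))
    return [assign(v[i], combine_all(partials[i])) if partials[i] else v[i]
            for i in range(n)]

# Plugging in multiplication / sum / assignment recovers ordinary
# matrix-vector multiplication:
matrix = {(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
result = gim_v(matrix, [1.0, 1.0],
               combine2=lambda m, x: m * x,
               combine_all=sum,
               assign=lambda old, new: new)
```

Swapping the three callables is all it takes to express the graph algorithms on the following slides.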
16. GIM-V for Graph Mining (II)
- A set of graph mining applications can fit into GIM-V
- PageRank, Diameter Estimation, Finding Connected Components, Random Walk with Restart, etc.
- Parallelization of GIM-V
- Using Map-Reduce in PEGASUS: a two-stage algorithm (two consecutive map-reduce jobs)
- Using Generalized Reduction in Ex-MATE: a one-stage algorithm, simpler code
17. GIM-V Example: PageRank
- PageRank is used by Google to calculate the relative importance of web pages
- Direct implementation of GIM-V; v(j) is the ranking value
- The three customized operations are multiplication, sum, and assignment
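PageRank's instantiation of the three operations can be sketched as below. The damping factor c = 0.85, the uniform teleport term (1-c)/n folded into combineAll, and the toy two-page graph are all illustrative assumptions, not values from the slide.

```python
# PageRank as a GIM-V instance:
#   combine2(m, v)   = c * m * v            (the "multiplication")
#   combineAll(xs)   = (1 - c)/n + sum(xs)  (the "sum")
#   assign(old, new) = new                  (the "assignment")

def pagerank_iteration(matrix, v, c=0.85):
    """matrix: dict (i, j) -> column-normalized link weight."""
    n = len(v)
    partials = {i: [] for i in range(n)}
    for (i, j), m_ij in matrix.items():
        partials[i].append(c * m_ij * v[j])   # combine2
    return [(1 - c) / n + sum(partials[i])    # combineAll + assign
            for i in range(n)]

# Two pages linking to each other: ranks stay uniform at 1/n.
matrix = {(0, 1): 1.0, (1, 0): 1.0}
ranks = pagerank_iteration(matrix, [0.5, 0.5])
```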
18. GIM-V: Other Algorithms
- Diameter Estimation: HADI is an algorithm to estimate the diameter of a given graph
- The three customized operations are multiplication, bitwise-or, and bitwise-or
- Finding Connected Components: HCC is a new algorithm to find the connected components of large graphs
- The three customized operations are multiplication, minimal, and minimal
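The HCC choice of operations (multiplication / minimal / minimal) can be illustrated with a small sketch; the iterate-to-convergence driver and the undirected-edge representation are our assumptions. Each vertex starts with its own id and repeatedly takes the minimum id among itself and its neighbors.

```python
# HCC sketch: component labels converge to the minimum vertex id
# reachable in each connected component.

def hcc(edges, n):
    comp = list(range(n))  # initial component id = vertex id
    changed = True
    while changed:
        changed = False
        new_comp = comp[:]
        for i, j in edges:
            # combine2 selects the neighbor's current label;
            # combineAll and assign are both "minimal".
            m = min(comp[i], comp[j])
            if m < new_comp[i]:
                new_comp[i], changed = m, True
            if m < new_comp[j]:
                new_comp[j], changed = m, True
        comp = new_comp
    return comp

# Two components: {0, 1, 2} and {3, 4}.
labels = hcc([(0, 1), (1, 2), (3, 4)], 5)
```

The number of iterations is bounded by the graph diameter, which is why HCC and HADI share the same iterative GIM-V skeleton.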
19. Parallelization of GIM-V (I)
- Using Map-Reduce, Stage I
- Map: send M(i,j) and V(j) to reducer j
20. Parallelization of GIM-V (II)
- Using Map-Reduce, Stage I (cont.)
- Reduce: map combine2(M(i,j), V(j)) to reducer i
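The two-stage PEGASUS formulation (Stage I joins M and V on column j; Stage II groups the partial results by row i) can be written as a single-process Python stand-in for the two Hadoop jobs. All function and variable names here are illustrative; the shuffle is simulated with dicts.

```python
from collections import defaultdict

def gim_v_two_stage(matrix, v, combine2, combine_all, assign):
    # Stage I map: route M(i,j) and V(j) to "reducer" j.
    stage1 = defaultdict(lambda: {"m": [], "v": None})
    for (i, j), m_ij in matrix.items():
        stage1[j]["m"].append((i, m_ij))
    for j, v_j in enumerate(v):
        stage1[j]["v"] = v_j
    # Stage I reduce: emit combine2(M(i,j), V(j)) keyed by row i.
    stage2 = defaultdict(list)
    for j, group in stage1.items():
        for i, m_ij in group["m"]:
            stage2[i].append(combine2(m_ij, group["v"]))
    # Stage II (map is identity): combineAll the partials, then assign.
    return [assign(v[i], combine_all(stage2[i])) if stage2[i] else v[i]
            for i in range(len(v))]

result = gim_v_two_stage({(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0},
                         [1.0, 1.0],
                         combine2=lambda m, x: m * x,
                         combine_all=sum,
                         assign=lambda old, new: new)
```

Note that the intermediate (i, partial) pairs between the two stages are exactly the materialization cost that the one-stage Ex-MATE formulation avoids.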
21. Parallelization of GIM-V (III)
- Using Map-Reduce, Stage II
- Map
22. Parallelization of GIM-V (IV)
- Using Map-Reduce, Stage II (cont.)
- Reduce
23. Parallelization of GIM-V (V)
- Using Generalized Reduction in Ex-MATE
- Reduction
24. Parallelization of GIM-V (VI)
- Using Generalized Reduction in Ex-MATE
- Finalize
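The one-stage Ex-MATE formulation, with a reduction step accumulating partials directly into the reduction object and a finalize step applying assign, might look like the sketch below. The names and structure are illustrative assumptions, not the actual Ex-MATE code.

```python
# One-stage GIM-V via generalized reduction: no intermediate
# (key, value) pairs between stages.

def gim_v_one_stage(matrix, v, combine2, combine_all, assign):
    # Reduction: fold each (i, j, m) record into the reduction object,
    # which holds the accumulated partials for the output vector.
    ro = {}
    for (i, j), m_ij in matrix.items():
        partial = combine2(m_ij, v[j])
        ro[i] = combine_all([ro[i], partial]) if i in ro else partial
    # Finalize: apply assign once to produce the new vector.
    return [assign(v[i], ro[i]) if i in ro else v[i]
            for i in range(len(v))]

result = gim_v_one_stage({(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0},
                         [1.0, 1.0],
                         combine2=lambda m, x: m * x,
                         combine_all=sum,
                         assign=lambda old, new: new)
```

Compared with the two-stage version, the shuffle between the jobs disappears: combineAll is applied incrementally as records arrive, which is valid because it is commutative and associative.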
26. Experiments Design
- Applications: three graph mining algorithms
- PageRank, Diameter Estimation, and Finding Connected Components
- Evaluation
- Performance comparison with PEGASUS
- PEGASUS provides a naïve version and an optimized version
- Speedups with an increasing number of nodes
- Scalability: speedups with an increasing size of datasets
- Experimental platform
- A cluster of multi-core CPU machines
- Used up to 128 cores (16 nodes)
July 7, 2019
27. Results: Graph Mining (I)
- PageRank, 16GB dataset: a graph of 256 million nodes and 1 billion edges
[Figure: avg. time per iteration (min) vs. number of nodes; 10.0x speedup]
28. Results: Graph Mining (II)
- HADI, 16GB dataset: a graph of 256 million nodes and 1 billion edges
[Figure: avg. time per iteration (min) vs. number of nodes; 11.0x speedup]
29. Results: Graph Mining (III)
- HCC, 16GB dataset: a graph of 256 million nodes and 1 billion edges
[Figure: avg. time per iteration (min) vs. number of nodes; 9.0x speedup]
30. Scalability: Graph Mining (IV)
- HCC, 8GB dataset: a graph of 256 million nodes and 0.5 billion edges
[Figure: avg. time per iteration (min) vs. number of nodes; 1.7x and 1.9x speedups]
31. Scalability: Graph Mining (V)
- HCC, 32GB dataset: a graph of 256 million nodes and 2 billion edges
[Figure: avg. time per iteration (min) vs. number of nodes; 1.9x and 2.7x speedups]
32. Scalability: Graph Mining (VI)
- HCC, 64GB dataset: a graph of 256 million nodes and 4 billion edges
[Figure: avg. time per iteration (min) vs. number of nodes; 1.9x and 2.8x speedups]
33. Observations
- Performance trends are similar for all three applications
- Consistent with the fact that all three applications are implemented using the GIM-V method
- Ex-MATE outperforms PEGASUS significantly for all three graph mining algorithms
- Reasonable speedups for different datasets
- Better scalability for larger datasets with an increasing number of nodes
35. Related Work: Academia
- Evaluation of Map-Reduce-like models in various parallel programming environments
- Phoenix-rebirth for large-scale multi-core machines
- Mars for a single GPU
- MITHRA for GPGPUs in heterogeneous platforms
- Recent IDAV work for GPU clusters
- Improvements to the Map-Reduce API
- Integrating prefetching and pre-shuffling into Hadoop
- Supporting online queries
- Enforcing less restrictive synchronization semantics between Map and Reduce
36. Related Work: Industry
- Google's Pregel system
- Map-Reduce may not be well suited for graph operations
- Pregel was proposed to target graph processing
- Open-source version: the HAMA project in Apache
- Variants of Map-Reduce
- Dryad/DryadLINQ from Microsoft
- Sawzall from Google
- Pig/Map-Reduce-Merge from Yahoo!
- Hive from Facebook
38. Conclusion
- Ex-MATE supports the management of reduction objects of arbitrary sizes
- Deals with disk-resident reduction objects
- Outperforms both the naïve and optimized PEGASUS implementations for all three graph mining applications
- Has simpler code
- Offers a promising alternative for developing efficient data-intensive applications
- Uses GIM-V for parallelizing graph mining
39. Thank You, and Acknowledgments
- Questions and comments?
- Wei Jiang: jiangwei_at_cse.ohio-state.edu
- Gagan Agrawal: agrawal_at_cse.ohio-state.edu
- This project was supported by