Implementing Scalable Tree-based Algorithms using MRNet

Slides: 11

Provided by: Syste98

Learn more at: https://pages.cs.wisc.edu

Transcript and Presenter's Notes

1
Implementing Scalable Tree-based Algorithms
using MRNet

Ting Chen, Mark Cowlishaw

2
Introduction

Background on MRNet
Our Applications
Reverse index
Online queries
Description of Experiments
Progress
Next steps

3
Background: MRNet [Roth, Arnold, Miller 03]

Tree-Based Overlay Network (TBON)
Nodes of a distributed application are arranged in a tree structure
Leaves produce data that is aggregated and filtered by higher levels of the tree
Programs are separated from the tree topology they run on
A TBON program can run on tree networks of different topologies
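
The TBON idea above can be illustrated with a minimal sketch in plain Python. It does not use the MRNet API; `Node` and `reduce_tree` are hypothetical names introduced here. Leaves hold data, each internal node applies a filter (here `sum`) to its children's results, and the same program runs unchanged on a flat and on a deeper topology.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a value, internal nodes carry children."""
    children: List["Node"] = field(default_factory=list)
    value: int = 0  # only meaningful for leaves


def reduce_tree(node: Node, filter_fn: Callable[[List[int]], int]) -> int:
    """Aggregate leaf values upward, applying filter_fn at every internal node."""
    if not node.children:
        return node.value
    return filter_fn([reduce_tree(child, filter_fn) for child in node.children])


leaves = [Node(value=v) for v in [3, 1, 4, 1, 5, 9]]

# Topology 1: one root directly above all leaves.
flat = Node(children=leaves)
# Topology 2: an extra level of two internal nodes below the root.
deep = Node(children=[Node(children=leaves[:3]), Node(children=leaves[3:])])

flat_sum = reduce_tree(flat, sum)
deep_sum = reduce_tree(deep, sum)
```

Both topologies give the same result because the filter is associative; only the shape of the tree, and therefore the distribution of work, changes.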

4
Reverse indexes for Keyword Queries

A keyword query is a list of words <w1, w2, ..., wn>.
A typical reverse index is a list of words.
Each word points to a list of the IDs of documents containing the word.
A document ID list can be sorted either by document name or by the number of times the word appears in the document.

Wisconsin
DocId    No. of Occurrences
B.htm    6
C.htm    4
...

Badgers
DocId    No. of Occurrences
A.htm    2
B.htm    7
C.htm    5
...
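
A minimal sketch of the reverse index described on this slide, in plain Python. The documents, the function names `build_reverse_index` and `postings`, and the whitespace tokenization are assumptions made for illustration, not taken from the presentation.

```python
from collections import Counter, defaultdict


def build_reverse_index(docs):
    """Map each word to {doc_id: number of occurrences in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word, count in Counter(text.lower().split()).items():
            index[word][doc_id] = count
    return index


def postings(index, word, by="name"):
    """Return the document ID list for a word, sorted by name or by count."""
    entries = index.get(word, {}).items()
    if by == "name":
        return sorted(entries)
    # Highest count first; ties broken by document name.
    return sorted(entries, key=lambda e: (-e[1], e[0]))


# Toy documents (hypothetical contents).
docs = {
    "A.htm": "badgers win",
    "B.htm": "wisconsin badgers badgers",
    "C.htm": "wisconsin wisconsin badgers",
}
index = build_reverse_index(docs)
by_name = postings(index, "badgers")
by_count = postings(index, "badgers", by="count")
```

The two sort orders correspond to the two list orderings mentioned above: by document name, or by the number of times the word appears.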
5
On-line queries: N-best frequency (N = 2)
DocC 7
DocB 3
DocD 0
DocF 6
DocA 5
DocD 0
DocC 7
DocF 6
DocB 3
DocA 5
DocE 2
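
An N-best query over a tree could be sketched as follows. This is plain Python rather than MRNet code. The counts come from the slide, but the grouping of documents into two leaves is an assumption, as is the condition that each document lives at exactly one leaf (without it, truncating to the N best at each level could drop a document whose total is high).

```python
import heapq


def n_best(entries, n):
    """Keep the n (doc, count) pairs with the highest counts."""
    return heapq.nlargest(n, entries, key=lambda e: e[1])


def query(node, n):
    """Leaves are dicts of doc -> count; internal nodes are lists of children."""
    if isinstance(node, dict):
        return n_best(node.items(), n)
    # Merge the children's partial results, then filter again at this level.
    merged = [entry for child in node for entry in query(child, n)]
    return n_best(merged, n)


# Assumed partitioning of the slide's documents across two leaves.
leaf1 = {"DocC": 7, "DocB": 3, "DocD": 0}
leaf2 = {"DocF": 6, "DocA": 5, "DocE": 2}
tree = [leaf1, leaf2]

top2 = query(tree, 2)
```

Each level forwards at most N entries per child, so the data moved toward the root stays bounded regardless of how many documents each leaf holds.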
6
Objective / Experiments

How does tree topology affect application performance?
Macro-benchmarks: throughput/time-to-completion for index building and response time for on-line queries
Micro-benchmarks: total amount of I/O, I/O performed at each node, data transferred
Scale-up and speed-up curves as the number of cluster nodes increases
Are multiple levels (> 2) helpful?
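
One common way to compute such curves, sketched in Python under the usual definitions: speed-up compares run times for a fixed problem size as nodes are added, while scale-up compares run times when the problem size grows in proportion to the node count. The function name `ratio_curve` and the timings below are placeholders, not measurements from these experiments.

```python
def ratio_curve(times):
    """Given {node_count: elapsed_time}, return {node_count: baseline_time / elapsed_time}.

    With a fixed problem size this is a speed-up curve (ideal: equal to the
    node count relative to the baseline). With a problem size that grows in
    proportion to the node count it is a scale-up curve (ideal: constant 1.0).
    """
    baseline = times[min(times)]
    return {nodes: baseline / t for nodes, t in sorted(times.items())}


# Placeholder timings in seconds, chosen only to show the ideal shapes.
fixed_size_times = {1: 100.0, 2: 50.0, 4: 25.0}
scaled_size_times = {1: 100.0, 2: 100.0, 4: 100.0}

speed_up = ratio_curve(fixed_size_times)
scale_up = ratio_curve(scaled_size_times)
```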

7
Progress - data