A Scalable Distributed Information Management System (SDIMS)

1
A Scalable Distributed Information Management
System (SDIMS)

P. Yalagandula, M. Dahlin
cs.utexas.edu
SIGCOMM 2004

2
Outline

Introduction
Goal Aggregation
Innovation
Flexibility
Scalability
Robustness
Implementation
Evaluation
Conclusions

3
Introduction

Why SDIMS?
Monitoring, querying, and reacting to changes are
core components of applications such as system
management, service placement, and data sharing
and caching.
An SDIMS in a networked system would provide a
distributed operating system backbone and
facilitate the development and deployment of new
distributed services.

4
Introduction (cont.)

Fundamental idea
Hierarchical aggregation
A node accesses detailed views of nearby
information and summary views of global
information.
A hierarchical system aggregates information
through reduction trees.

5
Introduction (cont.)

An SDIMS should have four properties:
Scalability
Flexibility
Administrative isolation
Robustness

6
Scalability

SDIMS should accommodate large numbers of nodes.
SDIMS should allow applications to install and
monitor large numbers of data attributes.

7
Flexibility

SDIMS should accommodate a range of applications
and attributes.
Read-dominated attributes (rarely change),
e.g., number of CPUs
Write-dominated attributes (change often),
e.g., number of processes
SDIMS should leave the policy decision of tuning
replication to applications.

8
Administrative isolation

Nodes can be arranged in an organizational or
administrative hierarchy.
Domain-based control.
Monitor
Query

9
Robustness

SDIMS should adapt to reconfigurations in a
timely fashion when nodes fail or become
disconnected.
SDIMS should provide mechanisms so that
applications can trade off the cost of adaptation
against the consistency level of aggregated
results when reconfigurations occur.

10
Related Works

Astrolabe
A single logical aggregation tree that mirrors a
system administrative hierarchy.
A general interface for installing new
aggregation functions.
An unstructured gossip protocol for disseminating
information and replicating all aggregated
attribute values for a sub-tree to all nodes in
the sub-tree.

11
Related Works (cont.)

Any node can answer queries by using local
information.
Not scalable (full replication).
Not flexible (attribute types).
Solution: P2P
12
Tree

For each level in the hierarchy, the agent
maintains a record with the list of child zones
(and their attributes), and which child zone
represents its own zone (self).

13
Gossip protocol

Periodically, each agent selects some other agent
at random and exchanges state information with
it.
If the two agents are in the same zone, the state
exchanged relates to MIBs in that zone.
If the two agents are in different zones, they
exchange state associated with the MIBs of their
least common ancestor zone.
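
The exchange rule above can be sketched in Python. This is only an illustration under stated assumptions, not Astrolabe's code: zones are encoded as tuples of names from the root down, and `exchange_zone` and `pick_peer` are hypothetical helper names.

```python
import random
from typing import Dict, List, Tuple

ZonePath = Tuple[str, ...]  # hypothetical encoding: zone names from the root down


def exchange_zone(zone_a: ZonePath, zone_b: ZonePath) -> ZonePath:
    """Return the zone whose MIBs two gossiping agents exchange.

    Same zone -> that zone; different zones -> least common ancestor zone.
    """
    common: List[str] = []
    for name_a, name_b in zip(zone_a, zone_b):
        if name_a != name_b:
            break
        common.append(name_a)
    return tuple(common)


def pick_peer(agent: str, agents: Dict[str, ZonePath], rng: random.Random) -> str:
    """Select some other agent uniformly at random."""
    return rng.choice([name for name in agents if name != agent])


agents = {
    "a1": ("root", "univ", "cs"),
    "a2": ("root", "univ", "cs"),
    "a3": ("root", "univ", "ee"),
}
peer = pick_peer("a1", agents, random.Random(0))
zone = exchange_zone(agents["a1"], agents[peer])
```

With this encoding, two agents in the `cs` zone exchange `cs` state, while an agent in `cs` and one in `ee` exchange state for `univ`.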

14
Related Works (cont.)

DHT
SkipNet, CAN, Pastry, Chord, Tapestry

15
Problem

How to scalably map different attributes to
different aggregation trees in a DHT mesh?
→ physical network vs. overlay network
How to provide flexibility in the aggregation to
accommodate different application requirements?
→ flexible API for installing and controlling
the system

16
Problem (cont.)

How to adapt a DHT mesh to attain the
administrative isolation property?
→ virtual organization
How to provide robustness without unstructured
gossip and total replication?
→ cache; pre-computing or on-demand
re-aggregation

17
Aggregation Abstraction
18
Aggregation Abstraction

Each physical node in the system is a leaf in the
tree.
An internal (non-leaf) node, which we call a
virtual node, is simulated by one or more
physical nodes at the leaves of the sub-tree for
which the virtual node is the root.

19
Aggregation Abstraction (cont.)

Each physical node has local data stored as a set
of (attributeType, attributeName, value) tuples.
The system associates an aggregation function
f_type with each attribute type.

20
Aggregation Abstraction (cont.)

Each level-i sub-tree T_i in the system has an
aggregate value V_{i,type,name} for each
(attributeType, attributeName) pair.
The aggregate value for a level-i sub-tree T_i is
the aggregation function for the type, f_type,
computed across the aggregate values of each of
T_i's k children:
V_{i,type,name} = f_type(V^0_{i-1,type,name}, V^1_{i-1,type,name}, ..., V^{k-1}_{i-1,type,name})

21
Aggregation Abstraction (cont.)

Examples of f_type
Avg(V_1, ..., V_n) = (1/n) Σ_{i=1..n} V_i
SUM(V_1, ..., V_n) = Σ_{i=1..n} V_i
An aggregation function should satisfy the
hierarchical computation property.
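
The recursive definition of the aggregate value can be sketched in Python. This is a minimal illustration, not the SDIMS implementation: the `Node` class and the `aggregate` and `f_sum_count` helpers are hypothetical names. The average is carried as a (sum, count) pair, because an average of sub-tree averages generally differs from the overall average, whereas sums and counts combine correctly at every level.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

SumCount = Tuple[float, int]  # (sum, count) pair; hypothetical representation


@dataclass
class Node:
    """A tree node: leaves hold a local value, virtual nodes hold children."""
    value: Optional[SumCount] = None
    children: List["Node"] = field(default_factory=list)


def f_sum_count(values: List[SumCount]) -> SumCount:
    """Aggregation function for one attribute type: add sums and counts."""
    return (sum(v[0] for v in values), sum(v[1] for v in values))


def aggregate(node: Node, f: Callable[[List[SumCount]], SumCount]) -> SumCount:
    """V_i = f(V_{i-1} of each child); a leaf returns its local value."""
    if not node.children:
        # A leaf with no matching tuple contributes an empty (0, 0) value.
        return node.value if node.value is not None else (0.0, 0)
    return f([aggregate(child, f) for child in node.children])


# Four physical nodes (leaves) reporting a load value, grouped unevenly.
leaves = [Node(value=(x, 1)) for x in (10.0, 20.0, 30.0, 60.0)]
root = Node(children=[Node(children=leaves[:3]), Node(children=leaves[3:])])

total, count = aggregate(root, f_sum_count)
average = total / count
```

Averaging the two sub-tree averages directly (20 and 60) would give 40, while the (sum, count) form yields the correct overall average of 30.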

22
Aggregation Abstraction (cont.)
[Figure: aggregation tree with physical nodes at
the leaves and virtual nodes above them]
23
Innovation

Flexibility
Scalability
Administrative isolation
Robustness

24
Flexibility

Operation API
Install
Update
Probe

25
Install Operation

The Install operation installs an aggregation
function in the system.

26
Probe Operation
[Slide text lost in extraction; surviving keywords: reconfigure, cache]
27
Probe Operation (cont.)

When node A issues a continuous probe at level l
for an attribute, updates for the attribute at
any node in the subtree of A's level-l ancestor
are aggregated up to level l and propagated down
along the path from the ancestor to A.

28
Update and Probe Operation
29
Update and Probe Operation (cont.)
30
Update Operation API

Update-Up-k-Down-j: aggregate up to the kth level
and propagate the aggregate values of a node at
level l downward for j levels (l ≤ k).
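
One way to read this strategy is as a plan of which levels see each aggregate. The sketch below is an interpretation, not the SDIMS code: `propagation_plan` is a hypothetical helper, leaves are assumed to sit at level 0, and aggregation is assumed to start at level 1.

```python
from typing import Dict, List


def propagation_plan(k: int, j: int, root_level: int) -> Dict[int, List[int]]:
    """Map each aggregating level l (1..k) to the lower levels that receive its value."""
    k = min(k, root_level)          # cannot aggregate above the root
    plan: Dict[int, List[int]] = {}
    for l in range(1, k + 1):
        lowest = max(l - j, 0)      # push down j levels, stopping at the leaves
        plan[l] = list(range(l - 1, lowest - 1, -1))
    return plan


local_only = propagation_plan(k=0, j=0, root_level=4)   # update stays at the leaf
up_only = propagation_plan(k=4, j=0, root_level=4)      # aggregate to the root, no push down
up_two_down_one = propagation_plan(k=2, j=1, root_level=4)
```

Small k keeps writes cheap because updates stay near the leaf, while large k and j make later reads cheap because aggregates are already close to the nodes that ask for them.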

31
Operation API
[Figure: Update-Up-k-Down-j on a tree with levels
0 through 4, marking k, l, and j]
32
Dynamic Adaptation

An SDIMS implementation can dynamically adjust its
up/down strategies for an attribute based on its
measured read/write frequency.
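
A simple policy of this kind might look like the following sketch. Everything here is an assumption made for illustration: the slides do not give the actual policy, `choose_strategy` is a hypothetical name, and the ratio threshold of 10 is arbitrary. The function returns an (up, down) pair of level counts.

```python
from typing import Tuple


def choose_strategy(reads: int, writes: int, root_level: int,
                    ratio: float = 10.0) -> Tuple[int, int]:
    """Pick (up levels k, down levels j) from measured read/write counts."""
    if writes > ratio * reads:
        return (0, 0)                      # write-dominated: keep updates local
    if reads > ratio * writes:
        return (root_level, root_level)    # read-dominated: push aggregates everywhere
    return (root_level, 0)                 # mixed: aggregate up, do not push down


write_heavy = choose_strategy(reads=1, writes=100, root_level=4)
read_heavy = choose_strategy(reads=100, writes=1, root_level=4)
mixed = choose_strategy(reads=5, writes=5, root_level=4)
```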

33
Scalability

SDIMS defines the aggregation abstraction to mesh
with its underlying scalable DHT system.
SDIMS refines the basic DHT abstraction to form
an Autonomous DHT (ADHT) to achieve the
administrative isolation properties

34
Mapping to DHT
35
Mapping to DHT

An attribute is aggregated along the tree
corresponding to DHTtree_k, where
k = hash(attributeType, attributeName).
Different attributes will be aggregated along
different trees.
36
Administrative isolation

For security
Updates and Probes are not accessible outside the
domain
For availability
Queries for values in a domain are not affected
by failures of nodes in other domains
For efficiency
Domain-scoped queries can be simple and efficient.

37
Administrative isolation

Autonomous DHT
Path Locality: search paths should always be
contained in the smallest possible domain.
Path Convergence: search paths for a key from
different nodes in a domain should converge at a
node in that domain.

38
Administrative isolation
[Figure: routing in a plain DHT across domains
"univ." and "dept."; labels: L0 host, L2 univ.]
The isolation property is violated.
39
Administrative isolation
[Figure: routing in the Autonomous DHT across
domains "dept." and "univ."; labels: L0 host,
L2 dept.]
40
Robustness

ADHT
Distributed Computing (?)
Aggregation Management Layer (AML)
Lazy re-aggregation
On-demand Re-aggregation
Replication in Space

41
Two-layer architecture: ADHT and AML

The ADHT layer informs the AML layer about
reconfigurations in the network.
NewParent
FailedChild
NewChild

42
Implementation
DifferentOverlay(?)
43
MIB

Child MIBs: contain raw aggregate values
gathered from children.
Reduction MIB: contains locally aggregated
values across this raw information.
Ancestor MIBs: contain aggregate values
scattered down from ancestors.
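
The three kinds of MIB can be pictured as the state one virtual node keeps for one attribute key. The layout below is a hypothetical sketch for illustration, not the actual SDIMS data structure; the class and method names are invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class VirtualNodeMIBs:
    """State kept by one virtual node for one attribute key (hypothetical layout)."""
    aggregate_fn: Callable[[List[float]], float]
    child_mibs: Dict[str, float] = field(default_factory=dict)     # raw values from children
    reduction_mib: Optional[float] = None                          # local aggregate of child_mibs
    ancestor_mibs: Dict[int, float] = field(default_factory=dict)  # level -> value sent down

    def on_child_value(self, child_id: str, value: float) -> float:
        """Store a child's aggregate and recompute the local reduction."""
        self.child_mibs[child_id] = value
        self.reduction_mib = self.aggregate_fn(list(self.child_mibs.values()))
        return self.reduction_mib

    def on_ancestor_value(self, level: int, value: float) -> None:
        """Record an aggregate propagated down from an ancestor at `level`."""
        self.ancestor_mibs[level] = value


mibs = VirtualNodeMIBs(aggregate_fn=max)
mibs.on_child_value("c1", 3.0)
mibs.on_child_value("c2", 7.0)
mibs.on_ancestor_value(2, 12.0)
```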

44
Implementation
parent
child
45
Implementation (cont.)

Attribute key: used for retrieving data by
aggregation function.
Key = (attributeType, attributeName)

46
Implementation (cont.)

A node acts
as a leaf for all attribute keys,
as a level-1 subtree root for keys whose hash
matches the node's ID in b prefix bits,
as a level-i subtree root for keys whose hash
matches the node's ID in the initial i*b bits,
as the system's global root for attribute keys
whose hash matches the node's ID in more prefix
bits than any other node.
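
These roles follow from counting how many leading bits a node ID shares with the key's hash. A minimal sketch, assuming fixed-width integer IDs; the function names and the 8-bit IDs in the example are hypothetical.

```python
from typing import List


def common_prefix_bits(a: int, b: int, id_bits: int) -> int:
    """Number of leading bits shared by two id_bits-wide identifiers."""
    diff = a ^ b
    return id_bits if diff == 0 else id_bits - diff.bit_length()


def subtree_root_level(node_id: int, key_hash: int, b: int, id_bits: int) -> int:
    """Highest level i such that the first i*b bits of node_id match key_hash."""
    return common_prefix_bits(node_id, key_hash, id_bits) // b


def global_root(node_ids: List[int], key_hash: int, id_bits: int) -> int:
    """Node whose ID matches the key hash in more prefix bits than any other."""
    return max(node_ids, key=lambda n: common_prefix_bits(n, key_hash, id_bits))


nodes = [0b10110000, 0b10100000, 0b00001111]
key = 0b10111000
levels = [subtree_root_level(n, key, b=2, id_bits=8) for n in nodes]
root = global_root(nodes, key, id_bits=8)
```

In this example the first node shares four prefix bits with the key, so with b = 2 it is a level-2 subtree root and also the global root; the last node shares none and acts only as a leaf.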

47
Evaluation
[Slide text lost in extraction; surviving labels:
MIB, node, Up-All, Down 0, monitored attributes]
48
Evaluation (cont.)
The session size is set to 8 (domain size); the
branching factor is set to 16.
[Plot: message size vs. number of nodes]
49
Evaluation (cont.)
Bf = branching factor
[Plot: average path length to root]
50
Evaluation (cont.)
Bf = branching factor
51
Evaluation (cont.)
[Plot annotations: 440, 700, 40, 100]
52
Evaluation (cont.)
283 nodes (remaining slide text lost in extraction)
53
Evaluation (cont.)
Re-aggregation
At 275 s: root killed
54
Conclusion

Scalability with respect to both nodes and
attributes through a new aggregation abstraction
that helps leverage DHT's internal trees for
aggregation.
Flexibility through a simple API that lets
applications control propagation of reads and
writes.

55
Conclusion (cont.)

Administrative isolation through simple
augmentations of current DHT algorithms.
Robustness to node and network reconfigurations
through lazy reaggregation, on-demand
reaggregation, and tunable spatial replication.
