Title: Granary: Architecture of Object Oriented
1. Granary: Architecture of Object Oriented Internet Storage Service
Weimin Zheng, Jinfeng Hu, Ming Li
Department of Computer Science and Technology, Tsinghua University, P. R. China
13/09/2004
2. What is Granary?
[Figure: storage providers, the Granary Storage Service, and storage consumers]
3. Why Granary?
Improving data reliability
Improving data availability
Improving data access efficiency
4. What makes Granary distinct?
Object Oriented Data Management and Access
Super Adaptivity to the System Environment
5. What is Object-Oriented Data?
- All data stored in Granary takes the form of objects
- An object is a set of attributes, the same as in object-oriented programming (OOP)
6. How do I create and manage my data?
- Create a class that determines which attributes its objects contain
- Put objects into Granary by filling in the attributes
- Afterward, access the objects through attribute-level queries
7. What is an attribute-level query?
- Submit a query describing the desired objects, such as
"all the employees whose level is greater than 5"
"all the employees whose identifier begins with 123"
- Get back the list of objects that satisfy the query
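The three steps above (define a class, store objects, query by attribute) can be sketched in a few lines. This is only an in-memory illustration: the `Employee` class, the `store` list, and the `query` helper are hypothetical names, not Granary's actual interface, which these slides do not show.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Hypothetical class: its fields are the attributes each object carries."""
    identifier: str
    name: str
    level: int


# In-memory stand-in for the storage service (an assumption for illustration).
store = [
    Employee("12345", "Tom", 6),
    Employee("12399", "Ann", 3),
    Employee("98765", "Bob", 7),
]


def query(objects, predicate):
    """Return the objects whose attributes satisfy the predicate."""
    return [obj for obj in objects if predicate(obj)]


# "all the employees whose level is greater than 5"
senior = query(store, lambda e: e.level > 5)
# "all the employees whose identifier begins with 123"
prefixed = query(store, lambda e: e.identifier.startswith("123"))
```

The point of the sketch is that the program works with objects and attributes directly, with no conversion step between the program's data and the stored form.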
8. Why Object-Oriented Data and Attribute-Level Query?
Improve the data access mode
- No data transformation between program and storage
Simplify upper-layer application development
- The service provider needs to develop many convenient and practical applications to promote the storage product
Accelerate the development process
9. What is Super Adaptivity, and why?
Two system models have been envisioned so far
- Grid-like: a number of powerful and stable servers
- P2P-like: large numbers of weak and ephemeral nodes
Which one the future world will choose is unpredictable
- Perhaps it will eventually be determined by the commercial model
Granary intends to be a general architecture
- Work well in an arbitrary system environment, adapting dynamically
10. How to achieve Super Adaptivity?
Let the performance (of the components) vary as a function of
- the number of nodes (the system scale)
- node dynamics (lifetime, session duration)
- node capacity (the heterogeneous distribution)
- Not ONE of them, but ALL of them
11. An Example
- Previous DHT routing protocols are not entirely reasonable
- Their log(N) routing hops mean that performance depends only on the system scale
12. PeerWindow -- overview
A node collection protocol, the bottom layer of Granary
- A node pays bandwidth B and collects P pointers (information about other nodes)
- The ratio r = P / B is about 1000 pointers per kbps (lifetime = 1 hour)
- The more bandwidth is paid, the more pointers are collected
- When the system becomes more stable, r rises automatically and proportionally
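The ratio above can be turned into a small back-of-the-envelope calculation. The function name is hypothetical. The linear scaling of r with the average node lifetime is an assumption made here to illustrate "rises proportionally"; the slides do not state the exact relationship.

```python
BASE_RATIO = 1000.0          # pointers per kbps, quoted for lifetime = 1 hour
BASE_LIFETIME_HOURS = 1.0


def pointers_collected(bandwidth_kbps, lifetime_hours=BASE_LIFETIME_HOURS):
    """Estimate P = r * B, assuming r grows linearly with node lifetime."""
    ratio = BASE_RATIO * (lifetime_hours / BASE_LIFETIME_HOURS)
    return ratio * bandwidth_kbps


at_one_hour = pointers_collected(5.0)          # 5 kbps, lifetime 1 hour
at_two_hours = pointers_collected(5.0, 2.0)    # same bandwidth, more stable system
```

Under these assumptions a node paying 5 kbps collects about 5000 pointers, and the same bandwidth buys twice as many if nodes live twice as long.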
13. PeerWindow -- mapping
A node has a 128-bit nodeId and a level value l
- The level is self-determined
An l-level node keeps pointers to all the nodes whose nodeIds share an l-bit prefix with the local nodeId.
A pointer comprises the corresponding node's IP address, nodeId, level, and some attached info
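The prefix rule can be illustrated with a short sketch. It uses 4-bit nodeIds written as bit strings instead of 128-bit ids, and the function names are hypothetical; it shows only the matching rule, not how PeerWindow actually gathers or refreshes the pointers.

```python
def in_window(local_id, level, other_id):
    """True if other_id shares the first `level` bits with local_id."""
    return other_id.startswith(local_id[:level])


def peer_window(local_id, level, all_ids):
    """NodeIds that an l-level node with this nodeId keeps pointers to."""
    return [n for n in all_ids if n != local_id and in_window(local_id, level, n)]


all_ids = ["0001", "0110", "1000", "1010", "1011", "1100", "1110"]

window_l0 = peer_window("1010", 0, all_ids)   # empty prefix: every other node
window_l1 = peer_window("1010", 1, all_ids)   # nodes starting with "1"
window_l2 = peer_window("1010", 2, all_ids)   # nodes starting with "10"
```

Each extra level roughly halves the window, which is how a node can trade bandwidth against the number of pointers it keeps.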
14. PeerWindow -- example
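15. PeerWindow -- maintenance
[Figure: audience set of node 1010. The multicast direction runs from the 0-level nodes (all nodeIds), through the 1-level nodes (nodeId prefix 1) and the 2-level nodes (nodeId prefix 10), to the 3-level nodes (nodeId prefix 101).]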
Audience set of node N0N1N2N3. Nodes in the audience set are classified by level. Higher-level nodes (those with a smaller level value l) know all the nodes at lower levels. As long as an event message flows strictly from stronger nodes to weaker nodes during the multicast process, it eventually reaches every node in the audience set.
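The audience set described in this caption can be computed directly from the prefix rule: a node belongs to the audience of a target if its own level-length prefix matches the target's nodeId. The sketch below uses 4-bit ids and hypothetical function names. Ordering the audience by ascending level is only a simplification of the "stronger to weaker" direction; the actual forwarding scheme is not given in these slides.

```python
def audience_set(target_id, levels):
    """Group by level the nodes that keep a pointer to target_id.

    `levels` maps nodeId (bit string) -> level value l.
    """
    groups = {}
    for node_id, level in levels.items():
        if node_id != target_id and node_id[:level] == target_id[:level]:
            groups.setdefault(level, []).append(node_id)
    return dict(sorted(groups.items()))


def multicast_order(groups):
    """Flatten the groups from level 0 upward (stronger nodes first)."""
    return [node_id for level in sorted(groups) for node_id in groups[level]]


levels = {"0001": 0, "0110": 1, "1100": 1, "1000": 2,
          "1011": 3, "1110": 2, "1010": 2}

groups = audience_set("1010", levels)
order = multicast_order(groups)
```

Here node 0110 (level 1, prefix 0) and node 1110 (level 2, prefix 11) are not in the audience of 1010, because their own prefixes do not match it.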
16. PeerWindow -- properties
- From the view of the individual nodes
- Lightweight: collecting a large number of pointers at a very low cost
- Autonomic: self-determining the bandwidth paid for node collection
- Dynamic: able to adjust the bandwidth cost at runtime
17. PeerWindow -- usages
PeerWindow can be embedded in any peer-to-peer system for node collection or partner finding
Granary uses PeerWindow in these ways
- Finding powerful nodes, by looking at the node levels
- Finding close-by nodes, by combining with the GNP coordinates
- Finding proper nodes for replica creation, by encapsulating the nodes' basic information in the pointers
- Measuring the average lifetime of the current nodes and reporting it upward
- Serving as a fundamental building block of the DHT routing protocol, Tourist
18. PeerWindow -- details
Jinfeng Hu, Ming Li, Haitao Dong, Weimin Zheng, Dongsheng Wang. "PeerWindow: Looking Outside for Peers." Submitted to Infocom05. Available at the Granary homepage: http://166.111.205.211/
19. Tourist -- overview
A DHT routing protocol, constructed from two symmetric PeerWindows
- A standard PeerWindow: pointers to nearby nodes in the nodeId space (called pace entries)
- A reversed PeerWindow (using suffix-matching instead of prefix-matching): pointers to remote nodes in the nodeId space (called air entries)
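The difference between the two kinds of entries can be sketched with prefix and suffix matching over short bit-string nodeIds. The function names are hypothetical, and using the same level for both windows is an assumption; the sketch does not cover how Tourist actually routes with these entries.

```python
def pace_entries(local_id, level, all_ids):
    """Prefix-matching window: nodes close to local_id in the nodeId space."""
    prefix = local_id[:level]
    return [n for n in all_ids if n != local_id and n.startswith(prefix)]


def air_entries(local_id, level, all_ids):
    """Suffix-matching window: nodes spread across the nodeId space."""
    suffix = local_id[len(local_id) - level:]
    return [n for n in all_ids if n != local_id and n.endswith(suffix)]


all_ids = ["0001", "0110", "1000", "1010", "1011", "1100", "1110"]

pace = pace_entries("1010", 2, all_ids)   # share the leading bits "10"
air = air_entries("1010", 2, all_ids)     # share the trailing bits "10"
```

Nodes that share leading bits are numerically close to the local nodeId, while nodes that share only trailing bits can lie anywhere in the id space, which is why the second window supplies the long-distance pointers.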
20. Tourist -- illustration
[Figure: air entries and pace entries]
21. Tourist -- properties
- From the view of the individual nodes
- Lightweight: maintaining a large routing table at a very low cost
- Efficient: one- or two-hop routing (millions of nodes, lifetime = 1 hour)
- Autonomic: self-determining the routing-table size dynamically
22. Tourist -- details
Jinfeng Hu, Ming Li, Weimin Zheng, Dongsheng Wang, Ning Ning, Haitao Dong. "SmartBoa: Constructing p2p Overlay Network in the Heterogeneous Internet Using Irregular Routing Tables." Third International Workshop on Peer-to-Peer Systems (IPTPS 2004)
SmartBoa is a preliminary version of Tourist, less powerful and less elegant
23. Data Placement -- illustration
ObjectId: Manager Jack Employee-information Tom
hashId: Hash(Manager Jack Employee-information Tom) = 52B3
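The mapping from an ObjectId to a hashId can be sketched as follows. The slides do not name the hash function or the hashId width, so SHA-1 and a four-hex-digit truncation (matching the length of the slide's example) are assumptions, and the helper name `hash_id` is hypothetical. The value it produces is not expected to equal the 52B3 shown on the slide.

```python
import hashlib


def hash_id(identifier, hex_digits=4):
    """Derive a short hexadecimal hashId from an identifier string.

    SHA-1 and the 4-digit truncation are illustrative assumptions.
    """
    digest = hashlib.sha1(identifier.encode("utf-8")).hexdigest()
    return digest[:hex_digits].upper()


object_id = "Manager Jack Employee-information Tom"
object_hash = hash_id(object_id)
```

Because the hashId depends only on the ObjectId, any node can recompute it and locate the object's metadata without a central directory.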
24. Data Placement -- comments
We don't put objects directly into the DHT because
- A DHT is not suitable for storing large objects (it incurs massive data movement under churn)
- A DHT's replica placement is not fully flexible
- A Granary user can choose the replica positions himself
- The system can also move or create a replica to improve availability, to speed up access, or to balance the servers' load
- A DHT needs complex load balancing when storing large objects
25. Data Placement -- class data
ClassId: Manager Jack Employee-information
hashId: Hash(Manager Jack Employee-information) = A8E4
DHT layer: key = hashId, value = class list (87.132.44.79, 112.72.34.125, 202.105.65.211)
Class guider (nodeId = A8F1): holds the class list 87.132.44.79, 112.72.34.125, 202.105.65.211
Class nodes (three in the figure): each performs object index management
Object index management: store and manage indexes for each visible attribute of string or numeric type
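One reading of this figure is that the DHT node whose nodeId is closest to the class's hashId acts as the class guider and stores the list of class nodes. The sketch below follows that reading. The "numerically closest nodeId, no wrap-around" rule, the extra nodeIds 1C40 and F302, and all function and variable names are assumptions for illustration.

```python
def class_guider(key_hex, node_ids):
    """Pick the node whose nodeId is numerically closest to the key."""
    key = int(key_hex, 16)
    return min(node_ids, key=lambda n: abs(int(n, 16) - key))


# NodeIds present in a toy DHT; only A8F1 appears on the slide.
node_ids = ["1C40", "A8F1", "F302"]

# Per-node key/value storage: hashId of the class -> addresses of class nodes.
dht = {
    "A8F1": {"A8E4": ["87.132.44.79", "112.72.34.125", "202.105.65.211"]},
}

guider = class_guider("A8E4", node_ids)
class_nodes = dht[guider]["A8E4"]
```

A client would then contact one of the returned class nodes, which manage the object indexes, instead of fetching the objects from the DHT itself.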
26. PB-link tree -- question
How to split the B tree?
27. PB-link tree -- trivial way
- Assign the leaf nodes to the assistants according to the range of the attribute value and the assistants' capacity
- A multi-attribute query then needs large list transfers, e.g. "employees whose identifier begins with 12 and whose level is larger than 5"
28. PB-link tree -- our solution
Each item of the index contains the object's hashId
All the index items related to a given object (within different B trees) must reside in the same assistant
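The benefit of keeping all of an object's index items on one assistant is that a multi-attribute query can be intersected locally, so only matching hashIds travel over the network. The sketch below illustrates this with plain dictionaries in place of B trees. Choosing the assistant by hashId modulo the number of assistants is an assumption, as are the sample objects and all names.

```python
def assistant_for(hash_id, n_assistants):
    """Choose an assistant from the object's hashId (illustrative rule)."""
    return int(hash_id, 16) % n_assistants


def build_assistants(objects, n_assistants):
    """Place every index item of an object on that object's assistant."""
    assistants = [{} for _ in range(n_assistants)]
    for hid, attrs in objects.items():
        assistants[assistant_for(hid, n_assistants)][hid] = attrs
    return assistants


def local_query(assistant):
    """Evaluate both conditions and intersect them on one assistant."""
    by_id = {h for h, a in assistant.items() if a["identifier"].startswith("12")}
    by_level = {h for h, a in assistant.items() if a["level"] > 5}
    return by_id & by_level


objects = {
    "52B3": {"identifier": "12345", "level": 6},
    "0A17": {"identifier": "12999", "level": 3},
    "C4D0": {"identifier": "98765", "level": 7},
    "7E22": {"identifier": "12001", "level": 9},
}

assistants = build_assistants(objects, 2)
# Each assistant returns only its final matches; no partial lists are exchanged.
result = set().union(*(local_query(a) for a in assistants))
```

With range-based splitting, the list of objects matching one attribute would first have to be shipped to the assistant holding the other attribute's range before the two could be intersected.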
29. How is the project going?
The first version will be finished this month
It is intended to run as a public service at THU
Deployment in CNGRID and in the ImagineOne.NET project is also planned
30. THANKS