Title: Effective Replica Allocation
1 Effective Replica Allocation in Ad Hoc
Networks for Improving Data Accessibility
Takahiro Hara
(Proc. IEEE INFOCOM 2001, pp. 1568-1576)
Presented by Mingsheng Peng
2 Contents
- Why Data Replication
- Related Work
- System Model
- Replica Allocation Methods
- Simulation Model
- Conclusion
3 Why Data Replication?
- Ad hoc networks are constructed temporarily by mobile hosts only.
- Each mobile host also plays the role of a router: even if the source and
  destination are not in radio range of each other, data packets are
  forwarded by relaying.
- Since hosts move freely, disconnections occur frequently, which causes
  frequent network division.
4 Why Data Replication? (contd.)
- Example: if some link goes down and the network is split
  - Nodes on the right cannot access D2
  - Nodes on the left cannot access D1
5 Why Data Replication? (contd.)
- A possible solution is to replicate data items at mobile hosts that are
  not the owners of the original data.
6 Related Work
- Ad hoc network routing protocols, such as DSDV, AODV, DSR, ZRP, and CBRP
  - can only improve the connectivity among MHs (mobile hosts) which are
    connected to each other,
  - but cannot do anything when the network is divided, as in the case
    in Figure 1.
- Distributed database systems
  - data replication in databases helps in reducing response time
  - since failures occur infrequently, a small number of replicas is
    sufficient
- Mobile computing
  - mobile hosts access databases at sites in a fixed network and create
    replicas on the mobile hosts
  - address issues of maintaining consistency with low communication costs
  - assume only one-hop wireless communication
7 System Model
- The system environment is assumed to be an ad hoc network where
  - mobile hosts access data items held by other mobile hosts (over single
    or multiple hops)
  - each mobile host creates replicas of data items and maintains the
    replicas in its memory
  - a data item is available to a host if it is present locally or held by
    one of the hosts it is connected to
- Assumptions
  - each mobile host has a unique host identifier Mj (set of all mobile
    hosts: M = {M1, M2, ..., Mm})
  - each data item has a unique data identifier Dj (set of all data
    items: D = {D1, D2, ..., Dm})
  - all data items are of the same size
  - each host has memory space for C data items as replicas (excluding the
    space for holding originals)
  - data items remain the same and are never updated (simplifying
    assumption)
  - the access frequency of each mobile host to each data item is known
    and does not change
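The availability rule in the system model can be illustrated with a short sketch. It assumes hypothetical integer host ids, a `links` map of one-hop connections, and a `holdings` map of the items each host holds as original or replica; none of these names come from the paper.

```python
from collections import deque

def reachable_hosts(links, start):
    """Hosts reachable from `start` over one or more hops (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in links.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def is_accessible(links, holdings, host, item):
    """True if `item` is held (original or replica) by `host` or a connected host."""
    return any(item in holdings.get(h, ()) for h in reachable_hosts(links, host))

# Hypothetical split network: hosts 1-2 on one side, hosts 3-4 on the other.
links = {1: {2}, 2: {1}, 3: {4}, 4: {3}}
holdings = {1: {"D1"}, 4: {"D2"}}
print(is_accessible(links, holdings, 2, "D1"))  # D1 is one hop away
print(is_accessible(links, holdings, 2, "D2"))  # D2 is on the other side of the split
holdings[2] = {"D2"}                            # a replica of D2 is created at host 2
print(is_accessible(links, holdings, 2, "D2"))
```

In this toy topology the replica at host 2 is what keeps D2 available after the split, which is the motivation for replication given earlier.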
8 Replica Allocation Methods
- Approach
  - replicas are relocated once every fixed period (the relocation period)
  - replica allocation is determined based on the access frequencies and
    the network topology
9 Three Replica Allocation Methods
- The three replica allocation methods differ in the emphasis put on
  access frequency and network topology.
- SAF (Static Access Frequency): only the access frequency to each data
  item is taken into account.
- DAFN (Dynamic Access Frequency and Neighborhood): the access frequency
  to each data item and the neighborhood among mobile hosts are taken
  into account.
- DCG (Dynamic Connectivity based Grouping): the access frequency to each
  data item and the whole network topology are taken into account.
10 SAF (Static Access Frequency)
- Each host creates replicas of data items in descending order of its own
  access frequencies
- Advantages
  - No control information regarding replicas needs to be exchanged
  - Once each host has all of its necessary replicas, there is no more
    replica relocation
  - Low overhead and low traffic
- Disadvantages
  - Hosts with similar access characteristics hold the same replicas.
    Since an MH can access data items held by other connected MHs, it is
    more effective to share many kinds of replicas among them.
  - Thus SAF gives low data accessibility when many hosts have the same
    or similar access characteristics.
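A minimal sketch of the SAF rule for a single host follows. The function name, the dictionary layout, and the tie-break by item id are assumptions made for illustration; the slides only state that replicas are created in descending order of access frequency.

```python
def saf_allocate(access_freq, own_items, capacity):
    """Return the replicas one host creates under SAF.

    access_freq: dict mapping data item id -> this host's access frequency
    own_items:   set of item ids the host already holds as originals
    capacity:    C, the number of replica slots
    """
    candidates = [d for d in access_freq if d not in own_items]
    # Highest access frequency first; ties broken by item id (an assumption).
    candidates.sort(key=lambda d: (-access_freq[d], d))
    return candidates[:capacity]

freq = {"D1": 0.2, "D2": 0.9, "D3": 0.5, "D4": 0.7}
replicas = saf_allocate(freq, own_items={"D2"}, capacity=2)
print(replicas)
```

Because the choice depends only on the host's own frequencies, two hosts with identical frequency tables end up with identical replicas, which is the disadvantage listed above.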
11 SAF Example
12 DAFN (Dynamic Access Frequency and Neighborhood)
- The algorithm of this method is as follows
  - 1) At a relocation period, each mobile host broadcasts its host
    identifier and information on its access frequencies to data items.
    After all mobile hosts complete the broadcasts, every host knows its
    connected mobile hosts from the received host identifiers.
  - 2) Each mobile host preliminarily determines the allocation of
    replicas based on the SAF method.
  - 3) In each set of mobile hosts which are connected to each other, the
    following procedure is repeated in breadth-first search order,
    starting from the mobile host with the lowest suffix (i) of host
    identifier (Mi).
    - When a data item (original or replica) is duplicated between two
      neighboring mobile hosts and one of the copies is the original, the
      host which holds the replica changes it to another replica.
    - If both copies are replicas, the host whose access frequency to the
      data item is lower changes its replica to another replica.
    - When changing the replica, the new data item to replicate is chosen
      among the data items whose replicas are not allocated at either of
      the two hosts: the one to which the host's access frequency is the
      highest is selected.
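The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: hosts and items are hypothetical integer ids, the broadcast in step 1 is replaced by a ready-made `neighbors` map, originals held by either host are also excluded when picking a replacement, a duplicate is kept when no replacement item is free, and ties go to the second host of the pair.

```python
from collections import deque

def saf(freq, own, capacity):
    """Step 2: preliminary SAF allocation, the top-C items the host does not own."""
    items = sorted((d for d in freq if d not in own), key=lambda d: (-freq[d], d))
    return set(items[:capacity])

def resolve(u, v, replicas, originals, freq):
    """Step 3 for one pair of neighbors: replace duplicated replicas."""
    common = (replicas[u] | originals[u]) & (replicas[v] | originals[v])
    for item in sorted(common):
        if item in originals[u]:
            loser = v                    # the replica gives way to the original
        elif item in originals[v]:
            loser = u
        else:                            # both replicas: lower access frequency gives way
            loser = u if freq[u][item] < freq[v][item] else v
        taken = replicas[u] | originals[u] | replicas[v] | originals[v]
        free = [d for d in freq[loser] if d not in taken]
        if free:                         # assumption: keep the duplicate if nothing is free
            replicas[loser].discard(item)
            replicas[loser].add(min(free, key=lambda d: (-freq[loser][d], d)))

def dafn(neighbors, originals, freq, capacity):
    """neighbors: host -> set of one-hop neighbors. Returns host -> set of replicas."""
    replicas = {h: saf(freq[h], originals[h], capacity) for h in neighbors}
    visited, done = set(), set()
    for start in sorted(neighbors):      # lowest host id first in each partition
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:                     # breadth-first search
            u = queue.popleft()
            for v in sorted(neighbors[u]):
                edge = frozenset((u, v))
                if edge not in done:     # handle each pair of neighbors once
                    done.add(edge)
                    resolve(u, v, replicas, originals, freq)
                if v not in visited:
                    visited.add(v)
                    queue.append(v)
    return replicas

neighbors = {1: {2}, 2: {1}}
originals = {1: {1}, 2: {2}}
freq = {1: {1: 0.5, 2: 0.4, 3: 0.9, 4: 0.8, 5: 0.3, 6: 0.2},
        2: {1: 0.6, 2: 0.5, 3: 0.7, 4: 0.9, 5: 0.2, 6: 0.1}}
result = dafn(neighbors, originals, freq, capacity=2)
print(result)
```

In this two-host example SAF alone would give both hosts replicas of items 3 and 4; after the neighbor check each host keeps the item it accesses more often and picks up a different second item.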
13 DAFN (Dynamic Access Frequency and Neighborhood)
- Aims to eliminate replica duplication among neighboring hosts
- The above procedure is executed every relocation period
- Overhead and traffic are much higher than with SAF
- Does not completely eliminate replica duplication
- If the network topology changes during the execution of this method,
  replica relocation cannot be completed
14 DAFN Example
15 DCG (Dynamic Connectivity based Grouping)
- Biconnected component: a maximal subgraph which is still connected if
  any one of its vertices is removed (high stability!)
- The algorithm is as follows
  - 1) At a relocation period, each mobile host broadcasts its host
    identifier and information on its access frequencies to data items.
    After all mobile hosts complete the broadcasts, every host knows its
    connected mobile hosts from the received host identifiers.
  - 2) In each set of mobile hosts which are connected to each other, an
    algorithm to find biconnected components is executed, starting from
    the mobile host with the lowest suffix (i) of host identifier (Mi).
    Each biconnected component is then put into a group. If a mobile host
    belongs to more than one biconnected component, i.e., the host is an
    articulation point, it belongs only to the group whose biconnected
    component is found first in executing the algorithm.
  - 3) In each group, the access frequency of the group to each data item
    is calculated as the sum of the access frequencies of the mobile
    hosts in the group to the item. The calculation is done by the mobile
    host with the lowest suffix (i) of host identifier (Mi) in the group.
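Steps 2 and 3 can be sketched as below. The slides do not say which biconnected-component algorithm is used, so this sketch assumes the classic depth-first (Hopcroft-Tarjan style) algorithm with neighbors visited in ascending id order; "first found" therefore means first in that algorithm's output order. Host ids, function names, and data layout are hypothetical.

```python
def biconnected_components(adj, root):
    """Vertex sets of the biconnected components reachable from root, in discovery order."""
    depth, low, stack, comps = {}, {}, [], []

    def dfs(u, parent):
        depth[u] = low[u] = len(depth)       # discovery index of u
        for v in sorted(adj[u]):
            if v == parent:
                continue
            if v not in depth:               # tree edge
                stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= depth[u]:       # u separates v's subtree from the rest
                    comp = set()
                    while True:
                        a, b = stack.pop()
                        comp |= {a, b}
                        if (a, b) == (u, v):
                            break
                    comps.append(comp)
            elif depth[v] < depth[u]:        # back edge to an ancestor
                stack.append((u, v))
                low[u] = min(low[u], depth[v])

    dfs(root, None)
    return comps

def dcg_groups(adj):
    """Step 2: put each host into the first biconnected component found for it."""
    groups, assigned = [], set()
    for root in sorted(adj):                 # lowest host id in each partition
        if root in assigned:
            continue
        for comp in biconnected_components(adj, root) or [{root}]:
            members = comp - assigned        # articulation points stay in their first group
            if members:
                groups.append(members)
                assigned |= members
    return groups

def group_access_frequency(group, freq):
    """Step 3: sum the members' access frequencies per data item."""
    totals = {}
    for h in group:
        for item, f in freq[h].items():
            totals[item] = totals.get(item, 0.0) + f
    return totals

# Two triangles sharing host 3, which is an articulation point.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
groups = dcg_groups(adj)
freq = {h: {"D1": 0.5, "D2": 0.25} for h in adj}
print(groups, group_access_frequency(groups[0], freq))
```

With this traversal order the component containing hosts 3, 4, and 5 is completed first, so host 3 joins that group and hosts 1 and 2 form the second group.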
16 DCG (Dynamic Connectivity based Grouping)
  - 4) In descending order of the access frequencies of the group,
    replicas of data items are allocated until the memory space of all
    mobile hosts in the group becomes full. Here, replicas of data items
    which are held as originals by mobile hosts in the group are not
    allocated. Each replica is allocated at the mobile host whose access
    frequency to the data item is the highest among the hosts that have
    free memory space to create it.
  - 5) After allocating replicas of all kinds of data items, if there is
    still free memory space at mobile hosts in the group, replicas are
    allocated in descending order of access frequency until the memory
    space is full. Each replica is allocated at the mobile host whose
    access frequency to the data item is the highest among the hosts that
    have free memory space to create it and do not hold the replica or
    its original. If there is no such mobile host, the replica is not
    allocated.
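Steps 4 and 5 can be sketched for a single group as follows. The names and data layout are hypothetical, hosts are integer ids, ties are broken by lowest id, and step 5 is read here as repeated passes over the items until no further replica can be placed; the slides do not spell out that last detail.

```python
def dcg_allocate(group, originals, freq, capacity):
    """Allocate replicas inside one group.

    group:     iterable of host ids
    originals: host -> set of items held as originals
    freq:      host -> {item: access frequency}
    capacity:  C, replica slots per host
    Returns host -> set of replicas.
    """
    hosts = sorted(group)
    replicas = {h: set() for h in hosts}
    total = {}
    for h in hosts:                                   # group access frequency per item
        for item, f in freq[h].items():
            total[item] = total.get(item, 0.0) + f
    order = sorted(total, key=lambda d: (-total[d], d))
    group_originals = set().union(*(originals[h] for h in hosts))

    def place(item):
        """Put one copy at the eligible host with the highest access frequency."""
        free = [h for h in hosts
                if len(replicas[h]) < capacity
                and item not in replicas[h] and item not in originals[h]]
        if not free:
            return False
        best = max(free, key=lambda h: (freq[h].get(item, 0.0), -h))
        replicas[best].add(item)
        return True

    # Step 4: one replica of every item not held as an original in the group.
    for item in order:
        if item not in group_originals:
            place(item)
    # Step 5: fill leftover space with extra copies (repeated passes are an assumption).
    progress = True
    while progress:
        progress = False
        for item in order:
            progress |= place(item)
    return replicas

originals = {1: {1}, 2: {2}}
freq = {1: {1: 0.5, 2: 0.5, 3: 0.75, 4: 0.25},
        2: {1: 0.5, 2: 0.5, 3: 0.25, 4: 0.5}}
allocation = dcg_allocate([1, 2], originals, freq, capacity=2)
print(allocation)
```

In the example, step 4 places items 3 and 4 (no original in the group), and step 5 uses the remaining slots to duplicate each host's original at the other host.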
17 DCG (Dynamic Connectivity based Grouping)
- Data accessibility is expected to be higher, since replicas are shared
  among a group of hosts
- Overhead and traffic are higher than with the other two methods, since
  DCG consists of more steps and takes the longest time of the three
  methods to relocate replicas over a wide range
- The probability is therefore higher that the network topology changes
  while this method is executing; in that case, replica relocation cannot
  be done at mobile hosts over disconnected links
18 DCG Example
19 Simulation Model
- 50 × 50 flatland
- Each host moves randomly in all directions
- Movement speed is randomly determined between 0 and d
- Radio communication range is a circle of radius R (varied over 1-19;
  otherwise fixed to 7)
- Number of hosts = number of data items = 40
- Each host creates up to C replicas (varied over 1-39; otherwise fixed
  to 10)
- The access frequency of each host to Di is p_i, given by one of three
  cases
  - Case 1: p_i = 0.5(1 + 0.01i)
    - every host has the same access characteristics; access frequencies
      vary in a small range
  - Case 2: p_i = 0.025i
    - every host has the same access characteristics; access frequencies
      vary in a wide range
  - Case 3: p_i is determined as a positive value based on the normal
    distribution N(0.5(1 + 0.01i), σ)
    - the larger the value of σ, the larger the difference in the access
      characteristics of the hosts
- Relocation period T (varied over 1-8192; otherwise fixed to 256)
- Simulated for 59,000 time units, with traffic measured (traffic =
  number of hops used for relocating data)
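The three access-frequency cases can be generated with a few lines of code. This is a sketch under assumptions: the operator inside the parentheses is read as "+", σ is treated as the standard deviation, and Case 3 resamples until the value is positive (the slides only say the value is positive). The function name and the default σ are hypothetical.

```python
import random

def access_frequencies(case, n_items=40, sigma=0.1, rng=None):
    """Return [p_1, ..., p_n] for one host under the given case (1, 2, or 3)."""
    rng = rng or random.Random()
    freqs = []
    for i in range(1, n_items + 1):
        if case == 1:
            p = 0.5 * (1 + 0.01 * i)          # small range (assumed "+")
        elif case == 2:
            p = 0.025 * i                     # wide range
        elif case == 3:
            mean = 0.5 * (1 + 0.01 * i)
            p = rng.gauss(mean, sigma)
            while p <= 0:                     # assumption: resample until positive
                p = rng.gauss(mean, sigma)
        else:
            raise ValueError("case must be 1, 2, or 3")
        freqs.append(p)
    return freqs

case1 = access_frequencies(1)
case2 = access_frequencies(2)
case3 = access_frequencies(3, sigma=0.3, rng=random.Random(0))
print(min(case1), max(case1), min(case2), max(case2))
```

Under these assumptions Case 1 spans roughly 0.505 to 0.7 over the 40 items and Case 2 spans 0.025 to 1.0, which matches the "small range" and "wide range" descriptions.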
20 Conclusion
- Introduced replica allocation in ad hoc networks as a mechanism for
  improving data accessibility
- Proposed three replica allocation methods that use access patterns and
  the network topology
- Simulation results show that DCG gives the highest accessibility at the
  cost of increased traffic, and that SAF has the least traffic but low
  data accessibility
- The appropriate replica allocation method depends on the system
  configuration and access patterns
21 Any Questions?