EFFECTIVE LOAD-BALANCING VIA MIGRATION AND REPLICATION IN SPATIAL GRIDS

Slides: 19
Provided by: anirban7
Transcript and Presenter's Notes

1
EFFECTIVE LOAD-BALANCING VIA MIGRATION AND
REPLICATION IN SPATIAL GRIDS
  • ANIRBAN MONDAL
  • KAZUO GODA
  • MASARU KITSUREGAWA
  • INSTITUTE OF INDUSTRIAL SCIENCE
  • UNIVERSITY OF TOKYO, JAPAN
  • {anirban, kgoda, kitsure}@tkl.iis.u-tokyo.ac.jp

2
PRESENTATION OUTLINE
  • INTRODUCTION
  • RELATED WORK
  • SYSTEM OVERVIEW
  • MIGRATION AND REPLICATION
  • LOAD-BALANCING
  • PERFORMANCE STUDY
  • CONCLUSION AND FUTURE WORK

3
INTRODUCTION
  • Prevalence of spatial applications
  • Resource management, development planning,
    emergency planning, scientific research, GIS,
    CAD, VLSI
  • Unprecedented growth of available spatial data at
    geographically distributed locations → the need
    for efficient networking
  • Emergence of GRID computing and powerful networks
  • Motivates the design of a SPATIAL GRID.

4
CHALLENGES
  • Scale
  • Heterogeneity
  • Dynamism
  • Cross-domain administrative issues
  • Efficient search and load-balancing mechanisms
  • We focus on load-balancing.
  • Load-balancing in GRIDs is much more complicated
    than in traditional environments.

5
LOAD-BALANCING
  • Some nodes become hot
  • Skewed Workloads
  • Dynamic access patterns
  • These hot nodes become bottlenecks
  • Increased waiting times → High response times
  • MAIN CONTRIBUTIONS
  • Viewing a spatial GRID as comprising several
    clusters
  • Each cluster is a LAN
  • Proposal of an inter-cluster load-balancing
    algorithm which uses migration/replication of
    data.
  • Presentation of a scalable technique for dynamic
    data placement.

6
RELATED WORK
  • Ongoing GRID projects
  • Earth Systems Grid (ESG)
  • NASA Information Power Grid (IPG)
  • Grid Physics Network (GriPhyN)
  • European DataGrid.
  • Binding of execution and storage sites together
    into I/O communities [Thain01]
  • Data-movement system (Kangaroo)
  • Load-balancing
  • STATIC (BUBBA, tile technique)
  • DYNAMIC (Disk cooling)
  • Job (Process) MIGRATION in CONDOR
  • Spatial indexes: R-tree [Guttman84]

7
SYSTEM OVERVIEW
  • Viewing the GRID as a set of clusters
  • Distance between two clusters
  • Communication time between cluster leaders
  • Neighbours
  • Definition of Load
  • Number of disk I/Os in a certain time interval
  • Normalized w.r.t. CPU power
  • Cluster leaders
  • Coordinate cluster activities
  • Maintain meta-information
  • Data stored at its own cluster and its neighbours
  • Hotspot detection via access statistics
  • Use only recent statistics
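
The load definition above can be illustrated with a minimal sketch. The slides do not say how the normalization is done; dividing the I/O count by a relative CPU-power weight is an assumption, as are the names `LoadMonitor`, `cpu_power` and `window_seconds`.

```python
from collections import deque


class LoadMonitor:
    """Tracks recent disk I/Os for one cluster and reports a normalized load."""

    def __init__(self, cpu_power: float, window_seconds: float) -> None:
        # cpu_power is a hypothetical relative weight (1.0 = baseline node).
        self.cpu_power = cpu_power
        self.window_seconds = window_seconds
        self._io_times = deque()  # timestamps of recent disk I/Os, oldest first

    def record_io(self, timestamp: float) -> None:
        """Register one disk I/O; timestamps are expected in increasing order."""
        self._io_times.append(timestamp)

    def load(self, now: float) -> float:
        """Number of disk I/Os in the last window, divided by CPU power."""
        # Use only recent statistics: drop I/Os that fell out of the window.
        while self._io_times and self._io_times[0] <= now - self.window_seconds:
            self._io_times.popleft()
        return len(self._io_times) / self.cpu_power
```

With this assumed normalization, a node with twice the CPU power reports half the load for the same number of I/Os, so loads of heterogeneous clusters become comparable.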

8
DATA MOVEMENT IN GRIDs
  • MIGRATION AND REPLICATION
  • Unlike replication, migration implies deletion of
    hot data at the source node.
  • Which option is better: migration or replication?
  • Load-balancing
  • Data Availability
  • Disk space usage
  • Periodic cleanup
  • REPLICA CONSISTENCY ??
  • Decisions concerning migration/replication should
    be taken during run-time.

9
DATA MOVEMENT (Cont.)
  • Impact of heterogeneity on data movement
  • Administrative policies (e.g., security)
  • Data management techniques (indexing, hotspot
    detection, etc.)
  • CPU
  • Disk space
  • Moving data entails movement of indexes.
  • To address variations in indexing schemes, we
    extract data from the index at a node and rebuild
    the index at the destination node.
  • Each node has two indexes
  • Index for its own data
  • Index for moved data
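
A rough sketch of the extract-and-rebuild idea follows. `ListIndex` is a hypothetical stand-in that scans a list linearly; a real node would use its own indexing scheme (for example an R-tree). All names here are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class ListIndex:
    """Hypothetical stand-in for a node's spatial index (linear scan)."""

    def __init__(self) -> None:
        self._rects: List[Rect] = []

    def insert(self, rect: Rect) -> None:
        self._rects.append(rect)

    def search(self, window: Rect) -> List[Rect]:
        return [r for r in self._rects if intersects(r, window)]

    def delete(self, window: Rect) -> None:
        self._rects = [r for r in self._rects if not intersects(r, window)]


class Node:
    def __init__(self) -> None:
        self.own_index = ListIndex()    # index for the node's own data
        self.moved_index = ListIndex()  # index for data moved in from peers


def move_region(src: Node, dst: Node, window: Rect, migrate: bool) -> int:
    """Extract raw data from the source index and rebuild it at the destination."""
    extracted = src.own_index.search(window)
    for rect in extracted:
        dst.moved_index.insert(rect)  # rebuilt with the destination's own scheme
    if migrate:
        src.own_index.delete(window)  # migration deletes the data at the source
    return len(extracted)
```

Only the raw rectangles cross the network, so source and destination are free to use different index structures.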

10
DATA MOVEMENT (Cont.)
  • Impact of variations in disk space on data
    movement
  • Pushing non-hot data to large capacity peers
  • Large-sized data migration
  • Small-sized data replication
  • Replicating small-sized hot data at small
    capacity peers
  • Large-sized hot data migration to large capacity
    peers if peers are available, otherwise
    replication.
  • Deletion of infrequently accessed replicas
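
The placement rules on this slide could be sketched as below. The size threshold, the `Peer` fields, the "most free space wins" tie-break, and replicating large data at any peer with room when no large-capacity peer exists are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Peer:
    name: str
    free_disk: int        # free space, in the same unit as the data size
    large_capacity: bool  # hypothetical flag for a large-capacity peer


def place_hot_data(
    size: int, large_threshold: int, peers: List[Peer]
) -> Optional[Tuple[str, str]]:
    """Return (action, peer name), or None if no peer can hold the data."""
    candidates = [p for p in peers if p.free_disk >= size]
    if not candidates:
        return None

    def roomiest(group: List[Peer]) -> Peer:
        return max(group, key=lambda p: p.free_disk)

    if size >= large_threshold:
        large = [p for p in candidates if p.large_capacity]
        if large:
            # Large-sized hot data: migrate to a large-capacity peer.
            return ("migrate", roomiest(large).name)
        # No large-capacity peer available: fall back to replication.
        return ("replicate", roomiest(candidates).name)
    # Small-sized hot data: replicate, preferring small-capacity peers.
    small = [p for p in candidates if not p.large_capacity] or candidates
    return ("replicate", roomiest(small).name)
```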

11
INTER-CLUSTER LOAD-BALANCING
  • Periodic exchange of load info between neighbours
  • Leader L considers itself to be overloaded if its
    load exceeds that of its neighbours by 10%.
  • L determines its hot regions and informs its
    neighbours about disk space requirement of hot
    regions.
  • Number of hot regions depends upon load
    imbalance.
  • Neighbours with enough disk space reply to L with
    their load status and disk space information.
  • These leaders are sorted in ascending order of
    load in List1.
  • L assigns hot regions to members of List1 in a
    round-robin manner.
  • The hottest region is moved to first member of
    List1, the second hottest region is moved to
    second member of List1 and so on.
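
The steps on this slide can be sketched roughly as follows. Several details are assumptions rather than statements from the slides: the overload test compares L's load with the average neighbour load using a relative margin of 0.10, "enough disk space" is read as room for all hot regions together, and the `Region` and `Neighbour` types are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Region:
    name: str
    heat: float  # access frequency taken from recent statistics
    size: int    # disk space the region needs


@dataclass
class Neighbour:
    name: str
    load: float
    free_disk: int


def is_overloaded(own_load: float, neighbour_loads: List[float],
                  margin: float = 0.10) -> bool:
    """Assumed test: own load exceeds the average neighbour load by the margin."""
    if not neighbour_loads:
        return False
    average = sum(neighbour_loads) / len(neighbour_loads)
    return own_load > average * (1.0 + margin)


def assign_hot_regions(hot_regions: List[Region],
                       neighbours: List[Neighbour]) -> Dict[str, str]:
    """Map each hot region to a neighbouring leader, hottest region first."""
    required = sum(r.size for r in hot_regions)
    # Only neighbours with enough disk space reply; sort them by load (List1).
    list1 = sorted((n for n in neighbours if n.free_disk >= required),
                   key=lambda n: n.load)
    if not list1:
        return {}
    hottest_first = sorted(hot_regions, key=lambda r: r.heat, reverse=True)
    # Round-robin: hottest region to the least-loaded leader, and so on.
    return {r.name: list1[i % len(list1)].name
            for i, r in enumerate(hottest_first)}
```

Pairing the hottest region with the least-loaded leader spreads the extra load where there is most headroom, and the round-robin wrap-around handles the case of more hot regions than eligible neighbours.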

12
PERFORMANCE STUDY
  • 16 Sun workstations, each with a 143 MHz Sun
    UltraSPARC I processor (256 MB RAM), running the
    Solaris 2.5.1 operating system.
  • These are connected by a relatively high-speed
    switch (200 Mbyte/s), the APnet.
  • Each cluster is modeled by a workstation node.
  • We simulated a transfer rate of 1 Mbit/second
    among the clusters.
  • We implemented an R-tree on each of the clusters
    to organize the data allocated to each cluster.
  • A real dataset (Greece Roads)
  • Each cluster had more than 200,000 data
    rectangles.
  • Zipf distribution was used to model workload
    skews.
  • We investigated only migration in this proposal.
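
A skewed workload of this kind could be generated along the following lines. The slides do not state how the "Zipf factor" maps to the exponent of the distribution, so the exponent is passed explicitly here; the function names and the seed are illustrative assumptions.

```python
import random
from typing import List


def zipf_probabilities(n: int, exponent: float) -> List[float]:
    """Probability of the item at rank i is proportional to 1 / i**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_queries(n_clusters: int, exponent: float,
                   n_queries: int, seed: int = 0) -> List[int]:
    """Count how many queries land on each cluster under the skewed workload."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    probs = zipf_probabilities(n_clusters, exponent)
    hits = [0] * n_clusters
    for target in rng.choices(range(n_clusters), weights=probs, k=n_queries):
        hits[target] += 1
    return hits
```

An exponent of 0 gives a uniform workload, and larger exponents concentrate more queries on the first few clusters, which is what produces hot nodes in an experiment like this one.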

13
PERFORMANCE OF OUR PROPOSED SCHEME

14
SNAPSHOT OF LOAD-BALANCING FOR ZIPF FACTOR OF 0.1

15
VARIATIONS IN WORKLOAD SKEW

16
SNAPSHOT OF LOAD DISTRIBUTION FOR ZIPF FACTOR OF 0.5

17
SUMMARY
  • Huge amounts of available spatial data worldwide
    coupled with the emergence of GRID technologies
    and powerful networks motivate the design of a
    spatial GRID.
  • For performance reasons, effective load-balancing
    is necessary in such a spatial GRID.
  • We view a GRID as a set of clusters.
  • Proposal of a dynamic inter-cluster
    load-balancing strategy via migration/replication
    in GRIDs

18
FUTURE SCOPE OF WORK
  • FAIRNESS IN LOAD-BALANCING
  • GRANULARITY OF DATA MOVEMENT
  • DETAILED PERFORMANCE STUDY
  • REPLICATION
  • DIFFERENT WORKLOAD TYPES
  • SCALABILITY
  • INTEGRATION INTO EXISTING GRIDs