Performance Analysis of the Globus Toolkit Monitoring and Discovery Service MDS2


1
Performance Analysis of the Globus Toolkit
Monitoring and Discovery Service (MDS2)
  • Xuehai Zhang, University of Chicago
  • Dr. Jennifer Schopf, Argonne National Lab

2
Grid Monitoring and Information Services
  • Why are they important?
  • Resource selection, scheduling
  • Prediction, system status monitoring, event
    notification
  • Few quantitative performance studies have been
    done
  • Their performance study will help in
  • Deployment
  • Performance tuning
  • Development of future systems

3
Performance Study of MDS2
  • MDS2 is the most common information and
    monitoring service in production Grids
  • Zhang, Freschl, and Schopf (HPDC 03)
  • Evaluated MDS2 scalability and compared with two
    other services, R-GMA and Hawkeye
  • That approach is coarse-grained and focuses only
    on end-to-end performance
  • This study
  • Revisits MDS2 scalability at a finer granularity
    using NetLogger instrumentation
  • Enables us to better understand what the
    performance bottlenecks are and where they lie

4
Outline
  • Problem
  • MDS2 and NetLogger instrumentation
  • Experimental setup
  • Experiment results and analysis
  • Conclusion and future work

5
Monitoring and Discovery Service (MDS2)
  • Part of the Globus Toolkit
  • Based on Lightweight Directory Access Protocol
    (LDAP)
  • Uses a hierarchical architecture
  • Grid Index Information Service (GIIS)
  • Grid Resource Information Service (GRIS)
  • Information Providers (IPs)

6
NetLogger Instrumentation
  • NetLogger is a toolkit for debugging distributed
    applications and identifying bottlenecks
  • Developed at Lawrence Berkeley National Lab
  • Instruments applications by logging interesting
    events at every critical point
  • We used NetLogger to divide the end-to-end path
    of an MDS2 query into 7 phases
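
The phase breakdown can be illustrated with a minimal Python sketch. It does not use the NetLogger API: the `phase_durations` helper, the event ordering, and all timestamps are hypothetical, and the phase names are the ones that appear in the metric definitions of this talk. The assumption is that each instrumentation point logs a timestamped event when a phase ends, so a phase's duration is the gap between consecutive events.

```python
# Hypothetical end-of-phase events for one query: (phase name, time in ms).
# The ordering and the numbers are made up for illustration only.
events = [
    ("Client-Connect", 40),
    ("Client-Bind", 55),
    ("Server-InitSearch", 60),
    ("Server-SearchIndex", 70),
    ("Server-Invoking", 370),
    ("Server-GenResult", 390),
    ("Client-EndConnect", 400),
]


def phase_durations(events, start_time=0):
    """Return {phase: duration} from time-ordered end-of-phase events."""
    durations = {}
    previous = start_time
    for name, timestamp in events:
        durations[name] = timestamp - previous
        previous = timestamp
    return durations


durations = phase_durations(events)
# The slowest phase is the candidate bottleneck for this query.
bottleneck = max(durations, key=durations.get)
print(bottleneck, durations[bottleneck])
```

With real logs, the same subtraction would be applied per query and then averaged per phase across all queries.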

7
Outline
  • Problem
  • MDS2 and NetLogger instrumentation
  • Experimental setup
  • Experiment results and analysis
  • Conclusion and future work

8
Performance Topics
  • Topic 1: MDS2 GRIS vs. User
  • Two configuration scenarios
  • GRIS always caches data
  • GRIS never caches data
  • Topic 2: MDS2 GIIS vs. User
  • As a directory server, GIIS is configured to
    always cache data
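
The two GRIS scenarios can be sketched as a time-to-live (TTL) cache placed in front of an expensive provider call. This is an illustrative model, not MDS2 code: the `CachingProvider` class and its parameters are hypothetical. An infinite TTL stands in for "always caches" and a TTL of zero for "never caches".

```python
import time


class CachingProvider:
    """Wraps a (hypothetical) information-provider call with a TTL cache."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch          # expensive call that produces fresh data
        self.ttl = ttl_seconds      # how long a cached value stays valid
        self.clock = clock
        self.value = None
        self.expires_at = None      # None means nothing has been cached yet
        self.invocations = 0        # how often the provider was really invoked

    def query(self):
        now = self.clock()
        if self.expires_at is None or now >= self.expires_at:
            self.value = self.fetch()
            self.invocations += 1
            self.expires_at = now + self.ttl
        return self.value


always = CachingProvider(lambda: "cpu=2", ttl_seconds=float("inf"))
never = CachingProvider(lambda: "cpu=2", ttl_seconds=0)
for _ in range(5):
    always.query()
    never.query()
print(always.invocations, never.invocations)  # 1 5
```

Counting invocations makes the cost difference visible: with caching, the provider runs once regardless of how many queries arrive.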

9
Experimental Setup
  • We deployed and studied MDS v2.2 and v2.4
  • Both were instrumented with NetLogger v2.0.13
  • Server-side testbed: Lucky nodes at ANL
  • 7 dual-processor Linux boxes
  • Hostnames: lucky{0,1,3-7}.mcs.anl.gov
  • lucky0 and lucky6 ran Linux kernel 2.4.10 and the
    rest ran kernel 2.4.19
  • Two 1133 MHz Intel PIII CPUs (with a 512KB cache
    per CPU) and 512 MB RAM
  • Interconnect is 100 Mbps Ethernet

10
Experimental Setup (contd)
  • Client-side testbed at the University of Chicago
    (UC)
  • 20 Linux boxes
  • 15 machines equipped with a 1208 MHz uniprocessor
    and 256 MB RAM
  • 5 machines with 756 MHz CPUs and 256 MB RAM
  • Simulation of concurrent users
  • Simulated by multiple processes evenly
    distributed across all client machines
  • Continuous queries separated by a 1-second wait
    period
  • A 100 Mbps network connects ANL and UC
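
A minimal sketch of this user simulation, under stated assumptions: `distribute_users` and `simulate_user` are hypothetical helpers, the `query` callable stands in for a real MDS2 query, and the clock and sleep functions are injectable so the loop can be exercised without waiting. The actual test harness used in the study is not shown in the slides.

```python
import time


def distribute_users(n_users, n_machines):
    """Spread n_users as evenly as possible over n_machines."""
    base, extra = divmod(n_users, n_machines)
    return [base + 1 if i < extra else base for i in range(n_machines)]


def simulate_user(query, duration, wait=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Issue queries one after another, pausing `wait` seconds after each,
    until `duration` seconds have elapsed. Returns the number sent."""
    sent = 0
    deadline = clock() + duration
    while clock() < deadline:
        query()
        sent += 1
        sleep(wait)
    return sent


# 600 simulated users over the 20 client machines: 30 per machine.
print(distribute_users(600, 20))
```

In a real run, each machine would start its share of users as separate processes, each executing `simulate_user` for the 10-minute test window.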

11
Performance Metrics
  • Throughput
  • The average number of requests processed by an
    MDS2 service component per second
  • Observed Response Time (ORT) and Request
    Processing Time (RPT)
  • ORT: the average time from when the user sends a
    request until the response arrives, measured at
    the client side
  • RPT: the average time an MDS2 service component
    takes to handle a user request, measured at
    the server side
  • ORT is always greater than RPT
  • ORT = T_Client-Connect + T_Client-Bind + RPT
    + T_Client-EndConnect
  • RPT = T_Server-InitSearch + T_Server-SearchIndex
    + T_Server-Invoking + T_Server-GenResult
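
As a worked example of these metrics, the sketch below computes RPT, ORT, and throughput from per-phase times, assuming the phase terms are simply summed. The function names and all numbers are hypothetical; times are in milliseconds.

```python
SERVER_PHASES = ("Server-InitSearch", "Server-SearchIndex",
                 "Server-Invoking", "Server-GenResult")


def rpt(phases):
    """Request Processing Time: sum of the server-side phase times."""
    return sum(phases[name] for name in SERVER_PHASES)


def ort(phases):
    """Observed Response Time: client-side phases plus RPT."""
    return (phases["Client-Connect"] + phases["Client-Bind"]
            + rpt(phases) + phases["Client-EndConnect"])


def throughput(n_requests, seconds):
    """Average number of requests processed per second."""
    return n_requests / seconds


# Made-up average phase times (ms) for illustration only.
phases = {
    "Client-Connect": 40, "Client-Bind": 15,
    "Server-InitSearch": 5, "Server-SearchIndex": 10,
    "Server-Invoking": 300, "Server-GenResult": 20,
    "Client-EndConnect": 10,
}
print(rpt(phases), ort(phases))   # 335 400
print(throughput(1200, 600))      # 2.0
```

Because the client-side terms are non-negative durations added on top of RPT, ORT can never be smaller than RPT.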

12
Performance Metrics (contd)
  • CPU_Load
  • CPU_Load = CPU_User + CPU_System
  • CPU_User: the percentage of CPU time spent in
    user mode
  • CPU_System: the percentage of CPU time spent in
    system mode
  • Load1
  • Average number of processes ready to run over
    the last minute
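
One way to collect comparable numbers on a Linux host is sketched below. It is an assumption that the study sampled the kernel counters in this manner; the slides do not say which tool was used. `cpu_load` is a hypothetical helper that takes two samples of the aggregate `cpu` line from `/proc/stat`, and it counts "nice" time as user time, which is also an assumption. `os.getloadavg()` is the standard-library call for the 1-minute load average on Unix systems.

```python
import os


def cpu_load(before, after):
    """Return (CPU_User, CPU_System, CPU_Load) in percent between two
    samples of the aggregate 'cpu' line from /proc/stat."""
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    # Fields: user, nice, system, idle, then (on newer kernels) iowait,
    # irq, softirq, steal. Guest fields, if present, are skipped because
    # guest time is already counted in user time.
    delta = [y - x for x, y in zip(b, a)][:8]
    total = sum(delta)
    user = 100.0 * (delta[0] + delta[1]) / total   # user + nice (assumption)
    system = 100.0 * delta[2] / total
    return user, system, user + system


def load1():
    """1-minute load average of the local host (Unix only)."""
    return os.getloadavg()[0]


# Two made-up samples in the classic four-field format.
print(cpu_load("cpu 100 0 50 850", "cpu 160 0 70 970"))  # (30.0, 10.0, 40.0)
```

On a live system the two samples would be read from `/proc/stat` a fixed interval apart, for example once per second during the test.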

13
Outline
  • Problem
  • MDS2 and NetLogger instrumentation
  • Experimental setup
  • Experiment results and analysis
  • Conclusion and future work

14
Experiment 1: GRIS Scalability (with users)
  • 10 reporting Information Providers
  • Up to 600 users
  • 10 minutes querying
  • Query asks for all the data from all Information
    Providers (10KB)
  • Each data point is the average of 100 data

15
Experiment 1 Result: GRIS Query Phases Performance
  • Without data caching, the bottleneck lies in the
    server-side Server-Invoking phase
  • This is due to the high cost of invoking
    Information Providers
  • GRIS performance with data caching depends on the
    client-side Client-Connect time
  • V2.4 GRIS outperforms V2.2 GRIS, which is
    attributed to better memory use

16
Experiment 1 Result: Load1
  • The GRIS host has a higher load with more users
    because of more intense contention among queries
  • GRIS without data caching shows a lower load than
    GRIS with data caching because its processes are
    blocked waiting for resources

17
Experiment 1 Summary
  • Enabling caching at the MDS2 GRIS can bypass the
    performance bottleneck and support more users
  • MDS2 GRIS should run on a well-connected machine
  • Duplicating MDS2 GRIS can improve performance

18
Experiment 2: GIIS Scalability (with users)
  • 5 reporting GRISes, each with 10 Information
    Providers
  • Up to 600 users
  • 10 minutes querying
  • Query asks for all the data from all the
    reporting GRIS (50KB)
  • Each data point is the average of 100 data

19
Experiment 2 Result: GIIS Query Phases Performance
  • GIIS generally exhibits high scalability due to
    data caching
  • However, it is constrained by the client-side
    Client-Connect phase
  • V2.4 GIIS outperforms V2.2 GIIS
  • GIIS with data caching is similar to but more
    efficient than GRIS with data caching

20
Experiment 2 Result: Load1
  • GIIS host experiences a higher load with the
    increasing number of users

21
Experiment 2 Summary
  • GIIS with data caching is highly scalable and
    provides an efficient directory service
  • When serving a large number of users, its
    performance is constrained by the users'
    connection time
  • Duplicate the GIIS to maintain quality of service
    when there is a larger number of users

22
Conclusion
  • Studied the scalability of MDS2 at a finer
    granularity using NetLogger instrumentation
  • Located performance bottlenecks and constraints
    for MDS2 GRIS and GIIS
  • Caching or pre-fetching the data is much more
    important than we expected
  • Placing primary components at well-connected
    sites can improve performance too

23
Future Work
  • Do more NetLogger-assisted experiments to address
    other features of MDS2 GRIS and GIIS
  • Study more monitoring and information services
  • Study how access control affects performance
  • Perform WAN environment experiments

24
Contact Information
  • Me
  • Xuehai Zhang, University of Chicago
  • Email
  • hai_at_cs.uchicago.edu
  • Web
  • http://people.cs.uchicago.edu/hai
  • Advisor and co-author
  • Dr. Jennifer Schopf, Argonne National Lab
  • jms_at_mcs.anl.gov