Performance Comparison of Grid Information Services

1
Performance Comparison of Grid Information Services
  • Beth Plale, Computer Science Dept., Indiana University
  • Unified Relational GIS Project
  • Collaborative project with Peter Dinda, Northwestern University

2
Schemas in performance evaluation influenced by
  • "Key Concepts and Services of a Grid Information
    Service," Beth Plale, Peter Dinda, Gregor von
    Laszewski, IASTED Parallel and Distributed
    Computing Systems (PDCS), September 2002

3
Types of Resource Information
Grid Entity            Description
Organizations          Accountable bodies and owners of resources
People                 Resource admins, resource providers, GIS admins
Physical resources     Compute resources, network interfaces, benchmark results, number of users, load
Services               Job manager, load leveler, other GIS
Comm resources         Link capacity, switch capacity, error rate, drop rate
Software packages      BLAS, LAPACK, etc.
Event producers        Generators of event streams
Event channels         Event stream propagation vehicle
Event dictionaries     List of commonly used event types
Instruments            Radar systems, telescopes, etc.
Network paths          Available bandwidth and expected latency
Network topologies     Hosts, switches, routers
Wireless devices       Wireless hosts, wavepoints, cells, etc.
Virtual organizations  Groups of collaborators

4
Criteria for Inclusion in GIS
  • Definition: an object in the repository represents an
    entity in the real-world grid.
  • A grid entity has a representation in the GIS repository
    if the entity
      • can be described,
      • has value to more than one application, and
      • has persistency needs beyond a single application
        run.

5
Services Provided by GIS
  • Query interface: request for information through a
    query language
      • e.g., SELECT ... FROM ... WHERE in SQL
  • Update interface: request to add or update
    information in the repository
      • e.g., UPDATE in SQL
  • Management interface: activation and deactivation of
    the service
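
The three interfaces above can be sketched in a few lines. This is a minimal illustration only, assuming a relational repository; the class name `GridInfoService`, its method names, and the `host` table are hypothetical and do not come from the slides, and SQLite stands in for any SQL database.

```python
import sqlite3


class GridInfoService:
    """Toy GIS with query, update, and management interfaces (hypothetical names)."""

    def __init__(self):
        self._db = None  # no repository until the service is activated

    # Management interface: activation and deactivation of the service.
    def activate(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE host (name TEXT PRIMARY KEY, load REAL)")

    def deactivate(self):
        if self._db is not None:
            self._db.close()
            self._db = None

    def is_active(self):
        return self._db is not None

    # Update interface: add or update information in the repository.
    def update(self, name, load):
        self._require_active()
        self._db.execute(
            "INSERT OR REPLACE INTO host (name, load) VALUES (?, ?)",
            (name, load),
        )

    # Query interface: request information through a query language (SQL here).
    def query(self, sql, params=()):
        self._require_active()
        return self._db.execute(sql, params).fetchall()

    def _require_active(self):
        if self._db is None:
            raise RuntimeError("service is not active")


gis = GridInfoService()
gis.activate()
gis.update("node01", 0.75)
gis.update("node01", 0.25)  # same key: replaces the earlier row
gis.update("node02", 0.90)
lightly_loaded = gis.query("SELECT name FROM host WHERE load < ?", (0.5,))
gis.deactivate()
```

Keeping the query language visible in `query` mirrors the slide's SELECT example; an LDAP or XML back end would accept a different query string behind the same three interfaces.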

6
Additional GIS Functionality
  • Replication
      • Provision of replica transparency
  • Distribution (a grid-driven necessity)
      • Partitioning of information across sites
  • Security interface
      • Object level or column level?
      • Access control

7
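View of GIS Service Interoperability
[Figure: the GCE testbed portal sends XPath queries and receives XML documents that follow the GCE testbed XML schema. Three numbered paths lead to the back ends: (1) an XML database (Xindice), queried with XPath; (2) mySQL, reached through a converter that issues SQL queries; (3) LDAP, reached through LDAP queries.]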
8
Benchmark Evaluation of Alternate GIS Representations
  • Evaluation of three databases: relational
    (mySQL), LDAP (openLDAP), and XML (Xindice)
  • Database schemas derived from a single E-R diagram
    and based partly on GLUE v8
  • Benchmark set of query and update use cases
    derived from Grid job submission
  • Cost metric: minimized query response times,
    minimized update times, and minimized size of
    the resulting query set

9
Benchmark Evaluation Assumptions
  • Grid entities have complex relationships.
  • The questions asked of GIS data are becoming more
    complex.
  • Some entities require extremely rapid update
    rates.
  • Thus the cost metric considers multiple
    aspects:
      • minimized query response times,
      • minimized update times, and
      • minimized size of the resulting query set.
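
A minimal sketch of how the three quantities in the cost metric could be measured for one query and one update. The function name `measure`, the `host` table, and the sample data are assumptions made for illustration; SQLite stands in for the databases under test, and the slides do not say how the three quantities are combined into one score.

```python
import sqlite3
import time


def measure(db, query_sql, update_sql, update_params):
    """Return query time, update time, and result set size for one query/update pair."""
    start = time.perf_counter()
    rows = db.execute(query_sql).fetchall()
    query_seconds = time.perf_counter() - start

    start = time.perf_counter()
    db.execute(update_sql, update_params)
    db.commit()
    update_seconds = time.perf_counter() - start

    return {
        "query_seconds": query_seconds,
        "update_seconds": update_seconds,
        "result_rows": len(rows),
    }


# Made-up repository contents: ten hosts with loads 0.0, 0.1, ..., 0.9.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE host (name TEXT, load REAL)")
db.executemany(
    "INSERT INTO host VALUES (?, ?)",
    [(f"node{i:02d}", i / 10) for i in range(10)],
)
costs = measure(
    db,
    "SELECT name FROM host WHERE load < 0.5",
    "UPDATE host SET load = ? WHERE name = ?",
    (0.1, "node09"),
)
```

In a real comparison each measurement would be repeated many times against each back end; a single timing like this is too noisy to rank databases.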

10
Benchmark Evaluation
[Figure: the input schemas (GCE XML and GLUE v8) are represented as an E-R diagram, which is transformed into schemas for relational (mySQL), LDAP (openLDAP), and XML (Xindice) databases. The databases are populated by scripts and existing data and evaluated against the Grid GIS Benchmark Use Cases, which draw on the GCE job submission use cases.]
11
Set  Date   Site                      Object classes  Classes w/ instances  Object instances
I    05-02  large multi-site project  30              10                    242
II   01-02  large academic HPC site   19              5                     106
III  11-00  DOE site                  31              19                    17531

Top 5 classes per set (figures and totals as shown on the slide)
Set I:   MDSDevice 36.5, HostInfo 24.5, MDSDeviceGroup 13.5, top 8.5, MDSSoftware 7.0; total 90.0
Set II:  GlobusQueue 42.0, GlobusServicesJobMgr 26.0, GlobusNetworkInterface 17.5, GlobusPhysicalResource 8.0, GlobusDaemon 6.0; total 100.0
Set III: GlobusFileInstance 80.0, GlobusQueueEntry 6.5, GlobusQueue 3.2, GlobusOrganization 1.8, GlobusServiceJobManager 1.8; total 94.5

12
E-R Diagram
[Figure: E-R diagram, based on GLUE v8. Entities shown: computing elements, clusters, subclusters, hosts (compute nodes), nodes, network nodes, network cards, users, user accounts, application sources, applications, end points, end-to-end connections, network paths, and network benchmarks. Relationship labels shown: has, use, run on, is-a, "instan from". Attribute labels shown: host, port, protocol; traceroute packet loss, latency.roundtripDelay.ping, bandwidth.avail.TCP.singleStream.]
13
Relational (table) representation
[Figure: relational schema. Tables shown: computing elements, clusters, subclusters, users, user accounts, application sources, applications, network cards, end points, and end-to-end connections. Attribute labels shown: host, port, protocol; traceroute packet loss, latency.roundtripDelay.ping, bandwidth.avail.TCP.singleStream.]
14
Hierarchical representation
[Figure: hierarchy rooted at EDTtop. Nodes shown: network nodes, compute elements, user, network path, clusters, application sources, connections, user accounts, application, subclusters, hosts (compute nodes), and endpoints.]
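
To make the hierarchical representation concrete, here is a small sketch of a tree-shaped repository queried by path, as an XML database would be. The element names, attribute names, and nesting (compute element, then cluster, then subcluster, then host) are assumptions made for illustration; the transcript lists the node names but does not preserve how they nest. Python's standard `xml.etree.ElementTree` stands in for Xindice and supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Made-up instance data in an assumed nesting.
doc = ET.fromstring("""
<top>
  <computeElement name="ce1">
    <cluster name="c1">
      <subcluster name="sc1" cpuModel="T3E">
        <host name="node01"/>
        <host name="node02"/>
      </subcluster>
      <subcluster name="sc2" cpuModel="SP2">
        <host name="node03"/>
      </subcluster>
    </cluster>
  </computeElement>
</top>
""")

# Path query: hosts that sit under any subcluster whose CPU model is T3E.
t3e_hosts = [
    host.get("name")
    for host in doc.findall(".//subcluster[@cpuModel='T3E']/host")
]
```

A query that follows the tree downward is short in this form. Queries that relate entities in different branches, such as users and clusters, have no single path to follow, which is where the relational form is more natural.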
15
Benchmark Set of Use Cases of GIS Query and Update
  • Use cases based on job submission.
      • Examples drawn from HotPage (M. Thomas)
  • Query 1: Suppose a user is part of the NPACI
    organization and knows his/her binary runs better
    on a T3E.
      • "Of machines in NPACI organization, give me list
        of T3Es and their location for which availability
        is good, a binary is resident, and I have an
        account."

16
Return machines and locations
SELECT C.CPUmodel, C.name, C.location
FROM Cluster as C, SubCluster as SC, Host as H,
     Application as A, UserAccount as UA, User as U
WHERE
  -- Cluster is NPACI and user has binary on machine
  C.Organization = 'NPACI' and
  SC.OwningCluster = C.ClusterName and
  SC.CPUModel = 'T3E' and
  A.OSName = SC.OSName and
  A.Owner = 'Jane Lee' and
  A.Location = C.Location
  -- Availability is good
  For All H where H.OwningCluster = C.ClusterName
    avg(H.SMPLoad1minX100) < 0.50
  -- User has valid account on cluster
  C.ClusterUniqueID = UA.ID and
  UA.ID = U.ID and
  U.Name = 'Jane Lee' and
  UA.ExpireDate > 21-July-2002 and
  UA.ActivateDate < 21-July-2002
-> GLUE v8
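
The slide's query is pseudo-SQL ("For All" is not an SQL construct). The following is one possible runnable rendering, offered as a sketch rather than the author's implementation. Assumptions: SQLite stands in for mySQL, the "For All ... avg" clause is expressed as a correlated subquery, the CPU model is selected from SubCluster because that is where the WHERE clause reads it, dates are stored as ISO strings, and all table contents (clusters "alpha" and "beta", sites "SiteA" and "SiteB") are made up. The join of `ClusterUniqueID` to `UserAccount.ID` is copied from the slide as written.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Cluster (ClusterName TEXT, ClusterUniqueID INTEGER,
                      Organization TEXT, Location TEXT);
CREATE TABLE SubCluster (OwningCluster TEXT, CPUModel TEXT, OSName TEXT);
CREATE TABLE Host (OwningCluster TEXT, SMPLoad1minX100 REAL);
CREATE TABLE Application (Owner TEXT, OSName TEXT, Location TEXT);
CREATE TABLE UserAccount (ID INTEGER, ActivateDate TEXT, ExpireDate TEXT);
CREATE TABLE "User" (ID INTEGER, Name TEXT);

-- Made-up data: two NPACI T3E clusters, only "alpha" is lightly loaded.
INSERT INTO Cluster VALUES ('alpha', 1, 'NPACI', 'SiteA'),
                           ('beta',  2, 'NPACI', 'SiteB');
INSERT INTO SubCluster VALUES ('alpha', 'T3E', 'UNICOS'),
                              ('beta',  'T3E', 'UNICOS');
INSERT INTO Host VALUES ('alpha', 0.30), ('alpha', 0.40),
                        ('beta',  0.80), ('beta',  0.90);
INSERT INTO Application VALUES ('Jane Lee', 'UNICOS', 'SiteA'),
                               ('Jane Lee', 'UNICOS', 'SiteB');
INSERT INTO UserAccount VALUES (1, '2002-01-01', '2003-01-01'),
                               (2, '2002-01-01', '2003-01-01');
INSERT INTO "User" VALUES (1, 'Jane Lee'), (2, 'Jane Lee');
""")

QUERY_1 = """
SELECT SC.CPUModel, C.ClusterName, C.Location
FROM Cluster AS C
JOIN SubCluster AS SC ON SC.OwningCluster = C.ClusterName
JOIN Application AS A ON A.OSName = SC.OSName AND A.Location = C.Location
JOIN UserAccount AS UA ON UA.ID = C.ClusterUniqueID
JOIN "User" AS U ON U.ID = UA.ID
WHERE C.Organization = 'NPACI'
  AND SC.CPUModel = 'T3E'
  AND A.Owner = :user
  AND U.Name = :user
  AND UA.ActivateDate < :today
  AND UA.ExpireDate > :today
  -- availability is good: average load over the cluster's hosts is below 0.50
  AND (SELECT AVG(H.SMPLoad1minX100) FROM Host AS H
       WHERE H.OwningCluster = C.ClusterName) < 0.50
"""

rows = db.execute(QUERY_1, {"user": "Jane Lee", "today": "2002-07-21"}).fetchall()
```

With this sample data only cluster "alpha" qualifies, because the average load on "beta" is above the threshold.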
17
  • "Of machines in NPACI organization, give me list
    of T3Es and their location for which availability
    is good, a binary is resident, and I have an
    account."
  • "Availability is good" could be defined
    differently.
      • Defined here as: average load over all nodes
        in an SMP is less than 0.50.
      • More difficult: existence of 20 contiguous
        nodes.
  • "Binary is resident" is fairly easy; "binary is
    nearby" is a harder question to answer.
  • "Show histographic usage of my job" or "show
    historical usage of machine X for task Y," where Y
    is job submission or transfer rate to HPSS.
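
The "20 contiguous nodes" definition of availability is harder to state in a query language than an average, but simple procedurally. A minimal sketch, assuming nodes have a linear order and "contiguous" means adjacent in that order (a real machine topology may define adjacency differently); the function name is hypothetical.

```python
def has_contiguous_free(free, k):
    """Return True if `free` (booleans in node order) contains k consecutive True values."""
    if k <= 0:
        return True
    run = 0  # length of the current streak of free nodes
    for is_free in free:
        run = run + 1 if is_free else 0
        if run >= k:
            return True
    return False


# 5 free nodes, 1 busy node, then 20 free nodes: a block of 20 exists.
enough = has_contiguous_free([True] * 5 + [False] + [True] * 20, 20)

# 38 free nodes in total, but split 19 + 19 by one busy node: no block of 20.
not_enough = has_contiguous_free([True] * 19 + [False] + [True] * 19, 20)
```

The second case shows why a simple count of free nodes, which is easy to query, does not answer the question.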

19
Benchmark Evaluation
[Figure: benchmark evaluation overview. Input schemas (GCE XML and GLUE v8), E-R diagram, the three database schemas (relational (mySQL), LDAP (openLDAP), XML (Xindice)), the Grid GIS Benchmark Use Cases, the GCE job submission use cases, and scripts and existing data.]
http://www.cs.indiana.edu/plale