1
Grid Performance Engineering
  • Mark Baker
  • Distributed Systems Group
  • University of Portsmouth
  • http://dsg.port.ac.uk/

2
Contents
  • Background,
  • Observations,
  • Suggested areas to investigate,
  • Conclusions.

3
Background (late 80s)
  • First started with transputers, here in
    Edinburgh:
  • Oil reservoir simulations,
  • Parallelised part of the simulator via a task
    farm,
  • OCCAM and Fortran 77,
  • Wanted to show the benefits of parallel systems.
  • Early 90s: a lot of work showing that the early
    parallel systems gave the right order of
    benefit to justify their use in various
    applications: CFD, FEM, MD, graphics.

4
Background (early 90s)
  • 1993: became a member of the group assigned to
    benchmark a set of parallel machines for a UK
    acquisition:
  • TMC CM-5, KSR2, Meiko, Cray and Intel,
  • 30-ish codes: low-level, kernels, and full
    applications, plus system-level scripts,
  • Code in Fortran with PARMACS,
  • Vendors also used CMF and Cray-PVM versions,
  • Purpose of the exercise was to choose the best
    machine to act as a general purpose UK HPC
    platform.

5
Background (mid 90s)
  • Worked with Roger, Tony and others on the Genesis
    benchmark suite:
  • A set of low-level codes, kernels and
    applications,
  • Fortran with PARMACS, later MPI,
  • Purpose of the codes was to provide a
    standardised and recognised set of codes for the
    evaluation of parallel systems.
  • The Genesis suite was later incorporated, with
    LAPACK, into the Parkbench suite:
  • Parkbench was organised by Tony and Jack and met
    for several years,
  • Fortran and MPI,
  • Later added the NPB with MPI,
  • Another attempt to provide a standardised and
    recognised set of codes for the evaluation of
    parallel systems.

6
Background (late 90s)
  • Looking at MPI on MS Windows:
  • Used a small number of low-level Fortran and C
    codes.
  • Aim of this work was to understand the potential
    of Windows clusters against UNIX ones:
  • Show performance hits and bottlenecks.
  • Started working on PEMCS, see
    dsg.port.ac.uk/journals/PEMCS/
  • Slight digression here!

7
PEMCS
8
Background (later 90s)
  • Worked with Geoffrey and colleagues on mpiJava
    (MPJ) - a Java MPI-like message-passing
    interface.
  • Benchmarking aims were related to:
  • Comparing and contrasting the performance of MPJ
    on different platforms,
  • Showing that the hit of using MPJ was not so
    great.

9
Background (early 2000s)
  • Looking at the effects of working with widely
    distributed infrastructure and resources:
  • Stressing the Jini LUS (lookup service) in order
    to understand its capabilities and performance;
    wanted answers to questions like:
  • How many objects can be stored?
  • How many clients can simultaneously access the
    LUS?
  • How long does a search take?
  • What are the effects on system performance of the
    lease-renewal cycle?
  • Aim of recent efforts with performance testing is
    understanding the capabilities and configuration
    of components in a distributed environment; a
    stress-harness sketch follows below.
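
To make that kind of stress test concrete, here is a minimal sketch of a concurrent timing harness. LookupClient is a hypothetical stand-in for the real service proxy (it is not the Jini ServiceRegistrar API), and the client count, per-client search count and simulated remote call are all placeholders:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a concurrent stress harness for a lookup service.
// LookupClient is a hypothetical stand-in, NOT the Jini API.
public class LookupStress {
    interface LookupClient {
        Object search(String template) throws Exception; // hypothetical
    }

    public static void main(String[] args) throws Exception {
        final int clients = 50;       // simulated concurrent clients
        final int searchesEach = 200; // searches per client
        final LookupClient client = template -> {
            Thread.sleep(1);          // placeholder for a real remote call
            return template;
        };

        ExecutorService pool = Executors.newFixedThreadPool(clients);
        final AtomicLong totalNanos = new AtomicLong();
        final CountDownLatch done = new CountDownLatch(clients);

        for (int c = 0; c < clients; c++) {
            pool.submit(() -> {
                try {
                    for (int i = 0; i < searchesEach; i++) {
                        long t0 = System.nanoTime();
                        client.search("service-template");
                        totalNanos.addAndGet(System.nanoTime() - t0);
                    }
                } catch (Exception ignored) {
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        long searches = (long) clients * searchesEach;
        System.out.printf("mean search latency %.2f ms%n",
                totalNanos.get() / searches / 1e6);
    }
}

Sweeping the client count upwards while watching the mean latency is one way to find where a lookup service saturates.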

10
Some Observations
  • Machine architectures have become increasingly
    complicated:
  • Interconnects, memory hierarchy, caching,
  • Greater inter-dependence of different system
    components (h/w and s/w).
  • Performance metrics vary depending on the
    stakeholder's viewpoint:
  • CPU, Disk IO, out-of-core, Graphics, Comms.
  • No ONE benchmark (or suite of them) suits
    everyone; in the end it depends on the
    stakeholder and their application needs.
  • Increasing recognition, as we move to a
    distributed infrastructure, of the need to
    understand the individual components that it
    consists of!

11
Some Observations
  • Few funded efforts to understand system
    performance; most are unfunded and voluntary
    efforts, e.g. Genesis, Parkbench.
  • Often trying to compare apples with pears, such
    as comparing common operations in HPF and MPI.
  • Lots of knowledge and expertise in the
    traditional benchmarking areas, i.e. trying to
    understand single CPUs, SMPs and MPPs; not
    suggesting this is a solved problem area though!
  • The increasing popularity of wide-area computing
    means we need to revisit what we mean by
    performance evaluation and modelling.
  • Now a view of the Grid.

12
The Grid
(Diagram: hubs interconnecting Grid-enabled sites.)
13
Grid-based systems
  • Assuming I want to run an application on the
    Grid:
  • I'd be fairly happy to pick up semi-standard
    benchmarks to look at the performance of a system
    at a Grid site (say a cluster or SP2).
  • There are a bunch of tools for looking at and
    analysing TCP/IP performance, mainly via ICMP
    (a simple probe sketch follows below).
  • Obviously these only show past performance!
  • Without real QoS, performance must be a guess!
  • I've no real idea of the performance or
    capabilities of the software components that make
    up the Grid infrastructure!
  • Such as agents and brokers for scheduling and
    caching, or communications.
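
As an example of the ICMP-style probing mentioned above, a minimal Java sketch timing reachability probes; the host name is a placeholder, and isReachable() only sends an ICMP echo when the JVM has the privilege to do so (otherwise it falls back to a TCP probe on port 7):

import java.net.InetAddress;

// Sketch of a reachability/RTT probe against a remote Grid site.
// The hostname is a placeholder.
public class RttProbe {
    public static void main(String[] args) throws Exception {
        InetAddress host = InetAddress.getByName("grid-site.example.org");
        for (int i = 0; i < 10; i++) {
            long t0 = System.nanoTime();
            boolean up = host.isReachable(2000); // 2 s timeout
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.printf("probe %d: reachable=%b rtt~%d ms%n", i, up, ms);
        }
    }
}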

14
Grid Systems
  • We need metrics and measurements that help us
    understand Grid-based systems.
  • This will help reveal to us the factors that will
    affect the way we configure and use the Grid.
  • From a CS perspective, some key areas for further
    investigation are:
  • Inter-Grid site communications,
  • Information Services,
  • Metadata processing,
  • Events and Security.

15
Inter-Grid site communications
  • Communications performance: simple bandwidth,
    latency and jitter measurements between Grid
    sites (see the sketch after this list):
  • Maybe GridFTP tests,
  • Did something similar for the EuroPort project
    back in the mid 90s,
  • Speed and latency change on a minute-by-minute
    basis (diurnal cycle),
  • Perhaps explore staging and caching!
  • Data can be used to predict inter-site
    communications capabilities.
  • Performance of HTTP tunnelling protocols, maybe
    via proxy servers.
  • SOAP benchmarks, performance and processing.
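
A minimal sketch of the simple latency/jitter measurement meant above, using repeated timed TCP connects. The host and port are placeholders (2811 is shown as the conventional GridFTP control port), and jitter is taken as the mean absolute difference between consecutive samples:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch measuring TCP connect latency and jitter between sites.
public class LatencyJitter {
    public static void main(String[] args) throws IOException {
        String host = "grid-site.example.org"; // placeholder
        int port = 2811;                       // e.g. GridFTP control port
        int samples = 20;
        double[] rtt = new double[samples];

        for (int i = 0; i < samples; i++) {
            long t0 = System.nanoTime();
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), 3000);
            }
            rtt[i] = (System.nanoTime() - t0) / 1e6; // ms
        }

        double mean = 0, jitter = 0;
        for (double r : rtt) mean += r;
        mean /= samples;
        for (int i = 1; i < samples; i++) jitter += Math.abs(rtt[i] - rtt[i - 1]);
        jitter /= (samples - 1);

        System.out.printf("mean connect latency %.2f ms, jitter %.2f ms%n",
                mean, jitter);
    }
}

Logged over a day, such samples would show the diurnal cycle mentioned above and could feed a simple predictor of inter-site capability.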

16
Information Services
  • Information Service capabilities and scalability,
    so we can choose the best system and
    configuration for deployment.
  • Produce a range of tests that can:
  • Compare implementations of the same server,
  • Load and search small, medium and large static
    info sets,
  • Update dynamic data!
  • Serving tests:
  • Many clients,
  • Max objects,
  • Varying access patterns,
  • Caching strategies,
  • Lots to learn from database tests here.
  • Compare different information servers!
  • UDDI, LUS, LDAP, DNS, JXTA combinations! (A
    search-timing sketch follows below.)
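
As a sketch of the kind of search-timing test meant here, the following times a subtree search against an LDAP-backed information service via JNDI; the server URL, search base and filter are placeholders:

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

// Sketch timing an LDAP search via JNDI; URL/base/filter are placeholders.
public class LdapSearchTimer {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://info-server.example.org:389");
        DirContext ctx = new InitialDirContext(env);

        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        long t0 = System.nanoTime();
        NamingEnumeration<SearchResult> results =
                ctx.search("dc=example,dc=org", "(objectClass=*)", controls);
        int n = 0;
        while (results.hasMore()) { results.next(); n++; }
        long ms = (System.nanoTime() - t0) / 1_000_000;

        System.out.printf("retrieved %d entries in %d ms%n", n, ms);
        ctx.close();
    }
}

Repeating the same timing loop against UDDI, LUS or DNS front-ends would give the cross-server comparison suggested above.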

17
Metadata processing
  • Data is increasingly being described using
    metadata languages.
  • Plethora of schemas and markup languages.
  • It appears parsing and using metadata efficiently
    is becoming vital.
  • Produce a range of tests to look at the
    components for using/parsing metadata (see the
    sketch below):
  • Raw bytes/sec,
  • Marshalled/unmarshalled size,
  • Others.
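
A minimal sketch of the raw bytes/sec test suggested above, timing a SAX parse of an XML metadata file while discarding the events; the file name is a placeholder:

import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Sketch measuring raw metadata-parsing throughput with SAX.
public class ParseThroughput {
    public static void main(String[] args) throws Exception {
        File xml = new File("metadata-sample.xml"); // placeholder input
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        long t0 = System.nanoTime();
        parser.parse(xml, new DefaultHandler()); // parse, discard events
        double secs = (System.nanoTime() - t0) / 1e9;

        System.out.printf("parsed %d bytes in %.3f s (%.1f KB/s)%n",
                xml.length(), secs, xml.length() / 1024.0 / secs);
    }
}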

18
Events and Security
  • Greater use is being made of event-based
    systems:
  • Grid Monitoring Architecture (GMA)
    publisher/subscriber.
  • Maybe measure various aspects of the event-system
    architecture:
  • How long to subscribe?
  • How long to send events to multiple subscribers?
  • Recognise the need for lightweight and efficient
    event services.
  • Security infrastructure can have an intrusive
    impact on overall system performance.
  • Effects of:
  • Firewalls, a potential bottleneck!
  • SSL socket creation and handshaking (see the
    sketch below),
  • Token processing, and other aspects.
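
To illustrate the SSL cost mentioned above, a minimal sketch timing socket creation plus handshake against a placeholder host; note that JSSE may resume a cached session on later iterations, making them cheaper than the first:

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Sketch timing SSL/TLS socket creation and handshaking.
public class SslHandshakeTimer {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory =
                (SSLSocketFactory) SSLSocketFactory.getDefault();
        for (int i = 0; i < 5; i++) {
            long t0 = System.nanoTime();
            try (SSLSocket socket =
                     (SSLSocket) factory.createSocket("secure.example.org", 443)) {
                // forces the handshake; later runs may resume a cached session
                socket.startHandshake();
            }
            System.out.printf("handshake %d: %d ms%n",
                    i, (System.nanoTime() - t0) / 1_000_000);
        }
    }
}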

19
Conclusions
  • Need various aspects of performance evaluation
    and modelling for a variety of uses, depending
    on the stakeholders, e.g.:
  • Proof of concepts (algorithms, paradigms),
  • Comparing hardware and software architectures,
  • Optimising applications.
  • In a Grid application, performance has three
    broad areas of concern:
  • System capabilities,
  • Network,
  • Software infrastructure.
  • The first two points come under traditional
    benchmarking, which is fairly well established
    and understood (CPU and network), though maybe
    that's debatable!

20
Areas of Future Interest
  • To understand the performance of a Grid we could
    use statistical methods: gather historical data
    that can be used to predict the performance of
    Grid applications.
  • In a gross sense this is OK, but it fails to
    address the fact that we are using a dynamically
    changing infrastructure and need to incorporate
    new components!
  • From a CS perspective, it is evident that we need
    to also understand other aspects of distributed
    environments, including:
  • Inter-Grid site communications,
  • Information Services,
  • Metadata processing,
  • Events and Security.

21
Questions?