1
Grid Performance Engineering
  • Mark Baker
  • Distributed Systems Group
  • University of Portsmouth
  • http://dsg.port.ac.uk/

2
Contents
  • Background,
  • Observations,
  • Suggested areas to investigate,
  • Conclusions.

3
Background (late 80s)
  • First started with transputers, here in
    Edinburgh:
  • Oil reservoir simulations,
  • Parallelised part of the simulator via a task
    farm,
  • OCCAM and Fortran 77,
  • Wanted to show the benefits of parallel systems.
  • Early 90s: a lot of work showing that the early
    parallel systems gave the right order of
    benefit to justify their use in various
    applications: CFD, FEM, MD, graphics.

4
Background (early 90s)
  • 1993: became a member of the group assigned to
    benchmark a set of parallel machines for a UK
    acquisition:
  • TMC CM-5, KSR2, Meiko, Cray and Intel,
  • 30-ish codes: low-level, kernels, and full
    applications, plus system-level scripts,
  • Code in Fortran with PARMACS,
  • Vendors also used CMF and Cray-PVM versions,
  • Purpose of the exercise was to choose the best
    machine to act as a general purpose UK HPC
    platform.

5
Background (mid 90s)
  • Worked with Roger, Tony and others on the Genesis
    benchmark suite:
  • A set of low-level codes, kernels and
    applications,
  • Fortran with PARMACS, later MPI,
  • Purpose of the codes was to provide a
    standardised and recognised set of codes for the
    evaluation of parallel systems.
  • The Genesis suite was later incorporated, with
    LAPACK, into the Parkbench suite:
  • Parkbench was organised by Tony and Jack and met
    for several years,
  • Fortran and MPI,
  • Later added the NPB with MPI,
  • Another attempt to provide a standardised and
    recognised set of codes for the evaluation of
    parallel systems.

6
Background (late 90s)
  • Looking at MPI on MS Windows:
  • Used a small number of low-level Fortran and C
    codes.
  • Aim of this work was to understand the potential
    of Windows clusters against UNIX ones:
  • Show performance hits and bottlenecks.
  • Started working on PEMCS, see
    dsg.port.ac.uk/journals/PEMCS/
  • Slight digression here!

7
PEMCS
8
Background (later 90s)
  • Worked with Geoffrey and colleagues on mpiJava
    (MPJ) - a Java MPI-like message-passing
    interface.
  • Benchmarking aims were related to:
  • Comparing and contrasting the performance of MPJ
    on different platforms,
  • Showing that the hit of using MPJ was not so
    great.

9
Background (early 2000s)
  • Looking at the effects of working with widely
    distributed infrastructure and resources:
  • Stressing the Jini LUS (lookup service) in order
    to understand its capabilities and performance;
    wanted answers to questions like:
  • How many objects can be stored?
  • How many clients can simultaneously access the
    LUS?
  • How long does a search take?
  • What are the effects on system performance of the
    lease-renewal cycle?
  • Aim of recent efforts with performance testing is
    understanding the capabilities and configuration
    of components in a distributed environment; a
    stress-harness sketch follows below.
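
To make that kind of stress test concrete, here is a minimal sketch of a concurrent timing harness. LookupClient is a hypothetical stand-in for the real service proxy (it is not the Jini ServiceRegistrar API), and the client count, per-client search count and simulated remote call are all placeholders:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a concurrent stress harness for a lookup service.
// LookupClient is a hypothetical stand-in, NOT the Jini API.
public class LookupStress {
    interface LookupClient {
        Object search(String template) throws Exception; // hypothetical
    }

    public static void main(String[] args) throws Exception {
        final int clients = 50;       // simulated concurrent clients
        final int searchesEach = 200; // searches per client
        final LookupClient client = template -> {
            Thread.sleep(1);          // placeholder for a real remote call
            return template;
        };

        ExecutorService pool = Executors.newFixedThreadPool(clients);
        final AtomicLong totalNanos = new AtomicLong();
        final CountDownLatch done = new CountDownLatch(clients);

        for (int c = 0; c < clients; c++) {
            pool.submit(() -> {
                try {
                    for (int i = 0; i < searchesEach; i++) {
                        long t0 = System.nanoTime();
                        client.search("service-template");
                        totalNanos.addAndGet(System.nanoTime() - t0);
                    }
                } catch (Exception ignored) {
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        long searches = (long) clients * searchesEach;
        System.out.printf("mean search latency %.2f ms%n",
                totalNanos.get() / searches / 1e6);
    }
}

Sweeping the client count upwards while watching the mean latency is one way to find where a lookup service saturates.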

10
Some Observations
  • Machine architectures have become increasingly
    complicated:
  • Interconnects, memory hierarchy, caching,
  • Greater inter-dependence of different system
    components (h/w and s/w).
  • Performance metrics vary depending on the
    stakeholder's viewpoint:
  • CPU, Disk IO, out-of-core, Graphics, Comms.
  • No ONE benchmark (or suite of them) suits
    everyone; in the end it depends on the
    stakeholder and their application needs.
  • Increasing recognition, as we move to a
    distributed infrastructure, of the need to
    understand the individual components that it
    consists of!

11
Some Observations
  • Few funded efforts to understand system
    performance; most are unfunded and voluntary
    efforts, e.g. Genesis, Parkbench.
  • Often trying to compare apples with pears, such
    as comparing common operations in HPF and MPI.
  • Lots of knowledge and expertise in the
    traditional benchmarking areas, i.e. trying to
    understand single CPUs, SMPs and MPPs; not
    suggesting this is a solved problem area though!
  • The increasing popularity of wide-area computing
    means we need to revisit what we mean by
    performance evaluation and modelling.
  • Now a view of the Grid.

12
The Grid
(Diagram: hubs interconnecting Grid-enabled sites.)
13
Grid-based systems
  • Assuming I want to run an application on the
    Grid:
  • I'd be fairly happy to pick up semi-standard
    benchmarks to look at the performance of a system
    at a Grid site (say a cluster or SP2).
  • There are a bunch of tools for looking at and
    analysing TCP/IP performance, mainly via ICMP
    (a simple probe sketch follows below).
  • Obviously these only show past performance!
  • Without real QoS, performance must be a guess!
  • I've no real idea of the performance or
    capabilities of the software components that make
    up the Grid infrastructure!
  • Such as agents and brokers for scheduling and
    caching, or communications.
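
As an example of the ICMP-style probing mentioned above, a minimal Java sketch timing reachability probes; the host name is a placeholder, and isReachable() only sends an ICMP echo when the JVM has the privilege to do so (otherwise it falls back to a TCP probe on port 7):

import java.net.InetAddress;

// Sketch of a reachability/RTT probe against a remote Grid site.
// The hostname is a placeholder.
public class RttProbe {
    public static void main(String[] args) throws Exception {
        InetAddress host = InetAddress.getByName("grid-site.example.org");
        for (int i = 0; i < 10; i++) {
            long t0 = System.nanoTime();
            boolean up = host.isReachable(2000); // 2 s timeout
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.printf("probe %d: reachable=%b rtt~%d ms%n", i, up, ms);
        }
    }
}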

14
Grid Systems
  • We need metrics and measurements that help us
    understand Grid-based systems.
  • This will help reveal to us the factors that will
    affect the way we configure and use the Grid.
  • From a CS perspective, some key areas for further
    investigation are:
  • Inter-Grid site communications,
  • Information Services,
  • Metadata processing,
  • Events and Security.

15
Inter-Grid site communications
  • Communications performance: simple bandwidth,
    latency and jitter measurements between Grid
    sites (see the sketch after this list):
  • Maybe GridFTP tests,
  • Did something similar for the EuroPort project
    back in the mid 90s,
  • Speed and latency change on a minute-by-minute
    basis (diurnal cycle),
  • Perhaps explore staging and caching!
  • Data can be used to predict inter-site
    communications capabilities.
  • Performance of HTTP tunnelling protocols, maybe
    via proxy servers.
  • SOAP benchmarks, performance and processing.
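
A minimal sketch of the simple latency/jitter measurement meant above, using repeated timed TCP connects. The host and port are placeholders (2811 is shown as the conventional GridFTP control port), and jitter is taken as the mean absolute difference between consecutive samples:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch measuring TCP connect latency and jitter between sites.
public class LatencyJitter {
    public static void main(String[] args) throws IOException {
        String host = "grid-site.example.org"; // placeholder
        int port = 2811;                       // e.g. GridFTP control port
        int samples = 20;
        double[] rtt = new double[samples];

        for (int i = 0; i < samples; i++) {
            long t0 = System.nanoTime();
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), 3000);
            }
            rtt[i] = (System.nanoTime() - t0) / 1e6; // ms
        }

        double mean = 0, jitter = 0;
        for (double r : rtt) mean += r;
        mean /= samples;
        for (int i = 1; i < samples; i++) jitter += Math.abs(rtt[i] - rtt[i - 1]);
        jitter /= (samples - 1);

        System.out.printf("mean connect latency %.2f ms, jitter %.2f ms%n",
                mean, jitter);
    }
}

Logged over a day, such samples would show the diurnal cycle mentioned above and could feed a simple predictor of inter-site capability.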

16
Information Services
  • Information Service capabilities and scalability,
    so we can choose the best system and
    configuration for deployment.
  • Produce a range of tests that can:
  • Compare implementations of the same server,
  • Load and search small, medium and large static
    info sets,
  • Update dynamic data!
  • Serving tests:
  • Many clients,
  • Max objects,
  • Varying access patterns,
  • Caching strategies,
  • Lots to learn from database tests here.
  • Compare different information servers!
  • UDDI, LUS, LDAP, DNS, JXTA combinations! (A
    search-timing sketch follows below.)
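
As a sketch of the kind of search-timing test meant here, the following times a subtree search against an LDAP-backed information service via JNDI; the server URL, search base and filter are placeholders:

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

// Sketch timing an LDAP search via JNDI; URL/base/filter are placeholders.
public class LdapSearchTimer {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://info-server.example.org:389");
        DirContext ctx = new InitialDirContext(env);

        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        long t0 = System.nanoTime();
        NamingEnumeration<SearchResult> results =
                ctx.search("dc=example,dc=org", "(objectClass=*)", controls);
        int n = 0;
        while (results.hasMore()) { results.next(); n++; }
        long ms = (System.nanoTime() - t0) / 1_000_000;

        System.out.printf("retrieved %d entries in %d ms%n", n, ms);
        ctx.close();
    }
}

Repeating the same timing loop against UDDI, LUS or DNS front-ends would give the cross-server comparison suggested above.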

17
Metadata processing
  • Data is increasingly being described using
    metadata languages.
  • Plethora of schemas and markup languages.
  • It appears parsing and using metadata efficiently
    is becoming vital.
  • Produce a range of tests to look at the
    components for using/parsing metadata (see the
    sketch below):
  • Raw bytes/sec,
  • Marshalled/unmarshalled size,
  • Others.
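
A minimal sketch of the raw bytes/sec test suggested above, timing a SAX parse of an XML metadata file while discarding the events; the file name is a placeholder:

import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Sketch measuring raw metadata-parsing throughput with SAX.
public class ParseThroughput {
    public static void main(String[] args) throws Exception {
        File xml = new File("metadata-sample.xml"); // placeholder input
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        long t0 = System.nanoTime();
        parser.parse(xml, new DefaultHandler()); // parse, discard events
        double secs = (System.nanoTime() - t0) / 1e9;

        System.out.printf("parsed %d bytes in %.3f s (%.1f KB/s)%n",
                xml.length(), secs, xml.length() / 1024.0 / secs);
    }
}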

18
Events and Security
  • Greater use is being made of event-based
    systems:
  • Grid Monitoring Architecture (GMA)
    publisher/subscriber.
  • Maybe measure various aspects of the event-system
    architecture:
  • How long to subscribe?
  • How long to send events to multiple subscribers?
  • Recognise the need for lightweight and efficient
    event services.
  • Security infrastructure can have an intrusive
    impact on overall system performance.
  • Effects of:
  • Firewalls, a potential bottleneck!
  • SSL socket creation and handshaking (see the
    sketch below),
  • Token processing, and other aspects.
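
To illustrate the SSL cost mentioned above, a minimal sketch timing socket creation plus handshake against a placeholder host; note that JSSE may resume a cached session on later iterations, making them cheaper than the first:

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Sketch timing SSL/TLS socket creation and handshaking.
public class SslHandshakeTimer {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory =
                (SSLSocketFactory) SSLSocketFactory.getDefault();
        for (int i = 0; i < 5; i++) {
            long t0 = System.nanoTime();
            try (SSLSocket socket =
                     (SSLSocket) factory.createSocket("secure.example.org", 443)) {
                // forces the handshake; later runs may resume a cached session
                socket.startHandshake();
            }
            System.out.printf("handshake %d: %d ms%n",
                    i, (System.nanoTime() - t0) / 1_000_000);
        }
    }
}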

19
Conclusions
  • Need various aspects of performance evaluation
    and modelling for a variety of uses, depending
    on the stakeholders, e.g.:
  • Proof of concepts (algorithms, paradigms),
  • Comparing hardware and software architectures,
  • Optimising applications.
  • In a Grid application, performance has three
    broad areas of concern:
  • System capabilities,
  • Network,
  • Software infrastructure.
  • The first two points come under traditional
    benchmarking, which is fairly well established
    and understood (CPU and network), though maybe
    that's debatable!

20
Areas of Future Interest
  • To understand the performance of a Grid we could
    use statistical methods: gather historical data
    that can be used to predict the performance of
    Grid applications.
  • In a gross sense this is OK, but it fails to
    address the fact that we are using a dynamically
    changing infrastructure and need to incorporate
    new components!
  • From a CS perspective, it is evident that we need
    to also understand other aspects of distributed
    environments, including:
  • Inter-Grid site communications,
  • Information Services,
  • Metadata processing,
  • Events and Security.

21
Questions?