Parallel computer architecture overview - PowerPoint PPT Presentation

Provided by: Surf6

Transcript and Presenter's Notes

1
Parallel computer architecture overview
2

Parallel computer definition: a collection of processing elements that cooperate to solve large problems fast.
Some broad issues that distinguish parallel computers:
Resources
  How large a collection?
  How powerful are the elements?
  How much memory?
Data access, communication and synchronization
  How do the elements cooperate and communicate?
  How are data transmitted between processors?
  What are the abstractions and primitives for cooperation?
Performance and scalability
  How does it all translate into performance?
  How does it scale?

3
Trends in parallel computer architecture development

History: diverse and innovative organizational structures, often tied to novel programming models.
  An architecture was often built around one or two good ideas in software or hardware.
The field has rapidly matured under strong technological constraints.
  The microprocessor is ubiquitous.
  Laptops and supercomputers are fundamentally similar!
  Technological trends cause diverse approaches to converge.
Technological trends make parallel computing inevitable, including in mainstream computing.
Need to understand fundamental principles and design tradeoffs, not just taxonomies.

4
Technology trend

In terms of performance improvement, nothing beats microprocessors.
To maintain the improvement, more and more supercomputer features are built into microprocessors.
Use commodity microprocessors to build everything (if you can't beat them, join them).
  Mainframes and minicomputers have largely disappeared in today's world, replaced by server farms (clusters of servers).
  Virtualization on clusters.
  Many supercomputers are clusters of servers/workstations (see www.top500.org).

6
Parallel architectures

7
Shared memory architectures

8
UMA (uniform memory access) shared memory architecture (mostly bus-based MPs)

Putting a microprocessor on a single chip makes it natural to connect many of them to a shared memory.
  This design dominates the server and enterprise market and is moving down to the desktop.
Faster processors began to saturate the bus, then bus technology advanced.
  Today there is a range of sizes for bus-based systems, from desktops to large servers (symmetric multiprocessor (SMP) machines).
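
The bus saturation point above can be illustrated with a back-of-the-envelope model. This is only a rough sketch under stated assumptions, not a measurement: the function name `per_processor_bandwidth` and the bandwidth figures are hypothetical, the bus is assumed to be shared evenly, and caches are ignored.

```python
def per_processor_bandwidth(bus_gbs, demand_gbs, num_processors):
    """Bandwidth (GB/s) each processor can get from one shared bus.

    Hypothetical model: every processor wants `demand_gbs` all the time
    and the bus capacity `bus_gbs` is divided evenly among processors.
    """
    fair_share = bus_gbs / num_processors
    return min(demand_gbs, fair_share)


# Assumed, illustrative numbers: a 10 GB/s bus, 2 GB/s demand per processor.
for p in (1, 2, 4, 8, 16):
    got = per_processor_bandwidth(10.0, 2.0, p)
    print(f"{p:2d} processors: {got:.3f} GB/s each")
```

With these assumed numbers the bus keeps up through about five processors; beyond that, each added processor reduces what every other processor receives, which is the saturation the slide refers to.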

9
Bus bandwidth in Intel systems
10
NUMA Shared memory architecture

The processors are identical, but each processor takes a different amount of time to access different parts of the memory.
Often made by physically linking SMP machines (Origin 2000, up to 512 processors).
The next-generation SMP interconnects (Intel Common System Interface (CSI) and AMD HyperTransport) have this flavor, but the processors are close to each other.
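
To see why non-uniform access time matters, the average cost of a memory access can be sketched as a weighted mix of local and remote accesses. The latencies below are hypothetical values picked for illustration, and the function name `average_access_ns` is an assumption, not something from the slides.

```python
def average_access_ns(local_ns, remote_ns, local_fraction):
    """Average memory access time when `local_fraction` of accesses hit
    the processor's own memory and the rest go to remote memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed latencies: 100 ns local, 300 ns remote (illustrative only).
for frac in (1.0, 0.9, 0.5, 0.0):
    avg = average_access_ns(100.0, 300.0, frac)
    print(f"{frac:.0%} local accesses: {avg:.0f} ns on average")
```

Under this simple model, data placement decides performance: the more accesses a program keeps local, the closer it gets to the local latency.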

11
Cache coherence issue in shared memory architecture

12
Shared memory architecture: advantages and disadvantages

Advantages
  Globally shared memory gives programmers a user-friendly programming model.
Disadvantages
  Lack of scalability
    No hope for UMA.
    What about NUMA? A lot of small traffic goes through the interconnect, and adding processors changes the traffic requirement of the interconnect.
  Writing correct shared memory parallel programs is not straightforward.
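
One common reason shared memory programs are hard to get right is the data race. The following is a minimal sketch using Python threads; the helper name `run_counter` and the thread and iteration counts are made up for illustration. Several threads update one shared counter, once with a lock and once without.

```python
import threading


def run_counter(num_threads, increments, use_lock):
    """Let several threads increment one shared counter; return the final value."""
    counter = {"value": 0}  # state shared by all threads
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            if use_lock:
                with lock:  # the read-modify-write happens as one unit
                    counter["value"] += 1
            else:
                # Unsynchronized read-modify-write: another thread may
                # update the counter between the read and the write,
                # and that update is then lost.
                current = counter["value"]
                counter["value"] = current + 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


expected = 4 * 50_000
print("with lock:   ", run_counter(4, 50_000, use_lock=True), "expected", expected)
print("without lock:", run_counter(4, 50_000, use_lock=False), "(may be lower)")
```

The locked version always reaches the expected total. The unlocked version may or may not lose updates on a given run, which is exactly what makes such bugs hard to find.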

13
Distributed memory architectures

Processors have their own local memory. Memory addresses in one processor do not map to another processor.
  No concept of a global address space.
  No concept of cache coherency.
To access data in another processor, use explicit communication.
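
The explicit-communication style can be sketched in a few lines. This is a simulation under stated assumptions, not a real distributed program: Python threads stand in for nodes and a `queue.Queue` stands in for the network, so the "no shared data" rule is enforced only by convention here. The names `node_sum` and `distributed_sum` are hypothetical.

```python
import queue
import threading


def node_sum(local_data, outbox):
    """A 'node' works only on its private data, then sends one message."""
    outbox.put(sum(local_data))


def distributed_sum(chunks):
    """Sum data spread across nodes using only explicit messages."""
    inbox = queue.Queue()  # stands in for the network link to the root node
    nodes = [
        threading.Thread(target=node_sum, args=(chunk, inbox))
        for chunk in chunks
    ]
    for n in nodes:
        n.start()
    for n in nodes:
        n.join()
    # The root never reads another node's data directly; it only
    # receives one partial sum per node and combines them.
    return sum(inbox.get() for _ in chunks)


print(distributed_sum([[1, 2, 3], [4, 5], [6]]))  # prints 21
```

On a real distributed memory machine the same pattern would typically be written with a message-passing library, with each node running as a separate process on its own memory.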

14
Distributed memory architectures

The networks can be very different for distributed memory architectures.
Massively parallel processors (MPPs) usually use a specially designed network (and node).
  IBM Blue Gene, IBM SP series
Clusters usually use commodity system/local area networks: InfiniBand, Quadrics, Myrinet, 10 Gbps Ethernet.
  Lemieux at PSC uses Quadrics.
  Ranger (No. 2 top supercomputer) at TACC uses InfiniBand.
  UC-TG at Argonne uses Myrinet.
  The raw speed of these networks matches that of the specially designed networks.
  They may not provide some customized support, such as a reduction network.
Grid computers use the Internet as the network.

15
Distributed memory architectures

MPPs, clusters and grid computers target different types of applications.
MPPs and clusters support tightly coupled applications (a large amount of interaction among processes).
  Communication roughly every microsecond.
Grid computers can only support coarse-grain parallel applications or embarrassingly parallel applications.
  Communication roughly every second.
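
The link between network speed and application granularity can be made concrete with a simple efficiency estimate. This is a sketch under assumptions: computation and communication do not overlap, and the per-message costs below (2 microseconds for a cluster network, 50 milliseconds over the Internet) are hypothetical round numbers, not measurements.

```python
def parallel_efficiency(compute_s, comm_s):
    """Fraction of time spent computing when every `compute_s` seconds of
    work is followed by `comm_s` seconds of communication (no overlap)."""
    return compute_s / (compute_s + comm_s)


# Assumed per-message costs, for illustration only.
CLUSTER_COMM_S = 2e-6   # cluster or MPP interconnect
GRID_COMM_S = 50e-3     # wide-area Internet link

for label, work_s in (("fine grain (10 us of work)", 10e-6),
                      ("coarse grain (1 s of work)", 1.0)):
    on_cluster = parallel_efficiency(work_s, CLUSTER_COMM_S)
    on_grid = parallel_efficiency(work_s, GRID_COMM_S)
    print(f"{label}: cluster {on_cluster:.1%}, grid {on_grid:.1%}")
```

With these assumed costs, fine-grain work stays reasonably efficient on a cluster but spends almost all its time waiting on a grid, while coarse-grain work runs well on both.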

16
Advantages and disadvantages

Advantages
  Memory is scalable with the number of processors: increase the number of processors and the size of memory increases proportionately.
  Each processor can rapidly access its own memory without interference and without the overhead of maintaining cache coherency.
  Cost effectiveness: can use commodity, off-the-shelf processors and networking.
Disadvantages
  The programmer is responsible for the details associated with data communication.
  It may be difficult to map existing data structures, based on global memory, to this memory organization.
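
As one illustration of the mapping problem, consider an array that a shared memory program indexes globally. A common approach, shown here as a sketch and not as the only option, is a block distribution in which each processor owns one contiguous slice. The function names `block_owner` and `local_range` are hypothetical.

```python
def block_size(n, num_procs):
    """Elements per processor, rounded up so every element has an owner."""
    return -(-n // num_procs)  # ceiling division


def block_owner(global_index, n, num_procs):
    """Map a global array index to (owning rank, index in that rank's memory)."""
    if not 0 <= global_index < n:
        raise IndexError("global index out of range")
    block = block_size(n, num_procs)
    return global_index // block, global_index % block


def local_range(rank, n, num_procs):
    """Half-open range [start, stop) of global indices owned by `rank`."""
    block = block_size(n, num_procs)
    start = min(rank * block, n)
    stop = min(start + block, n)
    return start, stop


# Example: 10 elements over 4 processors gives blocks of 3, 3, 3 and 1.
for rank in range(4):
    print(rank, local_range(rank, 10, 4))
print(block_owner(7, 10, 4))  # prints (2, 1)
```

Every access that used to be a plain `a[i]` now needs this translation, plus explicit communication whenever the owner is a different processor, which is the extra burden the slide describes.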