Parallel Computers - PowerPoint PPT Presentation


Title: Parallel Computers


1
Parallel Computers
  • Past and Present
  • Yenchi Lin
  • Apr 17, 2003

2
Outline
  • Concepts/Background on Parallel Computers
  • Connection Machines
  • Earth Simulator
  • Conclusion

3
Quick architecture overview
  • SIMD, MIMD
  • Shared memory, distributed memory
  • MPP, PVP, SMP
  • NOW: Network of Workstations (clusters)

4
SIMD, MIMD
  • SIMD: Single Instruction, Multiple Data
    • All processors perform the same instruction on
      different pieces of data
    • Some processors can be masked out from executing
      certain instructions
  • MIMD: Multiple Instruction, Multiple Data
    • Each processor executes its own instruction
      stream on its own data
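
The SIMD/MIMD distinction can be sketched in a few lines of Python. This is only an illustration: the function names `simd_step` and `mimd_step` are made up for this example, and a list element stands in for the data held by one processor.

```python
from typing import Callable, List, Optional, Sequence


def simd_step(op: Callable[[int], int],
              data: Sequence[int],
              mask: Optional[Sequence[bool]] = None) -> List[int]:
    """Apply ONE instruction to every lane; masked-out lanes keep their value."""
    if mask is None:
        mask = [True] * len(data)
    return [op(x) if active else x for x, active in zip(data, mask)]


def mimd_step(ops: Sequence[Callable[[int], int]],
              data: Sequence[int]) -> List[int]:
    """Each processor applies its OWN instruction to its own datum."""
    return [op(x) for op, x in zip(ops, data)]


data = [1, 2, 3, 4]

# SIMD: every processor doubles its element.
doubled = simd_step(lambda x: 2 * x, data)

# SIMD with masking: processors 1 and 3 sit this instruction out.
masked = simd_step(lambda x: 2 * x, data, mask=[True, False, True, False])

# MIMD: four different instructions, one per processor.
mixed = mimd_step(
    [lambda x: x + 1, lambda x: x * x, lambda x: -x, lambda x: x],
    data,
)
```

Masking is what lets a SIMD machine handle conditionals: both branches are issued to everyone, and each processor is active in only one of them.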

5
Memory
  • Shared Memory
    • Single, unified address space across all
      processors
  • Distributed Memory
    • Each processor has its own address space
  • Hybrid
    • Multiple processors within a computing node share
      the same address space, while the system as a
      whole has many different address spaces
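
One way to picture the hybrid model is as a two-part address: which node, and which location inside that node's shared memory. The sketch below assumes a simple block distribution of a global array; the function name `locate` and the parameter `words_per_node` are hypothetical and not taken from any particular machine.

```python
from typing import Tuple


def locate(global_index: int, words_per_node: int) -> Tuple[int, int]:
    """Map a global element index to (node, local offset).

    Assumes a block distribution: node 0 holds the first
    `words_per_node` elements, node 1 the next block, and so on.
    """
    if global_index < 0 or words_per_node <= 0:
        raise ValueError("index must be >= 0 and block size must be > 0")
    return divmod(global_index, words_per_node)


# Element 10 of an array spread over nodes holding 4 words each
# lives on node 2 at local offset 2.
node, offset = locate(10, 4)
```

Processors on the same node would read `offset` directly; reaching another node's block takes a message.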

6
Processors
  • PVP: parallel vector processors
    • Cray, NEC, Hitachi
  • MPP: massively parallel processors
    • Connection Machines
  • SMP: symmetric multiprocessors
    • Sun Fire, DEC (Compaq/HP) AlphaServer

7
Figure source: D.E. Culler, J.P. Singh, A. Gupta. Parallel
Computer Architecture: A Hardware/Software Approach
8
Trends (cont.)
The trend of MPPs overtaking SMPs has continued as the
number of NOWs (clusters) in the TOP500 list grows.
Figure source: D.E. Culler, J.P. Singh, A. Gupta. Parallel
Computer Architecture: A Hardware/Software Approach
9
Connection Machines
  • Invented by Danny Hillis, co-founder of Thinking
    Machines Corp., while he was at MIT.
  • Originally designed to run artificial
    intelligence applications
  • First working application on the CM-1: Game of Life
  • CM-1 (1985), CM-2 (1986) and CM-5 (1992)
  • Richard Feynman helped build the first CM-1s.
  • At its peak, 70 machines were installed around
    the world, all of them on the TOP500 list.
  • Thinking Machines Corp. filed for bankruptcy in 1994,
    became a pure software company in 1996, and was
    bought by Oracle in 1999.

10
CM-2 1986
  • SIMD
  • Hypercube connection
  • 1-bit processors in groups of 16
  • 9 dimensions for the 8192-processor configuration,
    12 dimensions for the 65536-processor configuration
  • Programming languages: C*, *Lisp, CM Fortran
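
The link between processor count and hypercube dimension is plain arithmetic. The sketch below assumes that each hypercube vertex is one group of 16 processors; the function name `hypercube_dimension` is invented for this example. With that assumption the 65536-processor machine has 4096 vertices, giving 12 dimensions.

```python
def hypercube_dimension(processors: int, per_vertex: int = 16) -> int:
    """Return d such that 2**d vertices of `per_vertex` processors each
    hold all `processors`. Raises if the vertex count is not a power of two.
    """
    vertices, remainder = divmod(processors, per_vertex)
    is_power_of_two = vertices >= 1 and (vertices & (vertices - 1)) == 0
    if remainder != 0 or not is_power_of_two:
        raise ValueError("vertex count must be a power of two")
    return vertices.bit_length() - 1


full_machine = hypercube_dimension(65536)  # 65536 / 16 = 4096 = 2**12
```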

11
Sprint Node in CM-2
12-degree connectivity!
  • 1-bit serial processors
  • 16 in a group, two groups on the board
  • The two groups share the same memory and
    floating-point unit
  • Router has limited processing power

12
Hypercube Connection in CM-2
  • Maximum hop count in a hypercube = dimension of the
    hypercube
  • Router randomly picks the next hop
  • High wire count

Figure: four-dimensional hypercube
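
A short sketch of why the maximum hop count equals the dimension: if vertices are numbered in binary, two vertices are neighbors when their numbers differ in exactly one bit, so the distance between any two vertices is the number of differing bits. The routing function is an assumption for illustration only. It picks at random among the dimensions that still need correcting, which is one simple reading of "randomly picks the next hop" and not a description of the actual CM-2 router.

```python
import random
from typing import List


def neighbors(node: int, dim: int) -> List[int]:
    """Vertices one hop away: flip each of the `dim` address bits in turn."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hop_count(src: int, dst: int) -> int:
    """Minimum hops = number of differing address bits (Hamming distance)."""
    return bin(src ^ dst).count("1")


def random_route(src: int, dst: int, dim: int, rng=random) -> List[int]:
    """Walk from src to dst, fixing one differing bit per hop,
    choosing which bit to fix at random (illustrative assumption)."""
    path = [src]
    current = src
    while current != dst:
        differing = [b for b in range(dim) if ((current ^ dst) >> b) & 1]
        current ^= 1 << rng.choice(differing)
        path.append(current)
    return path


DIM = 4
# Worst case over all pairs of the 16 vertices equals the dimension.
worst = max(hop_count(a, b) for a in range(2 ** DIM) for b in range(2 ** DIM))
path = random_route(0b0000, 0b1111, DIM)
```

Every hop fixes one bit, so any route produced this way is a shortest path; randomness only changes which of the shortest paths is taken.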
13
CM-5 1992
  • Distributed-memory multiprocessor
  • SPARC + custom vector units
  • Fat-tree structure
  • Programming languages: C*, *Lisp, CM Fortran,
    HPF, C, etc.
  • Supports partitioning, multi-user

14
Processing Element in CM-5
  • 33 MHz SPARC
  • Vector processor
  • Network interface
  • 32 MB memory
  • Connected using Sun MBus
  • Network access is treated the same as memory access:
    expensive for larger messages

15
Fat-Tree of CM-5
  • Three networks (data, control and diagnostic),
    synchronized on a 40 MHz clock
  • 4-ary fat tree, each processor a leaf
  • Two parents per child for the first two levels
  • Four parents per child for higher levels

Figure: data network of CM-5
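
The fat-tree numbers above can be turned into a small calculation. This is a sketch under stated assumptions: level 0 is the processor leaf, "the first two levels" is read as levels 0 and 1, and a message is assumed to climb only as far as the lowest subtree containing both endpoints. All function names are invented for the example.

```python
def tree_height(leaves: int, arity: int = 4) -> int:
    """Levels of routers needed so an `arity`-ary tree covers `leaves`."""
    height, capacity = 0, 1
    while capacity < leaves:
        capacity *= arity
        height += 1
    return height


def parents_per_child(level: int) -> int:
    """Upward links from a child at `level` (0 = processor leaf):
    two for the first two levels, four above that."""
    return 2 if level < 2 else 4


def upward_paths(height: int) -> int:
    """Distinct ways to climb from a leaf to level `height`."""
    paths = 1
    for level in range(height):
        paths *= parents_per_child(level)
    return paths


def common_level(a: int, b: int, arity: int = 4) -> int:
    """Lowest level whose subtree contains both leaves a and b."""
    level = 0
    while a != b:
        a //= arity
        b //= arity
        level += 1
    return level


height_1024 = tree_height(1024)          # 4**5 = 1024 leaves
paths_to_top = upward_paths(height_1024)  # 2 * 2 * 4 * 4 * 4
```

The multiple parents are what make the tree "fat": the number of upward paths grows with height, so bandwidth does not collapse near the root.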
16
Transition from CM-2 to CM-5
  • 1-bit serial processors -> 32-bit SPARCs
  • SIMD -> MIMD
  • Use SPMD to emulate SIMD behavior
  • Hypercube -> Fat-Tree
  • Randomness preserved by random routing
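
SPMD (single program, multiple data) means every processor runs the same program on its own data, with explicit synchronization where lockstep matters. The sketch below uses Python threads and a barrier purely as a stand-in; `spmd_run` is a hypothetical helper, and nothing here models the CM-5's actual synchronization hardware.

```python
import threading
from typing import Callable, List, Sequence

Phase = Callable[[int, int], int]  # (rank, value) -> new value


def spmd_run(program: Sequence[Phase], data: Sequence[int]) -> List[int]:
    """Run the same list of phases on every worker, one datum per worker.
    A barrier after each phase emulates SIMD-style lockstep."""
    results = list(data)
    if not results:
        return results
    barrier = threading.Barrier(len(results))

    def worker(rank: int) -> None:
        for phase in program:        # identical program text on every worker
            results[rank] = phase(rank, results[rank])
            barrier.wait()           # nobody starts the next phase early

    threads = [threading.Thread(target=worker, args=(r,))
               for r in range(len(results))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


program = [
    lambda rank, x: 2 * x,                          # all workers double
    lambda rank, x: x + 1 if rank % 2 == 0 else x,  # "masked" by a branch
]
out = spmd_run(program, [1, 2, 3, 4])
```

The second phase shows how SIMD masking turns into an ordinary conditional once each processor has its own instruction stream.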

17
Earth Simulator 2002
  • Collection of modified NEC SX-6 nodes
  • 640 nodes, 8-way each
  • 12.3 GB/s x 2 network
  • Theoretical peak throughput: 40 TFLOPS
  • Max throughput: 36 TFLOPS running Linpack
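
A quick back-of-the-envelope check of these figures, using only the rounded numbers on the slide (640 nodes, 8 processors per node, 40 TFLOPS peak, 36 TFLOPS on Linpack). The derived per-processor peak and efficiency are therefore approximate, and the variable names are just for this example.

```python
NODES = 640
PROCESSORS_PER_NODE = 8
PEAK_TFLOPS = 40.0      # theoretical peak, as quoted
LINPACK_TFLOPS = 36.0   # measured on Linpack, as quoted

total_processors = NODES * PROCESSORS_PER_NODE

# Peak per processor in GFLOPS (1 TFLOPS = 1000 GFLOPS).
peak_per_processor_gflops = PEAK_TFLOPS * 1000 / total_processors

# Fraction of the theoretical peak achieved on Linpack.
linpack_efficiency = LINPACK_TFLOPS / PEAK_TFLOPS

print(f"{total_processors} processors")
print(f"{peak_per_processor_gflops:.2f} GFLOPS peak per processor")
print(f"{linpack_efficiency:.0%} of peak on Linpack")
```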

18
Programming Models of ES
  • MPI/HPF at the node level and the process level
  • OpenMP, threads
  • Automatic Vectorization

19
Organization of ES
  • 320 processor node (PN) cabinets, 2 nodes each
  • 65 interconnect (IN) cabinets
  • Crossbar of 640 nodes
  • 12.3 GB/s x 2 (bidirectional) node-to-node, 8 TB/s
    aggregate
  • 900 TB disk space, 1.6 PB tape storage
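
The node count and the aggregate bandwidth follow from the per-cabinet and per-node figures. The sketch assumes the aggregate counts one direction of each node's link and uses decimal units (1 TB/s = 1000 GB/s); both are assumptions made to reproduce the quoted figure.

```python
PN_CABINETS = 320
NODES_PER_CABINET = 2
LINK_GB_PER_S = 12.3   # per node, per direction

nodes = PN_CABINETS * NODES_PER_CABINET

# Sum of one direction of every node's link, in TB/s.
aggregate_tb_per_s = nodes * LINK_GB_PER_S / 1000

print(f"{nodes} nodes, {aggregate_tb_per_s:.2f} TB/s aggregate")
```

The result is a little under 8 TB/s, which matches the rounded figure on the slide.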

20
PN of ES
Arithmetic Processor (SX-6)
Memory (512MB)
21
Arithmetic Processor
Total of 640 x 8 = 5120 arithmetic processors
22
Remarks
  • Initial cost
    • Development: 40 billion yen (USD 400M)
    • Physical building: 7 billion yen (USD 70M)
  • Operating cost
    • Maintenance: 8 billion yen/year (USD 80M),
      about USD 2.54/sec
    • Electricity: 800 million yen/year (USD 8M)
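
The per-second figure is the yearly maintenance cost divided by the seconds in a year. The sketch assumes a 365-day year and the exchange rate implied by the slide's own conversions (100 yen to the dollar).

```python
YEN_PER_USD = 100                      # implied by 40 billion yen = USD 400M
MAINTENANCE_YEN_PER_YEAR = 8_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

maintenance_usd_per_year = MAINTENANCE_YEN_PER_YEAR / YEN_PER_USD
maintenance_usd_per_sec = maintenance_usd_per_year / SECONDS_PER_YEAR

# Development plus building, in USD.
initial_cost_usd = (40_000_000_000 + 7_000_000_000) / YEN_PER_USD

print(f"USD {maintenance_usd_per_sec:.2f} per second")
print(f"USD {initial_cost_usd / 1e6:.0f}M initial cost")
```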

23
Eye Candies
1 AP, 9 in one cabinet
SX-6i
PN cabinet, 9APs in one
Back of a PN cabinet
24
Conclusion
  • Connection Machines were interesting
  • The Earth Simulator is also interesting
  • Early designs versus recent designs
  • GigaFlops vs. TeraFlops
  • When will Americans take back the crown in
    supercomputing?

25
References
  • TOP500.org - http://www.top500.org/ORSC/
  • Earth Simulator - http://www.es.jamstec.go.jp/
  • http://ails.arc.nasa.gov/Images/InfoSys/AC93-0146-2.html
  • http://ails.arc.nasa.gov/Images/InfoSys/AC90-0563-7.html
  • http://archive.ncsa.uiuc.edu/Pubs/TechReports/TR023/Summary.html
  • http://www.netlib.org/benchmark/top500/reports/report94/Architec/node32.html
  • http://mission.base.com/tamiko/cm/cm-text.htm
  • http://www.longnow.org/about/articles/ArtFeynman.html
  • D.E. Culler, J.P. Singh and A. Gupta. Parallel
    Computer Architecture: A Hardware/Software
    Approach. 1999
  • Hennessy, Patterson. Computer Architecture: A
    Quantitative Approach, 2nd Ed. 2002
  • D. J. Kerbyson, A. Hoisie, H. Wasserman. A
    Comparison Between the Earth Simulator and
    AlphaServer Systems using Predictive Application
    Performance Models. 2002
  • Thinking Machines Corp. The Network Architecture
    of the Connection Machine CM-5. 1992
  • G. E. Blelloch et al. A Comparison of Sorting
    Algorithms for the Connection Machine CM-2. 1991