Title: Parallel Computers
1Parallel Computers
- Past and Present
- Yenchi Lin
- Apr 17,2003
2Outline
- Concepts/Background on Parallel Computers
- Connection Machines
- Earth Simulator
- Conclusion
3Quick architecture overview
- SIMD, MIMD
- Shared memory, distributed memory
- MPP, PVP, SMP
- NOW
- Network of Workstations (clusters)
4SIMD, MIMD
- SIMD: Single Instruction, Multiple Data
- All processors perform the same instruction on different pieces of data
- Some processors can be masked out from executing certain instructions
- MIMD: Multiple Instruction, Multiple Data
- Each processor executes different instructions on different data
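The difference between the two models, including SIMD masking, can be sketched in a few lines of Python. This is only an illustration: the function names `simd_step` and `mimd_step` are hypothetical, and a list comprehension stands in for processors that would really run in parallel.

```python
from typing import Callable, List

def simd_step(op: Callable[[int], int], data: List[int], mask: List[bool]) -> List[int]:
    """One instruction for all lanes; masked-out lanes keep their old value."""
    return [op(x) if active else x for x, active in zip(data, mask)]

def mimd_step(ops: List[Callable[[int], int]], data: List[int]) -> List[int]:
    """Each processor applies its own instruction to its own datum."""
    return [op(x) for op, x in zip(ops, data)]

data = [1, 2, 3, 4]

# SIMD: every lane doubles its value, except lane 2, which is masked out.
doubled = simd_step(lambda x: x * 2, data, [True, True, False, True])

# MIMD: four different instructions on four different data items.
mixed = mimd_step([lambda x: x + 1, lambda x: x * 10, abs, lambda x: -x], data)

print(doubled)  # [2, 4, 3, 8]
print(mixed)    # [2, 20, 3, -4]
```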
5Memory
- Shared Memory
- Single, unified address space across all processors
- Distributed Memory
- Each processor has its own address space
- Hybrid
- Multiple processors within a computing node share the same address space, while the whole system has many different address spaces.
6Processors
- PVP: parallel vector processors
- Cray, NEC, Hitachi
- MPP: massively parallel processors
- Connection Machines
- SMP: symmetric multiprocessors
- Sun SunFire, DEC (Compaq/HP) AlphaServer
7D.E. Culler, J.P. Singh, A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach
8Trends (cont.)
The trend of MPP overtaking SMP has continued, as the number of NOW systems (clusters) grows in the TOP 500 list.
D.E. Culler, J.P. Singh, A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach
9Connection Machines
- Invented by Danny Hillis of Thinking Machines Corp. while at MIT.
- Originally designed to run artificial intelligence applications
- First working application on the CM-1: Game of Life
- CM-1 (1985), CM-2 (1986) and CM-5 (1992)
- Richard Feynman helped in building the first CM-1s.
- At the peak, 70 machines were installed around the world, all of them in the TOP 500 list.
- Thinking Machines Corp. filed for bankruptcy in 1993, became a pure software company in 1996, and was bought by Oracle in 1999.
10CM-2 1986
- SIMD
- hypercube connection
- 1-bit processors in groups of 16.
- 9 dimensions for the 8192-processor configuration, 12 dimensions for the 65536-processor configuration.
- Programming languages: C*, *Lisp, CM Fortran
11Sprint Node in CM-2
12 degree connectivity!
- 1-bit serial processors
- 16 in a group, two groups on the board
- The two groups share the same memory and floating-point unit
- Router has limited processing power
12Hypercube Connection in CM-2
- Maximum hop count in a hypercube = dimension of the hypercube
- Router randomly picks the next hop
- High wire count
Four dimensional hypercube
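The hop-count property can be checked with a small sketch. Nodes are numbered so that neighbors differ in exactly one address bit. The `route` function below is a hypothetical illustration that assumes the random choice is made only among dimensions in which the current node still differs from the destination; the actual CM-2 router logic is not described on the slide.

```python
import random
from typing import List

def neighbors(node: int, dim: int) -> List[int]:
    """Nodes reachable in one hop: flip exactly one of the dim address bits."""
    return [node ^ (1 << k) for k in range(dim)]

def route(src: int, dst: int, dim: int, rng: random.Random) -> List[int]:
    """Walk from src to dst, fixing one randomly chosen differing bit per hop."""
    path = [src]
    cur = src
    while cur != dst:
        diff = cur ^ dst
        candidates = [k for k in range(dim) if diff & (1 << k)]
        cur ^= 1 << rng.choice(candidates)
        path.append(cur)
    return path

dim = 4
# Worst case: source and destination differ in every bit.
path = route(0b0000, 0b1111, dim, random.Random(0))
hops = len(path) - 1
print(hops)  # 4, the dimension of the hypercube
```

Whichever dimension is picked at each step, the number of hops equals the number of differing address bits, so it never exceeds the dimension.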
13CM-5 1992
- Distributed memory multi-processor
- SPARC with custom vector units
- Fat Tree structure
- Programming languages: C*, *Lisp, CM Fortran, HPF, C, etc.
- Supports partitioning, multi-user
14Processing Element in CM-5
- 33 MHz SPARC
- Vector processor
- Network interface
- 32 MB memory
- Connected using Sun MBus
- Network access is treated the same as memory access: expensive for larger messages
15Fat-Tree of CM-5
- Three networks (data, control and diagnostic), synchronized on a 40 MHz clock
- 4-ary fat tree, each processor as a leaf
- Two parents per child for the first two levels
- Four parents per child for higher levels
Data network of CM-5
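In a tree with processors as leaves, a message climbs to the lowest common ancestor of source and destination and then descends. The following sketch computes that height for a 4-ary tree. It assumes leaves are numbered 0, 1, 2, ... from left to right, and the names `lca_level` and `link_hops` are hypothetical. It models only the tree shape, not the multiple parents that make the tree "fat".

```python
def lca_level(a: int, b: int, arity: int = 4) -> int:
    """Height of the lowest common ancestor of leaves a and b (0 if a == b)."""
    level = 0
    while a != b:
        a //= arity  # move both leaves up to their parent subtree index
        b //= arity
        level += 1
    return level

def link_hops(a: int, b: int, arity: int = 4) -> int:
    """Links traversed: up to the common ancestor, then the same distance down."""
    return 2 * lca_level(a, b, arity)

print(link_hops(5, 6))   # 2: both leaves sit under the same first-level router
print(link_hops(0, 63))  # 6: the route has to climb three levels
```

Traffic between nearby leaves stays low in the tree, which is why the upper levels are given more parent links per child.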
16Transition from CM-2 to CM-5
- 1-bit serial processors -> 64-bit SPARCs
- SIMD -> MIMD
- Use SPMD to emulate SIMD behavior
- Hypercube -> Fat-Tree
- Randomness preserved by random routing
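A minimal sketch of the SPMD idea: every worker runs the same program on its own share of the data, and a barrier keeps the workers in step between phases. Python threads stand in for CM-5 processing nodes here, and `spmd_run` and `program` are hypothetical names, not part of any CM-5 software.

```python
import threading
from typing import Callable, List

def program(rank: int, size: int, data: List[int], barrier: threading.Barrier) -> None:
    """The single program every worker executes on its own slice of data."""
    # Phase 1: the same operation on every element this worker owns.
    for i in range(rank, len(data), size):
        data[i] *= 2
    barrier.wait()  # all workers finish phase 1 before anyone starts phase 2
    # Phase 2: a data-dependent branch plays the role of a SIMD mask.
    for i in range(rank, len(data), size):
        if data[i] > 8:
            data[i] -= 1
    barrier.wait()

def spmd_run(prog: Callable, num_workers: int, data: List[int]) -> None:
    """Start num_workers copies of prog and wait for all of them."""
    barrier = threading.Barrier(num_workers)
    threads = [
        threading.Thread(target=prog, args=(rank, num_workers, data, barrier))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

data = list(range(8))
spmd_run(program, 4, data)
print(data)  # [0, 2, 4, 6, 8, 9, 11, 13]
```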
17Earth Simulator 2002
- Collection of modified NEC SX-6
- 640 nodes, 8 way each
- 12.3GB/s x 2 network
- Theoretical throughput 40TFlops
- Max throughput 36TFlops running Linpack
18Programming Models of ES
- MPI/HPF on node level and process level
- OpenMP, threads
- Automatic Vectorization
19Organization of ES
- 320 processor node (PN) cabinets, 2 nodes each
- 65 interconnect (IN) cabinets
- Crossbar of 640 nodes
- 12.3 GB/s x 2 (bidirectional) node-to-node, 8 TB/s aggregated
- 900 TB disk space, 1.6 PB tape storage
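The aggregate figure is consistent with the per-node numbers. The check below assumes the aggregate is simply the node count times the one-direction link rate, with decimal units (1 TB = 1000 GB).

```python
cabinets = 320
nodes_per_cabinet = 2
link_gb_per_s = 12.3  # one direction, per node

nodes = cabinets * nodes_per_cabinet
aggregate_tb_per_s = nodes * link_gb_per_s / 1000

print(nodes)                         # 640
print(round(aggregate_tb_per_s, 1))  # 7.9, i.e. about 8 TB/s
```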
20PN of ES
Arithmetic Processor (SX-6)
Memory (512MB)
21Arithmetic Processor
Total of 640 x 8 = 5120 arithmetic processors
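The processor count, together with the throughput figures quoted for the Earth Simulator (40 TFlops theoretical, 36 TFlops on Linpack), gives a rough per-processor peak and a Linpack efficiency. The inputs are the rounded values from the slides, so the results are approximate.

```python
nodes = 640
processors_per_node = 8
peak_tflops = 40.0     # theoretical throughput, as quoted
linpack_tflops = 36.0  # max throughput running Linpack, as quoted

processors = nodes * processors_per_node
peak_gflops_per_processor = peak_tflops * 1000 / processors
linpack_efficiency = linpack_tflops / peak_tflops

print(processors)                 # 5120
print(peak_gflops_per_processor)  # 7.8125
print(linpack_efficiency)         # 0.9
```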
22remarks
- Initial Cost
- Development: 40 billion yen (USD 400M)
- Physical building: 7 billion yen (USD 70M)
- Operating cost
- Maintenance: 8 billion yen/year (USD 80M)
- USD 2.54/sec
- Electricity: 800 million yen/year (USD 8M)
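The per-second figure follows from the yearly maintenance cost. The sketch assumes a 365-day year and the exchange rate of 100 yen per USD implied by the conversions above.

```python
maintenance_yen_per_year = 8_000_000_000
yen_per_usd = 100  # rate implied by the slide's conversions
seconds_per_year = 365 * 24 * 3600

usd_per_year = maintenance_yen_per_year / yen_per_usd
usd_per_second = usd_per_year / seconds_per_year

print(round(usd_per_second, 2))  # 2.54
```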
23Eye Candies
1 AP, 9 in one cabinet
SX-6i
PN cabinet, 9APs in one
Back of a PN cabinet
24Conclusion
- Connection machines were interesting
- Earth simulator is also interesting
- Early designs versus recent design
- GigaFlops vs. TeraFlops
- When will Americans take back the crown in
supercomputing?
25references
- Top 500.org http://www.top500.org/ORSC/
- Earth simulator - http://www.es.jamstec.go.jp/
- http://ails.arc.nasa.gov/Images/InfoSys/AC93-0146-2.html
- http://ails.arc.nasa.gov/Images/InfoSys/AC90-0563-7.html
- http://archive.ncsa.uiuc.edu/Pubs/TechReports/TR023/Summary.html
- http://www.netlib.org/benchmark/top500/reports/report94/Architec/node32.html
- http://mission.base.com/tamiko/cm/cm-text.htm
- http://www.longnow.org/about/articles/ArtFeynman.html
- D.E. Culler, J.P. Singh and A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach. 1999
- Hennessy, Patterson. Computer Architecture: A Quantitative Approach, 2nd Ed. 2002
- D. J. Kerbyson, A. Hoisie, H. Wasserman. A Comparison Between the Earth Simulator and AlphaServer Systems using Predictive Application Performance Models. 2002
- Thinking Machines Corp. The Network Architecture of the Connection Machine CM-5. 1992
- G. E. Blelloch et al. A Comparison of Sorting Algorithms for the Connection Machine CM-2. 1991