Commodity Computing Clusters - next generation supercomputers? - PowerPoint PPT Presentation

Slides: 29
Provided by: pawelpi

Transcript and Presenter's Notes

Title: Commodity Computing Clusters - next generation supercomputers?


1
Commodity Computing Clusters - next generation
supercomputers?
  • Pawel Pisarczyk, ATM S. A.
  • pawel.pisarczyk_at_atm.com.pl

2
Agenda
  • Introduction
  • Supercomputer classification
  • Architecture and implementations
  • Commodity clusters
  • Processors
  • Operating systems
  • Summary

3
Supercomputer
  • A supercomputer is a device for turning
    compute-bound problems into I/O-bound problems -
    Ken Batcher
  • A supercomputer is a computer system that leads
    the world in terms of processing capacity,
    particularly speed of calculations, at the time
    of its introduction.
  • source: http://en.wikipedia.org

4
Supercomputer History (1)
  • 1945-50 - Manchester Mark I
  • 1950-55 - MIT Whirlwind
  • 1955-60 - IBM 7090 - 210 KFLOPS
  • 1960-65 - CDC 6600 - 10.24 MFLOPS
  • 1965-70 - CDC 7600 - 32.27 MFLOPS
  • 1970-75 - CDC Cyber 76

5
Supercomputer History (2)
  • 1975-80 - Cray-1 - 160 MFLOPS
  • 1980-85 - Cray X-MP - 500 MFLOPS
  • 1985-90 - Cray Y-MP - 1.3 GFLOPS
  • 1990-95 - Fujitsu Numerical Wind Tunnel - 236
    GFLOPS
  • 1995-00 - Intel ASCI Red - 2.150 TFLOPS
  • 2000-02 - IBM ASCI White, SP Power3 375 MHz -
    7.226 TFLOPS
  • 2002-03 - NEC Earth Simulator - 35 TFLOPS

6
Supercomputer Classes (1)
  • General-purpose supercomputers
      • vector processing machines - the same operation
        carried out on a large amount of data
        simultaneously
      • tightly connected cluster computers (NUMA) -
        communication-oriented architectures engineered
        from the ground up, based on high-speed
        interconnects and a large number of processors
      • commodity clusters - a collection of a large
        number of commodity PCs (COTS) interconnected by
        a high-bandwidth, low-latency network

7
Supercomputer Classes (2)
  • Special-purpose supercomputers - high-performance
    computing devices with a hardware architecture
    dedicated to solving a single problem (equipped
    with custom ASICs or FPGA chips)
  • Examples
      • Deep Blue
      • GRAPE for astrophysics

8
Flynn taxonomy - 1972 (1)
  • SISD - Single Instruction Single Data (DEC, Sun
    Microsystems, PC)
  • SIMD - Single Instruction Multiple Data
      • computers with a large number of processing units
        (i.e. ALUs) - CPP DAP Gamma II, Quadrics Apemille
      • vector processing machines - NEC SX6, IA32 MMX
  • MISD - Multiple Instruction Single Data
      • theoretical model, no practical implementation

9
Flynn taxonomy - 1972 (2)
  • MIMD - Multiple Instruction Multiple Data
      • SM-MIMD - Shared Memory MIMD
          • global address space
          • SMP systems and ccNUMA systems
      • DM-MIMD - Distributed Memory MIMD
          • many nodes with local address spaces
          • high-bandwidth, low-latency communication
          • common NUMA architectures (Non-Uniform Memory
            Access)
          • operating systems have to be
            communication-oriented (Mach project)
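
The four Flynn classes follow mechanically from two counts: how many instruction streams and how many data streams a machine processes at once. A minimal sketch of that rule is given here; the function name and the example machines are hypothetical and serve only as illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class name for the given stream counts (each >= 1)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


# Hypothetical machines, described only by their stream counts.
examples = {
    "single-CPU PC": (1, 1),
    "vector unit with 64 lanes": (1, 64),
    "64-node cluster": (64, 64),
}
for name, (instructions, data) in examples.items():
    print(f"{name}: {flynn_class(instructions, data)}")
```

The SM-MIMD / DM-MIMD split is not visible in these counts; it depends on whether the processors share one address space.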

10
SM-MIMD implementations
  • S-COMA - Simple Cache-Only Memory Architecture
      • common SMP systems
  • ccNUMA - Cache Coherent NUMA
      • SGI Origin 3000
      • SGI Altix 3000
      • HP SuperDome

11
S-COMA (SMP)
[Diagram: RAM; L2 cache (x3); CPU 0, CPU 1, ... CPU N]

12
ccNUMA
[Diagram: RAM 0; L3 cache; L2 cache (x2); CPU 0, CPU 1]

13
ccNUMA implementation
  • SGI Altix 3000 (ccNUMA)
  • 64 Itanium 2 (IA64) processors
  • C-brick modules with 2 CPUs and ASIC SHUB
  • NUMAflex, NUMAlink interconnects (6.4 GB/s, 2.4
    GB/s)
  • Modified Linux kernel (2.6 NUMA support)

14
DM-MIMD implementations
  • Massively parallel systems (NUMA)
      • communication-oriented architecture
      • low-latency, high-bandwidth interconnects
      • topologies: hypercube, torus, tree
      • Butterfly networks, Omega networks
      • communication engineered from the ground up
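
To make the hypercube and torus topologies concrete, the sketch below computes the direct neighbours of a node in each. It assumes nodes are identified by an integer id (hypercube) or a coordinate tuple (torus); both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def hypercube_neighbors(node: int, dimensions: int) -> List[int]:
    """Neighbours differ from `node` in exactly one bit of its binary id."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def torus_neighbors(coord: Sequence[int], shape: Sequence[int]) -> List[Tuple[int, ...]]:
    """Neighbours are one step away along each axis, with wraparound."""
    neighbors = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            moved = list(coord)
            moved[axis] = (coord[axis] + step) % size  # wrap at the edges
            neighbors.append(tuple(moved))
    return neighbors


print(hypercube_neighbors(0b101, 3))           # node 5 in an 8-node hypercube
print(torus_neighbors((0, 0, 0), (4, 4, 4)))   # corner node of a 4x4x4 torus
```

In a d-dimensional hypercube every node has d links, while in a 3-D torus every node has 6 regardless of machine size (for axes longer than 2).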

15
DM-MIMD implementations
  • Commodity clusters
      • a cluster is a collection of connected,
        independent computers working in unison to solve
        a problem
      • COTS technology
      • nodes are interconnected by Ethernet LAN,
        Myrinet, QsNet ELAN etc.
      • computation can be performed using popular
        programming toolkits and frameworks: OpenMP, MPI
      • clusters require dedicated management software
16
NUMA implementations
  • Cray T3E-1350
      • Processor: Alpha 21164, 675 MHz
      • Number of CPUs: 40 - 2176
      • 3-D torus topology
      • Operating system: UNICOS/mk (microkernel-based)
      • Peak performance: 3 TFLOPS

17
Commodity cluster implementation (1)
  • Linux Networx/Quadrics
      • Processor: Intel Xeon 2.4 GHz
      • CPUs: 2304
      • Interconnect: QsNet ELAN3
      • Operating system: Linux, management tools,
        Lustre Cluster File System
      • Peak performance: 7.6 TFLOPS
      • 3rd computer on the TOP500 list
      • Developed for Lawrence Livermore National
        Laboratory in 2002

18
Commodity cluster implementation (2)
  • HP XC6000 Cluster (XC3000 Cluster)
      • Processor: Intel Itanium 2 6M 1.5 GHz (Intel Xeon
        3 GHz)
      • Node: HP Integrity rx2600 (HP ProLiant DL380)
      • Number of processors: 34-512
      • Interconnect: QsNet ELAN3 (Myricom Myrinet
        XP)
      • Operating system: Linux, SSI middleware,
        management tools, Lustre Cluster File System
      • Peak performance: 34 CPUs - 204 GFLOPS, 512 CPUs
        - 3 TFLOPS
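
The peak figures for the Itanium 2 configuration can be reproduced as processors x clock rate x floating-point operations per cycle. The sketch below assumes 4 operations per cycle for Itanium 2 (two fused multiply-add units), which at 1.5 GHz gives the 6 GFLOPS per processor quoted in the Itanium 2 slide of this deck; the function and constant names are hypothetical.

```python
def peak_gflops(cpus: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: every CPU retires `flops_per_cycle` every clock cycle."""
    return cpus * clock_ghz * flops_per_cycle


ITANIUM2_CLOCK_GHZ = 1.5
ITANIUM2_FLOPS_PER_CYCLE = 4  # assumption: two fused multiply-add units

for cpus in (34, 512):
    gflops = peak_gflops(cpus, ITANIUM2_CLOCK_GHZ, ITANIUM2_FLOPS_PER_CYCLE)
    print(f"{cpus} CPUs: {gflops:.0f} GFLOPS")
```

Peak performance is an upper bound; measured results such as the Linpack numbers used for TOP500 rankings are lower.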

19
Commodity Clusters - software
  • Operating system - Linux or SSI Linux (Single
    System Image)
  • Platform for specialized applications for
    science, engineering and business (simulation,
    modeling, data mining)
  • Distributed computation environments are used for
    software development (OpenMP, MPI)
  • Common supercomputer applications require porting
    to clusters

20
Performance Scaling
[Diagram: Scale Right; Scale-Up (SMP, ccNUMA); Scale-Out (Cluster)]

21
Processors (1)
  • Many types of existing processors are used in
    supercomputers
  • Microprocessor development directions
      • Increasing clock frequency and speeding up
        instruction stream processing
      • Processing a large collection of data in a single
        processor instruction - SIMD
      • Control path multiplication - multithreading

22
Processors (2)
  • Vector processors
      • NEC SX-6
      • Cray (Cray X1)
  • RISC processors
      • MIPS
      • IBM Power4
      • Alpha
  • CISC processors
      • IA32
      • AMD x86-64
  • VLIW processors
      • IA64

23
Intel Itanium 2 features
  • State-of-the-art unconventional 64-bit
    architecture
  • New programming model implementing VLIW paradigm
  • EPIC technology (Explicitly Parallel Instruction
    Computing) - the compiler determines instruction
    dependencies and informs the processor how to
    process the instruction stream in parallel
  • Many registers (128 64-bit), register stack
    management
  • 6 GFLOPS peak performance
  • The full advantages of the processor can be
    exploited by a dedicated compiler
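
The EPIC idea of letting the compiler mark which instructions may run in parallel can be illustrated with a toy grouping pass. This is a simplified sketch, not the IA-64 bundle format or any real compiler's algorithm: instructions are modelled as a destination register plus source registers, and a new group starts whenever an instruction reads or writes a register written earlier in the current group. All names are hypothetical.

```python
from typing import List, Tuple

# (destination register, source registers)
Instr = Tuple[str, Tuple[str, ...]]


def instruction_groups(program: List[Instr]) -> List[List[Instr]]:
    """Split an in-order program into groups of mutually independent instructions."""
    groups: List[List[Instr]] = []
    current: List[Instr] = []
    written = set()  # registers written by the current group
    for dest, sources in program:
        depends = dest in written or any(src in written for src in sources)
        if depends and current:
            groups.append(current)        # close the group (a "stop")
            current, written = [], set()
        current.append((dest, sources))
        written.add(dest)
    if current:
        groups.append(current)
    return groups


program = [
    ("r1", ("r10", "r11")),
    ("r2", ("r12", "r13")),   # independent of r1
    ("r3", ("r1", "r2")),     # needs r1 and r2, so it starts a new group
    ("r4", ("r14",)),         # independent of r3
]
for number, group in enumerate(instruction_groups(program), start=1):
    print(number, [dest for dest, _ in group])
```

Because the grouping is decided ahead of time, the processor does not need hardware to discover these dependencies at run time, which is the division of labour the slide describes.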

24
Operating systems
  • Monolithic kernel based OSs - UNIX (modification
    of existing solutions)
      • BSD
      • Solaris
      • Irix
      • Linux
  • Microkernel based OSs
      • Mach

25
Microkernel architecture
[Diagram: Task A, Task B, Task C; Kernel (x2); Hardware (x2)]

26
Summary
  • Today there are many supercomputer
    architectures
  • Both vector processors and common RISC, CISC,
    VLIW chips are used for supercomputers
  • Commodity clusters running Linux are an
    attractive way to implement a supercomputer

27
TOP 500 list (1)
1. Earth Simulator, NEC - 35.86 TFLOPS
2. HP Alphaserver SC, HP - 13.88 TFLOPS
3. Linux Networx / Quadrics IA32 - 7.634 TFLOPS

28
Top 500 list (2)