1
Cluster Basics
2
Moore's Law
3
Cluster Pioneers
  • In the mid-1990s, the Network of Workstations project
    (UC Berkeley) and the Beowulf Project (NASA) asked
    the question:

Can You Build a High Performance Machine
From Commodity Components?
4
High Performance Cluster
  • Constructed with many compute nodes and often a
    high-performance interconnect

5
The Answer is Yes
6
Types of Clusters
  • High Availability
    • Generally small (less than 8 nodes)
  • Visualization
  • High Performance
    • Computational tools for scientific computing
    • Large database machines

7
High Availability Cluster
  • Composed of redundant components and multiple
    communication paths

8
Visualization Cluster
  • Each node in the cluster drives a display

9
Cluster Hardware Components
10
Common Cluster Processors
  • Pentium/Athlon
  • Opteron/EM64T
  • Itanium
  • PowerPC

11
SPEC Benchmark
12
Processors
  (Chart comparing Itanium 2, PowerPC 970, and Pentium 4
  processor performance)
13
Interconnects
14
Interconnects
  • Ethernet
  • Most prevalent on clusters
  • Low-latency interconnects
  • Myrinet
  • Infiniband
  • Quadrics

15
Why Low-Latency Interconnects?
  • Performance
  • Lower latency
  • Higher bandwidth
  • Accomplished through OS-bypass

16
Why Low-Latency Interconnects?
  (Chart comparing Infiniband, Myrinet, and Quadrics
  interconnect performance)
17
How Low-Latency Interconnects Work
  • Decrease latency for a packet by reducing the
    number of memory copies per packet (see the
    measurement sketch below)
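  The MPI latency and bandwidth figures quoted for Myrinet, Quadrics,
  Infiniband, and Ethernet on the following slides are typically
  measured with a ping-pong microbenchmark. The sketch below is a
  minimal, illustrative version (not from the presentation); it assumes
  an MPI implementation such as MPICH and two ranks on separate nodes.

    /* Minimal MPI ping-pong sketch: ranks 0 and 1 bounce a message back
     * and forth.  Half the average round-trip time approximates one-way
     * latency; message size divided by that time approximates bandwidth.
     * Run with a tiny message (default 1 byte) for latency, or pass a
     * large size (e.g., 1048576) on the command line for bandwidth. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int iters = 1000;
        int size = (argc > 1) ? atoi(argv[1]) : 1;   /* message bytes */
        char *buf = calloc(size, 1);
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double one_way = (MPI_Wtime() - t0) / (2.0 * iters);

        if (rank == 0)
            printf("one-way time: %.2f us, bandwidth: %.1f MB/s\n",
                   one_way * 1e6, (size / one_way) / 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }

  For example, "mpirun -np 2 ./pingpong" (hypothetical binary name)
  reports latency, and "mpirun -np 2 ./pingpong 1048576" reports
  large-message bandwidth; production benchmarks add warm-up iterations
  and sweep many message sizes.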

18
Myrinet
  • Long-time interconnect vendor
  • Delivering products since 1995
  • Delivers a single 256-port full bisection bandwidth
    switch
  • MPI Performance
    • Latency: 6.3 µs
    • Bandwidth: 245 MB/s
  • Cost/port (based on 64-port configuration): $1,000
    • Switch + NIC + cable
  • http://www.myri.com/myrinet/product_list.html

19
Quadrics
  • QsNetII E-series
  • Delivers 128-port standalone switches
  • MPI Performance
    • Latency: 3 µs
    • Bandwidth: 900 MB/s
  • Cost/port (based on 64-port configuration): $1,800
    • Switch + NIC + cable
  • http://doc.quadrics.com/Quadrics/QuadricsHome.nsf/
    DisplayPages/A3EE4AED738B6E2480256DD30057B227

20
Infiniband
  • Up to 288 ports in a single switch
  • MPI Performance
    • Latency: 5.9 µs
    • Bandwidth: 860 MB/s
  • Estimated cost/port (based on 64-port
    configuration): $1,700 - $3,000
    • Switch + NIC + cable
  • http://www.amaxit.com/amaxcorp/products/mtipp.asp

21
Infiniband - Low Cost
  • Mellanox is offering an IB interface chip for $69
    • InfiniHost III Lx
  • Targeting:
    • "Landed on motherboard"
    • Entry-level cards

22
Infiniband - High Performance
  • PathScale's InfiniPath HTX adapter
  • Connects to HyperTransport
  • 1.5 µs latency
  • 1.8 GB/s bandwidth

23
Ethernet
  • Latency: 60 µs
  • Bandwidth: 100 MB/s
  • Top500 list has Ethernet-based systems sustaining
    between 35-59% of peak

24
Application Benefits
25
Bisection Bandwidth
  • Definition: If you split the system in half, what is
    the maximum amount of data that can pass between the
    two halves?
  • Assuming 1 Gb/s links:
    • Bisection bandwidth = 1 Gb/s

26
Bisection Bandwidth
  • Assuming 1 Gb/s links:
    • Bisection bandwidth = 2 Gb/s

27
Bisection Bandwidth
  • Definition: Full bisection bandwidth means that the
    network topology can support N/2 simultaneous
    communication streams.
  • That is, the nodes on one half of the network can
    communicate with the nodes on the other half at
    full speed.

28
Large Networks
  • When you run out of ports on a single switch, you
    must add another network stage
  • In the example above, assuming 1 Gb/s links, uplinks
    from stage-1 switches to stage-2 switches must
    carry at least 6 Gb/s

29
Large Networks
  • With low-port-count switches, you need many switches
    on large systems in order to maintain full bisection
    bandwidth
  • A 128-node system with 32-port switches requires 12
    switches and 256 total cables (see the sketch below)
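  As a rough check of those counts, the sketch below works through the
  arithmetic for a two-stage, full-bisection network built from 32-port
  switches, assuming half of each stage-1 switch's ports face compute
  nodes and half are uplinks (an illustrative assumption, since the
  original figure is not reproduced here).

    /* Sizing a two-stage full-bisection network: 128 nodes, 32-port
     * switches, half of each stage-1 (leaf) switch used for nodes and
     * half for uplinks to stage-2 (spine) switches. */
    #include <stdio.h>

    int main(void)
    {
        const int nodes        = 128;
        const int switch_ports = 32;

        int down_per_leaf  = switch_ports / 2;             /* 16 node ports */
        int up_per_leaf    = switch_ports / 2;             /* 16 uplinks    */
        int leaf_switches  = nodes / down_per_leaf;        /* 128/16 = 8    */
        int uplinks        = leaf_switches * up_per_leaf;  /* 8*16   = 128  */
        int spine_switches = uplinks / switch_ports;       /* 128/32 = 4    */

        printf("switches: %d leaf + %d spine = %d total\n",
               leaf_switches, spine_switches, leaf_switches + spine_switches);
        printf("cables:   %d node links + %d uplinks = %d total\n",
               nodes, uplinks, nodes + uplinks);
        return 0;
    }

  With these assumptions the program prints 12 switches and 256 cables,
  matching the figures on the slide.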

30
Rockstar Topology
  • 24-port switches
  • Not a symmetric network
  • Best case: 4:1 bisection bandwidth
  • Worst case: 8:1
  • Average: 5.3:1

31
Ethernet
  • What we did with 128 nodes and a $13,000 Ethernet
    network:
    • $101/port
    • $25/port with our latest Gigabit Ethernet switch
  • Sustained 48% of peak
  • With Myrinet, we would have sustained 1 Tflop
    • At a cost of $130,000
    • Roughly 1/3 the cost of the system (see the
      sketch below)
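  The per-port figures follow directly from the totals; the sketch
  below reproduces the arithmetic, using the roughly $1,000/port cost
  quoted on the earlier Myrinet slide (carrying that figure over to
  this slide is an assumption, not something stated here).

    /* Cost arithmetic for a 128-node cluster network (illustrative). */
    #include <stdio.h>

    int main(void)
    {
        const int    nodes            = 128;
        const double ethernet_total   = 13000.0;  /* $13,000 Ethernet network  */
        const double myrinet_per_port = 1000.0;   /* ~$1,000/port, Myrinet slide */

        printf("Ethernet cost/port: $%.2f\n", ethernet_total / nodes);   /* about $101  */
        printf("Myrinet total:      $%.0f\n", myrinet_per_port * nodes); /* $128,000    */
        return 0;
    }

  The Myrinet estimate of roughly $128,000 is consistent with the
  $130,000 figure on the slide.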

32
Processor/Interconnect Roundup
  (Chart: performance roundup of Itanium, PowerPC, and
  Pentium processors paired with Infiniband, Myrinet,
  and Quadrics interconnects)
33
Storage
34
Local Storage
  • Exported to compute nodes via NFS

35
Network Attached Storage
  • A NAS box is an embedded NFS appliance

36
Storage Area Network
  • Provides a disk block interface over a network
    (Fibre Channel or Ethernet)
  • Moves the shared disks out of the servers and
    onto the network
  • Still requires a central service to coordinate
    file system operations

37
Parallel Virtual File System
38
Lustre
  • Open Source
  • Object-based storage
  • Files become objects, not blocks

39
Cluster Software
40
Cluster Software Stack
  • Linux Kernel/Environment
  • RedHat, SuSE, Debian, etc.

41
Cluster Software Stack
  • HPC Device Drivers
  • Interconnect driver (e.g., Myrinet, Infiniband,
    Quadrics)
  • Storage drivers (e.g., PVFS)

42
Cluster Software Stack
  • Job Scheduling and Launching
  • Sun Grid Engine (SGE)
  • Portable Batch System (PBS)
  • Load Sharing Facility (LSF)

43
Cluster Software Stack
  • Cluster Software Management
  • E.g., Rocks, OSCAR, Scyld

44
Cluster Software Stack
  • Cluster State Management and Monitoring
  • Monitoring: Ganglia, Clumon, Nagios, Tripwire,
    Big Brother
  • Management: Node naming and configuration (e.g.,
    DHCP)

45
Cluster Software Stack
  • Message Passing and Communication Layer
  • E.g., Sockets, MPICH, PVM (see the sketch below)
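  As a concrete illustration of this layer, the sketch below is a
  minimal MPI program (assuming an MPI implementation such as MPICH
  from the list above, run with at least two ranks) in which rank 1
  sends a greeting that rank 0 receives and prints; it complements the
  timing loop sketched earlier.

    /* Minimal MPI message-passing sketch: rank 1 sends a string and
     * rank 0 receives and prints it.  Illustrative example only. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        char msg[64];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        if (rank == 1) {
            snprintf(msg, sizeof msg, "hello from rank 1 of %d", nprocs);
            MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0) {
            MPI_Recv(msg, sizeof msg, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("%s\n", msg);
        }

        MPI_Finalize();
        return 0;
    }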

46
Cluster Software Stack
  • Parallel Code / Web Farm / Grid / Computer Lab
  • Locally developed code

47
Cluster Software Stack
  • Questions
  • How to deploy this stack across every machine in
    the cluster?
  • How to keep this stack consistent across every
    machine?

48
Software Deployment
  • Known methods:
    • Manual Approach
    • Add-on method
      • Bring up a frontend, then add cluster packages
      • OpenMosix, OSCAR, Warewulf
    • Integrated
      • Cluster packages are added at frontend
        installation time
      • Rocks, Scyld