Protocols and software for exploiting Myrinet clusters

1
Protocols and software for exploiting Myrinet
clusters
  • Congduc Pham
  • and the main contributors
  • P. Geoffray,
  • L. Prylli,
  • B. Tourancheau,
  • R. Westrelin

2
Parallel machines and clusters
Cplant
Standalone workstation
3
Pros for clusters
  • Large supercomputers are expensive and suffer
    from a short useful life span
  • Performance of workstations and PCs is rapidly
    improving
  • The communications bandwidth between workstations
    is increasing as new networking technologies and
    protocols are implemented in LANs and WANs.
  • Workstation clusters are easier to integrate into
    existing networks than special parallel
    computers.
  • Using clusters of workstations as a distributed
    computing resource is very cost-effective: the
    system can be grown or upgraded incrementally

4
No polemical discussion, just a statement
Figure (from R. Buyya): Mainframe, Mini Computer, Vector Supercomputer, Workstation, PC; 1984; interconnects GigaEthernet, Giganet, SCI, Myrinet
5
The Myrinet technology
  • Switch
    • full crossbar
    • wormhole source routing
    • small latency
  • Network interface
    • embedded RISC processor
    • programmable
    • local memory
    • several DMA engines
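
Source routing can be pictured as follows: the sender places the whole route in front of the packet, and each switch consumes the leading route entry to pick its output port. The sketch below is a toy model of that idea only. The topology, port numbers, and function names are hypothetical, and wormhole flow control (forwarding the head of a packet before its tail has arrived) is not modelled.

```python
def forward(packet, switch_id, topology):
    """Strip the leading route entry and return (next_hop, remaining_packet)."""
    route, payload = packet
    if not route:
        raise ValueError("route exhausted before reaching a host")
    port, rest = route[0], route[1:]
    next_hop = topology[switch_id][port]
    return next_hop, (rest, payload)


def deliver(packet, first_switch, topology):
    """Follow the source route hop by hop until a host is reached."""
    hop = first_switch
    path = []
    while hop in topology:  # anything listed in the topology is a switch
        path.append(hop)
        hop, packet = forward(packet, hop, topology)
    route, payload = packet
    if route:
        raise ValueError("reached a host with unused route entries")
    return hop, path, payload


# Hypothetical topology: switch -> {output port -> neighbour}
TOPOLOGY = {
    "sw0": {0: "hostA", 1: "sw1"},
    "sw1": {0: "sw0", 2: "sw2"},
    "sw2": {3: "hostB"},
}

# The sender chose the route [1, 2, 3]; the switches never look up a destination.
dest, path, payload = deliver(([1, 2, 3], b"hello"), "sw0", TOPOLOGY)
print(dest, path, payload)
```

Because the route is decided at the source, a switch only has to read one entry and select a port, which fits the small switch latency listed above.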

Current specifications
  • Up to 200 MHz processor
  • Up to 8 MB local memory
  • 64-bit/66 MHz PCI bus (528 MB/s peak)
  • 250 MB/s full-duplex links
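
The 528 MB/s peak follows from the bus width and clock: 64 bits is 8 bytes per cycle, at 66 million cycles per second. A small check, where the helper name `peak_bandwidth_mb_s` is illustrative and the link is assumed to carry 250 MB/s in each direction:

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz):
    """Peak rate in MB/s (1 MB = 10**6 bytes), assuming one transfer per cycle."""
    return bus_width_bits // 8 * clock_mhz


pci_peak = peak_bandwidth_mb_s(64, 66)
link_rate = 250  # MB/s per direction, from the slide

print(pci_peak)                   # 528
print(pci_peak >= 2 * link_rate)  # True: the bus peak covers both link directions
```

This is a theoretical peak; sustained PCI throughput is lower in practice.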
6
The raw performance is here, but
  • traditional communication software fails to
    deliver the hardware performance to applications

Figure (speed analogy): Myrinet 200mph; traditional communication layers 40mph, 35mph; optimized communication layers 180mph, 175mph
7
Going faster by taking shortcuts
8
Our communication architecture
  • Provides a complete suite for high-performance
    communications, with a focus on Myrinet-based clusters
  • Organized in layers, but bypasses the OS as much
    as possible

MPI-BIP
BIP
BIP-SMP
programmable NICs break the traditional spatial
distribution of tasks
Myrinet physical layer
9
BIP, the lowest protocol level
  • Basic Interface for Parallelism
  • very basic API
  • provides a library, a kernel module and an MCP
    (Myrinet Control Program, the code running on the NIC)
  • definitely not for the end-user
  • Optimizations for
    • latency
    • maximum throughput
    • the throughput increase
  • The implementation performs
    • reduction of the data critical path
    • distinction between small and large messages
    • burst or write combining for host → NIC
    • optimal cache usage
    • cache snooping for NIC → host (monitoring of the
      PCI bus)
    • buffer alignment
    • optimal fragment size
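
Why an optimal fragment size exists can be seen with a toy pipeline model. A message is cut into fragments that cross several stages in turn (for example host-to-NIC DMA, the link, NIC-to-host DMA), and each fragment pays a fixed overhead per stage. Small fragments overlap well but pay the overhead often; large fragments pay it rarely but fill the pipeline slowly. The model and all its numbers (3 stages, 5 µs overhead, 125 bytes/µs) are assumptions for illustration, not BIP measurements.

```python
import math


def transfer_time_us(msg_bytes, frag_bytes, stages=3,
                     overhead_us=5.0, bytes_per_us=125.0):
    """Modelled time to push a message through a pipeline of identical stages.

    Each fragment costs overhead_us + frag_bytes / bytes_per_us per stage.
    With n fragments and s stages, the last fragment leaves the pipeline
    after (n + s - 1) stage times (fill plus drain).
    """
    n_frags = math.ceil(msg_bytes / frag_bytes)
    stage_time = overhead_us + frag_bytes / bytes_per_us
    return (n_frags + stages - 1) * stage_time


def best_fragment_size(msg_bytes, candidates):
    """Return the candidate fragment size with the smallest modelled time."""
    return min(candidates, key=lambda f: transfer_time_us(msg_bytes, f))


MSG = 1 << 20  # 1 MiB message
SIZES = [256, 1024, 4096, 16384, 65536, MSG]

for size in SIZES:
    print(size, round(transfer_time_us(MSG, size), 1))
print("best:", best_fragment_size(MSG, SIZES))
```

With these made-up parameters the 16 KiB fragment wins among the candidates; both the smallest fragment and the unfragmented message are clearly slower. The real optimum depends on the measured overheads of the hardware.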

10
BIP, small message strategy
  • Avoids handshakes between the host and the NIC
  • Uses PIO to a NIC FIFO on the sending side and an
    extra memory copy on the receiving side

11
BIP, large message strategy
  • Uses DMA on both the send side and the receive
    side: higher bandwidth, and the CPU is offloaded
  • Zero-copy mechanism, pipelined transmission
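
The small-message and large-message strategies amount to a size-based choice between two data paths. The cost model below is a sketch of that trade-off: programmed I/O plus one extra copy has no setup cost but a high per-byte cost, while DMA pays a fixed host/NIC handshake and then moves data cheaply. All rates, the setup cost, and the function names are hypothetical; the slides do not give BIP's actual threshold.

```python
def pio_time_us(n, pio_bytes_per_us=40.0, copy_bytes_per_us=200.0):
    """Small-message path: CPU writes into the NIC FIFO, receiver copies once more."""
    return n / pio_bytes_per_us + n / copy_bytes_per_us


def dma_time_us(n, setup_us=8.0, dma_bytes_per_us=125.0):
    """Large-message path: fixed handshake cost, then zero-copy DMA."""
    return setup_us + n / dma_bytes_per_us


def choose_path(n):
    """Pick the cheaper path for an n-byte message under this model."""
    return "pio" if pio_time_us(n) <= dma_time_us(n) else "dma"


for size in (16, 64, 256, 1024, 4096, 65536):
    print(size, choose_path(size),
          round(pio_time_us(size), 2), round(dma_time_us(size), 2))
```

In this model the crossover sits at a few hundred bytes; below it the handshake dominates, above it the per-byte cost of PIO and the extra copy dominates.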

12
BIP-SMP: a low-level layer for SMP machines
  • SMP nodes (2 or 4 processors) are seen as the
    architecture with the best performance/price ratio
  • BIP-SMP provides
    • management of concurrent accesses to the NIC
    • low-latency intra-node communications
    • inter-node communication equivalent to BIP
    • total transparency for applications and
      end-users
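
The points above can be sketched with a toy model of one SMP node: a single send call hides whether the destination is local (shared memory) or remote (through the NIC), and a lock serializes access to the one shared NIC. The class, its fields, and the rank numbering are hypothetical and are not BIP-SMP's API; Python threads stand in for the processes of a node.

```python
import threading
from collections import deque


class Node:
    """Toy model of one SMP node whose ranks share a single NIC."""

    def __init__(self, local_ranks):
        self.local_ranks = set(local_ranks)
        self.nic_lock = threading.Lock()  # serializes access to the shared NIC
        self.nic_log = []                 # stands in for the NIC send queue
        self.mailboxes = {r: deque() for r in local_ranks}  # shared-memory queues

    def send(self, src, dst, data):
        """Same call for local and remote destinations (transparent to the caller)."""
        if dst in self.local_ranks:
            # Intra-node: hand the data over through shared memory, no NIC involved.
            self.mailboxes[dst].append((src, data))
            return "shared-memory"
        # Inter-node: only one local rank may drive the NIC at a time.
        with self.nic_lock:
            self.nic_log.append((src, dst, data))
        return "nic"


node = Node(local_ranks=[0, 1])  # ranks 2 and 3 live on another node


def worker(rank):
    for i in range(100):
        node.send(rank, 2, i)    # remote: goes through the NIC
    node.send(rank, 1 - rank, b"hi")  # local peer: shared memory


threads = [threading.Thread(target=worker, args=(r,)) for r in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(node.nic_log), len(node.mailboxes[0]), len(node.mailboxes[1]))
```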

13
BIP-SMP: moving data between processes
14
MPI-BIP: the communication middleware
  • MPI-BIP adds high-level features to BIP
  • based on the MPICH implementation
  • provides a portable and widely-used API
  • implements credit-based flow control for small
    messages
  • request FIFO for multiple non-blocking operations
  • provides segmentation/reassembly features to
    avoid timeouts
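
Credit-based flow control can be sketched as follows: the sender holds one credit per free receive buffer, spends a credit for each small message, and queues messages when it has none left; the receiver gives credits back as the application consumes messages. This is a generic illustration of the technique, not the MPI-BIP code, and the class and method names are hypothetical.

```python
from collections import deque


class CreditSender:
    """Sender side: may only transmit while it holds credits."""

    def __init__(self, credits):
        self.credits = credits  # receive buffers we are still allowed to fill
        self.pending = deque()  # messages waiting for a credit

    def send(self, msg, wire):
        """Queue the message, then transmit whatever the credits allow."""
        self.pending.append(msg)
        self._drain(wire)

    def add_credits(self, n, wire):
        """Called when the receiver reports n freed buffers."""
        self.credits += n
        self._drain(wire)

    def _drain(self, wire):
        while self.credits > 0 and self.pending:
            wire.append(self.pending.popleft())
            self.credits -= 1


class CreditReceiver:
    """Receiver side: a fixed pool of buffers for unexpected small messages."""

    def __init__(self, buffers):
        self.free = buffers
        self.buffered = deque()

    def deliver(self, msg):
        if self.free == 0:
            raise RuntimeError("buffer overrun: sender ignored its credits")
        self.free -= 1
        self.buffered.append(msg)

    def consume(self):
        """Application reads one message; returns (message, credits to give back)."""
        msg = self.buffered.popleft()
        self.free += 1
        return msg, 1


wire = []
sender = CreditSender(credits=2)
receiver = CreditReceiver(buffers=2)

for i in range(5):
    sender.send(i, wire)
print(wire, list(sender.pending))  # two messages sent, three held back

for msg in wire:
    receiver.deliver(msg)
wire.clear()

msg, returned = receiver.consume()
sender.add_credits(returned, wire)
print(msg, wire, list(sender.pending))  # one credit came back, one more message went out
```

Since the sender never transmits without a credit, the receiver's buffer pool cannot overflow, whatever the timing of the application's receives.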

15
Working with the BIP software suite
  • Installation
    • run configure
  • Compilation and linkage
    • several libraries: bip, bip-smp, mpi
    • compile with bipcc
  • Submitting jobs and monitoring nodes
    • run myristat to know which nodes are available
    • run bipconf to configure the virtual machine
    • use bipload to launch programs

16
WebCM: a high-level management tool
  • web-based management tool
  • integrates existing solutions into a common
    framework

17
The WebCM user interface
  • graphical interface for myristat and bipconf
  • allows submission of jobs through batch packages
  • shows the user's virtual machine definition and
    the user's running processes
  • new functionality is added by incorporating new
    software packages

18
Latency: BIP and MPI-BIP
19
Throughput: BIP and MPI-BIP
20
BIP-SMP: intra-node communications
21
BIP-SMP: inter-node communications
22
What runs on our clusters?
  • Genomic simulation
  • Fluid dynamics
  • Discrete Event Parallel Simulation
  • Distributed Shared Memory System
  • Want to know more?
    • getting the distribution
    • getting the documentation

http://resam.univ-lyon1.fr