Transcript: Microsoft HPC Institute Bi-Annual Meeting

1
Microsoft HPC Institute Bi-Annual Meeting
Jack Dongarra
Innovative Computing Laboratory
University of Tennessee and Oak Ridge National Laboratory
2
Our Microsoft Cluster
  • 24 custom-built nodes from TeamHPC
  • Dual-socket AMD Opteron 265 (dual-core) 1.8 GHz
    processors (96 cores in total)
  • 4 GB RAM / node
  • 80 GB SATA hard drive / node
  • Windows Server 2003 R2 x64 Edition
  • Microsoft Compute Cluster Edition 2003
  • NForce Gigabit NIC
  • SilverStorm 10 Gb/s InfiniBand NICs
  • Coming soon
  • Mellanox 20 Gb/s DDR InfiniBand NICs
  • Drivers don't support dual cards today
  • Myricom 10G 16-port switch and NICs

3
Three Thrust Research Areas
  • Numerical Linear Algebra Algorithms and Software
  • BLAS, LAPACK, ScaLAPACK, PBLAS, ATLAS
  • Numerical Libraries for Multicore
  • Self-Adapting Numerical Software (SANS) Effort
  • Generic Code Optimization, ATLAS
  • LAPACK for Clusters: easy access to clusters
  • Access to clusters for linear algebra software via
    Matlab, Mathematica, Python, etc. on the front end
  • Heterogeneous Distributed Computing
  • PVM, MPI
  • GridSolve, FT-MPI, Open-MPI
  • Performance Evaluation
  • PAPI, HPC Challenge

4
GridSolve
  • Grid-based hardware/software/data server
  • RPC-style (GridRPC) clients
  • Clients do not need to know about services in advance
  • Components: agent, servers, proxy
  • Service discovery, dynamic problem solving, load
    balancing, fault tolerance, asynchronicity,
    disconnected operation, NAT tolerance
  • Easy, transparent access to resources
  • Clients: Matlab, C, Fortran; NetSolve: Mathematica,
    Octave, Java
  • Ease of use is the paramount goal (a client-side
    call sketch follows this list)
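As a rough illustration of the RPC style described above, here is a minimal C sketch assuming the standard GridRPC API that GridSolve implements (grpc_initialize, grpc_function_handle_default, grpc_call); the service name "dgesv" and the configuration file name are illustrative placeholders, not taken from the slide.

    /* Sketch of a GridRPC-style client, not verbatim GridSolve code. */
    #include <stdio.h>
    #include "grpc.h"

    int main(void)
    {
        grpc_function_handle_t handle;

        /* Contact the agent named in the client configuration file. */
        if (grpc_initialize("gridsolve_client.cfg") != GRPC_NO_ERROR) {
            fprintf(stderr, "GridRPC initialization failed\n");
            return 1;
        }

        /* Bind the handle to a service; the agent chooses a server
           (service discovery, scheduling, load balancing). */
        grpc_function_handle_default(&handle, "dgesv");

        /* Blocking remote call; the argument list depends on the
           service signature, so it is elided here:
           grpc_call(&handle, ...); */

        grpc_function_handle_destruct(&handle);
        grpc_finalize();
        return 0;
    }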

5
GridSolve Architecture
[Diagram: GridSolve architecture. The client sends a request to the
agent, which performs resource discovery, scheduling, load balancing,
and fault tolerance and returns a server list; the client then sends
its data to a server on one of the clusters and receives the result.]
6
GridSolve on Windows Cluster
  • Several efforts to get GridSolve working with
    Windows
  • Native Windows client (note: client only)
  • Options
  • Using Cygwin (agent and server)
  • Using SUA (Subsystem for UNIX-based Applications;
    Interix, SFU)
  • Under development: native Windows agent and server

7
Performance Evaluation Tools (http://icl.cs.utk.edu/papi)
  • Performance Application Programming
    Interface (PAPI)
  • A portable library for accessing the hardware
    counters found on processors (a usage sketch in C
    follows this list)
  • Provides a standardized list of performance
    metrics
  • KOJAK (joint work with Felix Wolf)
  • Software package for the automatic performance
    analysis of parallel applications
  • Message passing and multi-threading
    (MPI and/or OpenMP)
  • Parallel performance
  • CPU and memory performance
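The sketch below shows how a program can read hardware counters through PAPI's classic high-level C interface; the chosen events and the measured loop are illustrative only.

    /* Count total cycles and floating-point operations around a loop. */
    #include <stdio.h>
    #include <papi.h>

    int main(void)
    {
        int events[2] = { PAPI_TOT_CYC, PAPI_FP_OPS };
        long long values[2];
        double x = 0.0;
        int i;

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return 1;
        if (PAPI_start_counters(events, 2) != PAPI_OK)
            return 1;

        for (i = 0; i < 1000000; i++)   /* work being measured */
            x += (double)i * 0.5;

        if (PAPI_stop_counters(values, 2) != PAPI_OK)
            return 1;

        printf("cycles = %lld, fp ops = %lld (x = %g)\n",
               values[0], values[1], x);
        return 0;
    }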

8
Where's PAPI?
  • PAPI runs on most modern processors and operating
    systems of interest to HPC
  • IBM POWER3, 4, 5 / AIX
  • POWER4, 5 / Linux
  • PowerPC-32 and -64 / Linux
  • Blue Gene / CNK
  • Intel Pentium II, III, 4, M, D, EM64T, etc. /
    Linux
  • Intel Itanium
  • AMD Athlon, Opteron / Linux
  • Cray T3E, X1, XD3, XT3 / Catamount
  • Altix, Sparc, ...

9
Perfometer
Call Perfometer(red)
Call Perfometer(blue)
10
Tools Using PAPI, e.g., Perfometer
11

12
PAPI Design
13
PAPI / Windows Limitations
  • Counter state isn't saved on context switch
  • Can only count CPU-wide
  • Can't migrate tasks (using processor affinity)
  • Can't share counters among users
  • Need kernel modifications
  • To preserve counter state on context switch

14
Linear Algebra Software Packages (http://icl.cs.utk.edu/lapack/)
  • LAPACK
  • Used by Matlab, Mathematica, Numeric Python, ...
    (an example call in C follows this list)
  • Tuned versions provided by vendors (AMD, Apple,
    Compaq, Cray, Fujitsu, Hewlett-Packard, Hitachi,
    IBM, Intel, MathWorks, NAG, NEC, PGI, SUN,
    Visual Numerics) and by most Linux distributions
    (Fedora, Debian, Cygwin, ...)
  • Ongoing work: multicore, performance, accuracy,
    extended precision, ease of use
  • ScaLAPACK
  • Parallel implementation of LAPACK, scaling on
    parallel hardware from 10s to 100s to 1000s of
    processors
  • Ongoing work: target new architectures and new
    parallel environments, for example a port to the
    Microsoft HPC cluster solution
  • LAPACK for Clusters (LFC)
  • Most of ScaLAPACK's functionality from serial
    clients (Matlab, Python, Mathematica)
  • Ongoing work: looking at sparse data and I/O
    scenarios, web services
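As a concrete example of what LAPACK provides, here is a minimal C sketch that solves a small dense system with dgesv through the Fortran interface; column-major storage and the trailing-underscore name mangling are assumptions that hold for most Unix-style compilers.

    #include <stdio.h>

    /* Fortran LAPACK routine: solves A * X = B via LU factorization. */
    extern void dgesv_(int *n, int *nrhs, double *a, int *lda,
                       int *ipiv, double *b, int *ldb, int *info);

    int main(void)
    {
        /* A stored column by column: [ 3 1 ; 1 2 ] */
        double a[4] = { 3.0, 1.0, 1.0, 2.0 };
        double b[2] = { 9.0, 8.0 };   /* right-hand side, overwritten by x */
        int n = 2, nrhs = 1, lda = 2, ldb = 2, ipiv[2], info;

        dgesv_(&n, &nrhs, a, &lda, ipiv, b, &ldb, &info);

        if (info == 0)
            printf("x = (%g, %g)\n", b[0], b[1]);   /* expect (2, 3) */
        else
            printf("dgesv failed, info = %d\n", info);
        return 0;
    }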

15
Parallelism in LAPACK / ScaLAPACK
[Diagram: software stack. Distributed memory: ScaLAPACK built on the
PBLAS and the BLACS, with parallelism via MPI. Shared memory: LAPACK
built on ATLAS / specialized BLAS, with parallelism via threads.]
16
Installation and testing of LAPACK, BLACS, and
ScaLAPACK
  • Uses Intel ifort and icl; BLAS from Intel MKL
  • No problems at installation: LAPACK, BLACS, and
    ScaLAPACK use Makefiles with make.inc files to
    set the environment variables
  • Tests are fine
  • (Note that the BLACS tests are a fairly hard test
    for MPI, and the LAPACK tests a fairly hard test
    for IEEE arithmetic and compiler semantics.)
  • Can be used from Microsoft Visual Studio
  • DLLs coming soon

17
LFC Overview
  • Client/Server
  • Separation via sockets
  • Server objects are large
  • Server runs in parallel
  • Client objects hold references to server objects
  • Process spawning
  • Separation via system calls
  • mpirun/mpiexec starts the parallel job

[Diagram: client and server processes connected through a tunnel
(IP, TCP, ...), possibly across a firewall.]
18
FT-MPI (http://icl.cs.utk.edu/ft-mpi/)
  • Defines the behavior of MPI in the event a failure
    occurs at the process level (a minimal sketch of
    the underlying error-notification idea follows
    this list)
  • FT-MPI is based on MPI 1.3 (plus some MPI 2
    features) with a fault-tolerance model similar to
    what was done in PVM
  • Complete reimplementation, not based on other
    implementations
  • Gives the application the possibility of
    recovering from a process failure
  • A regular, non-fault-tolerant MPI program will
    run using FT-MPI
  • What FT-MPI does not do
  • Recover user data (e.g., automatic checkpointing)
  • Provide transparent fault tolerance
  • Open-MPI for MS
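The sketch below is not the FT-MPI API; it only shows, with standard MPI calls, the error-notification starting point (MPI_ERRORS_RETURN) that FT-MPI extends so an application can learn about a process failure and recover instead of aborting.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, rc;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Ask MPI to return error codes instead of aborting the job. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        rc = MPI_Barrier(MPI_COMM_WORLD);
        if (rc != MPI_SUCCESS) {
            /* Plain MPI leaves the state undefined here; under FT-MPI
               the application is notified and can recover, e.g., by
               rebuilding the communicator, rather than giving up. */
            fprintf(stderr, "rank %d: communication failed (rc=%d)\n",
                    rank, rc);
        }

        MPI_Finalize();
        return 0;
    }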

19
Open-MPI Collaborators
  • Los Alamos National Lab (LA-MPI)
  • Sandia National Lab
  • Indiana U (LAM/MPI)
  • U of Tennessee (FT-MPI)
  • HLRS - U of Stuttgart (PACX-MPI)
  • U of Houston
  • Cisco Systems
  • Mellanox
  • Voltaire
  • Sun
  • IBM
  • Myricom
  • URL: www.open-mpi.org

20
Open-MPI - A Convergence of Ideas
[Diagram: Open MPI as a convergence of FT-MPI (U of TN), LA-MPI (LANL),
LAM/MPI (IU), and PACX-MPI (HLRS), together with OpenRTE; related
inputs include fault detection (LANL, industry), FDDP (semiconductor
manufacturing industry), resilient computing systems, robustness (CSU),
Grid (many), and autonomous computing (many).]
21
HPC Challenge Goals
  • To examine the performance of HPC architectures
    using kernels with more challenging memory access
    patterns than HPL
  • HPL works well on all architectures, even
    cache-based, distributed memory multiprocessors,
    due to
  • Extensive memory reuse
  • Scalability with respect to the amount of
    computation
  • Scalability with respect to the communication
    volume
  • Extensive optimization of the software
  • To complement the Top500 list
  • To provide benchmarks that bound the performance
    of many real applications as a function of memory
    access characteristics, e.g., spatial and
    temporal locality

22
HPCS/HPCC Performance Targets
  • HPL: linear system solve, Ax = b
  • STREAM: vector operations, A = B + s * C
  • FFT: 1-D Fast Fourier Transform, Z = FFT(X)
  • RandomAccess: integer update, T[i] = XOR(T[i], rand)
    (a naive sketch of the STREAM and RandomAccess
    kernels follows this list)
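As a rough illustration of the memory-access contrast between these kernels, the C sketch below implements a naive STREAM triad (streaming, spatially local access) and a RandomAccess-style XOR update (scattered access); the array sizes and the random index stream are illustrative, not the official HPCC code.

    #include <stdio.h>
    #include <stdlib.h>

    #define N (1 << 20)

    int main(void)
    {
        static double A[N], B[N], C[N];
        static unsigned long long T[N];
        double s = 3.0;
        size_t i;

        for (i = 0; i < N; i++) {       /* initialize the arrays */
            B[i] = 1.0; C[i] = 2.0; T[i] = i;
        }

        for (i = 0; i < N; i++)         /* STREAM triad: unit-stride access */
            A[i] = B[i] + s * C[i];

        for (i = 0; i < N; i++) {       /* RandomAccess-style: scattered XOR updates */
            size_t idx = ((size_t)rand() * 2654435761u) % N;
            T[idx] ^= (unsigned long long)rand();
        }

        printf("A[0] = %g, T[0] = %llu\n", A[0], T[0]);
        return 0;
    }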

[Figure: the memory hierarchy (registers with instructions/operands,
cache(s) with lines/blocks, local memory, remote memory with messages,
disk with pages, tape) annotated with the HPC Challenge kernels and
the HPCS performance targets at each level.]
  • HPCC was developed by HPCS to assist in testing
    new HEC systems
  • Each benchmark focuses on a different part of
    the memory hierarchy
  • HPCS performance targets attempt to
  • Flatten the memory hierarchy
  • Improve real application performance
  • Make programming easier

23
(No Transcript)
24
Testbed for Benchmarking
  • Would like to set up a cluster with different
    interconnects
  • GigE
  • Various InfiniBand
  • Myrinet
  • Etc.
  • Different OSs
  • Linux
  • Windows
  • Make it available to the community for testing