Manycore Algorithms - PowerPoint PPT Presentation

About This Presentation
Title:

Manycore Algorithms

Description:

Intel dual-core Xeon 5150. Input: Random List of 2^26 elements. Concluding Remarks ... to accommodate the unique properties and peculiarities of multi-core structures? ...

Provided by: davida8

Transcript and Presenter's Notes

Title: Manycore Algorithms


1
Manycore Algorithms
  • David A. Bader

2
Driving Manycore Applications
Revolutionary change in applications for the 21st
century!
3
HPC's success has been its failure!
For decades, HPC has been in a vicious cycle of
enabling applications that run well on HPC
systems.
  • Manycore's disruption can open the door for a
    revolutionary transformation!
  • For the first time in decades, manycore will
    allow innovation in real-world algorithms

4
Enabling 21st Century Applications (slightly
modified slide from 7 YEARS ago!)
  • Manycore apps will require:
      • Integer performance
      • Strings, lists, trees, graphs
      • Combinatorics
      • Optimization
      • Computational geometry
      • Irregular data accesses
      • Dynamic programming, backtracking
      • Heuristics and solutions to NP-hard problems
  • Innovate: tomorrow's applications must drive
    manycore!
  • Current HPC systems are designed for
    physics-based simulations that use:
      • Floating-point, linear algebra
          • Top 500 List measures Linpack!
      • Regular operations (high degree of locality)
          • e.g., matrices, FFT, CG
      • Low-order polynomial-time algorithms
  • Focus of current HPC/petascale algorithms:
      • Dense linear algebra
      • Sparse linear algebra
      • FFT or multi-grid
      • Global scatter-gather operations
      • Dynamically evolving coordinate grids
      • Dynamic load-balancing
      • Particle-based or lattice-gas algorithms
      • Continuum equation solvers

5
May You Live in Tumultuous Times.
May You Live in Interesting Times.
6
Manycore Challenge: List Ranking
  • Challenge: Given a linked list stored in memory,
    find the distance from each node to the head
  • Sequential approach is trivial (2 lines of code,
    linear time)
  • Linear speedup with the number of processors in
    theory (PRAM)
  • No speedup has ever been reported using MPI
  • Rationale: List ranking is the basis for many
    irregular parallel algorithms, and is
    representative of many client applications.
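
The contrast between the trivial sequential walk and the PRAM-style approach can be sketched as follows. This is an illustrative sketch only, not code from the presentation or from any library: the array-of-successors layout, the NIL marker, and all function names are assumptions, and the parallel version uses classic pointer jumping with each synchronous round emulated by a list comprehension. It also assumes every array slot belongs to the single list.

```python
from typing import List, Tuple

NIL = -1  # assumed marker for "no successor" (end of the list)


def succ_from_order(order: List[int]) -> Tuple[List[int], int]:
    """Build a successor array from the node order; returns (succ, head)."""
    succ = [NIL] * len(order)
    for a, b in zip(order, order[1:]):
        succ[a] = b
    return succ, order[0]


def rank_sequential(succ: List[int], head: int) -> List[int]:
    """Walk the list once from the head: linear time, inherently serial."""
    rank = [0] * len(succ)
    node, dist = head, 0
    while node != NIL:
        rank[node] = dist
        node, dist = succ[node], dist + 1
    return rank


def rank_pointer_jumping(succ: List[int], head: int) -> List[int]:
    """Pointer jumping: O(log n) synchronous rounds, O(n log n) total work.

    Each round, every node adds its predecessor's partial distance and then
    jumps its predecessor pointer twice as far back. On a PRAM all nodes
    would do this at once; here each round builds new arrays from old ones.
    """
    n = len(succ)
    pred = [0] * n
    pred[head] = head  # the head points to itself and contributes distance 0
    for i in range(n):
        if succ[i] != NIL:
            pred[succ[i]] = i
    rank = [0 if i == head else 1 for i in range(n)]
    for _ in range((n - 1).bit_length()):  # enough rounds to cover distance n-1
        rank = [rank[i] + rank[pred[i]] for i in range(n)]
        pred = [pred[pred[i]] for i in range(n)]
    return rank


if __name__ == "__main__":
    succ, head = succ_from_order([3, 0, 4, 1, 2])
    print(rank_sequential(succ, head))
    print(rank_pointer_jumping(succ, head))
```

The pointer-jumping version touches memory in an irregular, data-dependent pattern in every round, which is the property that makes list ranking a hard case for distributed-memory and cache-based machines.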

Input: random list of 2^26 elements
  • SWARM: SoftWare and Algorithms for Running on
    Manycore
  • Supported by Microsoft Research Faculty Award in
    Parallelism and Concurrency
  • Library of efficient implementations of parallel
    programming primitives and example kernels
      • Prefix-sums, pointer-jumping, list ranking,
        divide and conquer, pipelining, graph
        algorithms, symmetry breaking
  • Computational model for analyzing algorithms on
    multicore and manycore systems
  • Portable
      • Microsoft Visual Studio, Linux, AIX, Solaris
      • Intel Xeon, AMD Opteron, IBM Power6, Sun
        UltraSPARC T1
  • Shared Source under the Microsoft Permissive
    License (Ms-PL)
  • http://multicore-swarm.sourceforge.net/
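
To make one of the listed primitives concrete, here is a minimal sketch of a blocked prefix-sums computation. It does not use or reproduce SWARM's API; the function name, the `workers` parameter, and the choice of Python's ThreadPoolExecutor are all assumptions made for illustration. Python threads will not give a real speedup on this workload, so the sketch only shows the three-phase structure: per-block totals in parallel, a short serial scan over the totals, then per-block scans in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate
from typing import List, Tuple


def blocked_prefix_sums(values: List[int], workers: int = 4) -> List[int]:
    """Inclusive prefix sums computed block by block (hypothetical helper)."""
    n = len(values)
    if n == 0:
        return []
    workers = max(1, min(workers, n))
    size = -(-n // workers)  # ceiling division: elements per block
    blocks = [values[i:i + size] for i in range(0, n, size)]

    def scan_block(args: Tuple[List[int], int]) -> List[int]:
        block, offset = args
        return [offset + s for s in accumulate(block)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: each worker sums its own block independently.
        totals = list(pool.map(sum, blocks))
        # Phase 2: serial scan over one total per block (cheap: few blocks).
        offsets = [0] + list(accumulate(totals))[:-1]
        # Phase 3: each worker scans its block, shifted by its offset.
        scanned = list(pool.map(scan_block, zip(blocks, offsets)))
    return [x for block in scanned for x in block]


if __name__ == "__main__":
    print(blocked_prefix_sums([1, 2, 3, 4, 5, 6, 7], workers=3))
```

Only phase 2 is serial, and it handles one value per block, so the work per worker stays proportional to the block size as the input grows.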

7
Concluding Remarks
  • Is manycore the next CB radio or is it for real?
      • It's here, baby! Sit back, relax, and enjoy
        the ride!
  • What will multicore do to the computing
    ecosystem?
      • Require real-world algorithmic innovations for
        the first major time in several decades
  • Can innovative algorithmic techniques exploit the
    opportunities and address the challenges of
    multi-core?
      • Absolutely, aided by synergistic architectural
        components
  • How will programming models and supporting system
    software change to accommodate the unique
    properties and peculiarities of multi-core
    structures?
      • Composability is a must. Models must give a
        performance advantage for using architectural
        features
      • Challenges for the memory hierarchy between
        memory and on-chip resources.
      • Processors cheap, memory bandwidth expensive.
        Compute in the memory when possible →
        transactions!
  • Where will the parallelism come from?
    Hw/Sw/Compiler/OS?
      • The user → explicitly parallel algorithms, and
        systems → architecture-driven thread-level
        speculation
  • Is a single programming model the right solution?
      • No. Pick the right tool for the problem
        domain: specialized programming models
  • Do we have to use all those 1024 cores?
      • Yes. While there's no more free lunch in
        computing, most problems can reveal this
        amount of concurrency.

8
Acknowledgment of Support
  • National Science Foundation
      • CSR: A Framework for Optimizing Scientific
        Applications (06-14915)
      • CAREER: High-Performance Algorithms for
        Scientific Applications (06-11589, 00-93039)
      • ITR: Building the Tree of Life -- A National
        Resource for Phyloinformatics and Computational
        Phylogenetics (EF/BIO 03-31654)
      • DBI: Acquisition of a High Performance
        Shared-Memory Computer for Computational
        Science and Engineering (04-20513)
  • IBM PERCS / DARPA High Productivity Computing
    Systems (HPCS)
      • DARPA Contract NBCH30390004
  • IBM Shared University Research (SUR) Grant
  • Sony-Toshiba-IBM (STI)
  • Sun Academic Excellence Grant
  • Microsoft Research