ASCI Red Math Libraries


Provided by: benc2


1
ASCI Red Math Libraries

What Libraries Exist
Libcsmath
Libwc
Others

2
What Math Libraries Exist

Libcsmath (Comp-Sci MATH)
Level 1, 2, and 3 BLAS
1D FFTs
Partial Man pages in R2.8!
LAPACK
BLACS
NX (An integer sum bug has been fixed in R2.8)
MPI
libwc (write-combine Cougar libraries)
ScaLAPACK (Parallel Linear Algebra Package)
PBLAS (Parallel BLAS)

3
LIBCSMATH

Level 1, 2, and 3 BLAS; 1D FFTs
/usr/lib/libcsmath_r.a, /usr/lib/libcsmath_cop.a, /usr/lib/libcsmath.a
_r: tries to split the BLAS/FFTs for you using the compiler
_cop: tries to split the BLAS/FFTs for you using cop
All versions are reentrant except for some level-2 and level-3 complex BLAS
If you do your own parallelism, you will want to explicitly use libcsmath.a.
The official C interface to the BLAS is included.
Linking with -mp or -Mconcur automatically gives you _r.a.
Dual-processor versions are enabled with -proc 2 on the yod line
$TFLOP_XDEV/tflops/lib
See the release notes and upcoming man pages (R2.8)
Works with OSF and Cougar
http://www.cs.utk.edu/~ghenry/distrib
A Linux version can be found here (around 16,000 licenses).
Other interesting kernels: copsync(), xdgemm(), transposition routines

4
Libcsmath R2.8 enhancements

C = AB case where the number of columns of C (N) is small
All DGEMM cases where K is small
K = 24 GEMM cases
K = 64 GEMM cases
GEMM cases where the number of rows (M) is 2
More prefetching done on columns of B
Enhancements to other level-3 kernels
Faster handling of smaller BLAS
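
For readers unfamiliar with the GEMM dimension names used on this slide, here is a minimal reference sketch in C of C = AB with column-major storage, where A is M x K, B is K x N, and C is M x N. The function name ref_dgemm_nn and its argument list are illustrative assumptions; this is an unoptimized loop nest, not the libcsmath kernel or its interface.

```c
#include <assert.h>
#include <stddef.h>

/* Reference C = A*B for column-major (Fortran/BLAS layout) matrices.
 * A is m x k, B is k x n, C is m x n; lda, ldb, ldc are leading dimensions.
 * Hypothetical helper for illustration only, not part of libcsmath. */
static void ref_dgemm_nn(int m, int n, int k,
                         const double *a, int lda,
                         const double *b, int ldb,
                         double *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= m && ldb >= k && ldc >= m);

    for (int j = 0; j < n; ++j) {
        /* Clear column j of C before accumulating into it. */
        for (int i = 0; i < m; ++i)
            c[i + (size_t)j * ldc] = 0.0;

        /* k is the length of each inner product; when k is small,
         * little arithmetic is done per element of C that is written. */
        for (int p = 0; p < k; ++p) {
            const double bpj = b[p + (size_t)j * ldb];
            for (int i = 0; i < m; ++i)
                c[i + (size_t)j * ldc] += a[i + (size_t)p * lda] * bpj;
        }
    }
}
```

Calling ref_dgemm_nn(2, 2, 3, a, 2, b, 3, c, 2) multiplies a 2 x 3 matrix by a 3 x 2 matrix, which is small in M, N, and K at once, the kind of shape the R2.8 enhancements above target.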

5
LIBWC

Write Combine Library
Uses the write-combine method of accessing memory instead of write-back. Write-combine buffers a single cache line and then writes it directly to memory, rather than first loading the line into cache or keeping it in cache for a while as write-back does.
Write Combine library for Cougar
Takes advantage of new Xeon core features
Applicable to any memory-write-bound kernel. Please contact us if you have a use for a tuned kernel of this nature.
Three versions (like libcsmath), but only available on Cougar
Unlike libcsmath, the compiler does not automatically bring in one version or another
Link with -lwc, and you must use -wc on the yod line
libwc versions of memcpy, dcopy, dzero, memset, memmove, bcopy
Designed for large (1 Mbyte) memory writes.
Other interesting kernels (these can be used by anything!)
flush_caches() (flushes the caches on one or both processors)
use_write_combine() (returns 1 if it is safe to use write combine)
touch1(array, size_of_array_in_bytes) (C and Fortran versions)
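
The guidance above (use write combine only for large writes, and only when use_write_combine() reports it is safe) can be sketched as a small dispatch routine. Everything named here is a hypothetical stand-in: the slides do not give the libwc prototypes, so the sketch takes the safety flag as a plain argument and uses memset() for both paths. The 1 Mbyte threshold is an assumption read off the slide's "large (1 Mbyte)" remark.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed cutoff, taken from the slide's "large (1 Mbyte) memory writes". */
#define WC_MIN_BYTES ((size_t)1 << 20)

/* Hypothetical stand-in for a write-combine zero fill.
 * A real libwc routine would go here; its prototype is not in the slides. */
static void fill_zero_wc_standin(void *dst, size_t nbytes)
{
    memset(dst, 0, nbytes);
}

/* Ordinary (write-back) zero fill. */
static void fill_zero_plain(void *dst, size_t nbytes)
{
    memset(dst, 0, nbytes);
}

/* Zero a buffer, choosing the write path by size and safety.
 * wc_is_safe models the value the slide says use_write_combine() returns.
 * Returns 1 if the write-combine path was chosen, 0 otherwise. */
static int zero_buffer(void *dst, size_t nbytes, int wc_is_safe)
{
    if (nbytes == 0)
        return 0;
    assert(dst != NULL);

    if (wc_is_safe && nbytes >= WC_MIN_BYTES) {
        fill_zero_wc_standin(dst, nbytes);
        return 1;
    }
    fill_zero_plain(dst, nbytes);
    return 0;
}
```

Small buffers stay on the ordinary path because they are likely to be reused from cache soon after being written, which is where write-back helps and bypassing the cache does not.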

6
LAPACK, PBLAS, ScaLAPACK

/usr/lib/scalapack or $TFLOP_XDEV/tflops/lib/scalapack
liblapack.a, libtmglib.a, libpblas.a, libscalapack.a, libtools.a, libredist.a
ScaLAPACK and PBLAS depend on BLACS or BLACS_MPI
Sample link lines
-L/usr/lib/scalapack -ltmglib -llapack
-L/usr/lib/scalapack -lscalapack -lpblas -ltools -lredist
-L/usr/lib/scalapack -lblacsF77init_MPI -lblacs_MPI -lblacsF77init_MPI -lmpi
-L/usr/lib/scalapack -lblacsCinit_MPI -lblacs_MPI -lblacsCinit_MPI -lmpi
-L/usr/lib/scalapack -lblacs -lnx
The recent BLACS integer sum bug fix is in release R2.8!

7
A new Optimization Tool

Not yet generally available; only in alpha on Janus right now.
Optimizes your (F77) subroutine by trying different optimization strategies and returning to you the assembly code corresponding to the optimal one.
You must provide greg_timer() and greg_initialize() routines and link against the library.
The routine greg_timer() calls opcode_routine() instead of the target routine to be optimized.
The application runs on Janus for anywhere from a minute to a day or more, depending on input options.

8
Example

      subroutine target_routine_to_optimize(A, B, C, M, N)
      double precision A(*), B(*), C(*), SUM1
      integer M, N, I, J
      sum1 = 0.d0
      do I = 1, M
         do J = 1, N
            sum1 = sum1 + A((I-1)*M+J) * B((I-1)*M+J)
         enddo
         C(I) = sum1
      enddo
      return
      end

/* New auxiliary routines */
#define ARRAY_SIZE 1024
int N = ARRAY_SIZE, M = ARRAY_SIZE;
double A[ARRAY_SIZE*ARRAY_SIZE];
double B[ARRAY_SIZE*ARRAY_SIZE];
double C[ARRAY_SIZE];
greg_initialize()
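
The transcript ends before the bodies of the auxiliary routines are shown. As a rough illustration of the idea only, the sketch below gives a C translation of the target routine and a generic timing harness. It assumes the loop body accumulates the product A(k)*B(k) with k = (I-1)*M+J. The names target_routine, kernel_fn, and time_kernel are hypothetical; the tool's actual greg_timer() and opcode_routine() interfaces are not given in the slides, and clock() here stands in for whatever timer is used on Janus.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* C translation of the Fortran target routine, using 0-based indexing.
 * As in the slide, the running sum is not reset between rows. */
static void target_routine(const double *a, const double *b, double *c,
                           int m, int n)
{
    double sum1 = 0.0;
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            const size_t idx = (size_t)i * m + j;  /* Fortran (I-1)*M+J */
            sum1 += a[idx] * b[idx];
        }
        c[i] = sum1;
    }
}

/* Signature shared by the target routine and any candidate variant. */
typedef void (*kernel_fn)(const double *, const double *, double *, int, int);

/* Hypothetical harness: run one candidate `reps` times and return the
 * average CPU time per call in seconds. A tool like the one described
 * would pass each generated variant through a timer of this kind. */
static double time_kernel(kernel_fn fn, const double *a, const double *b,
                          double *c, int m, int n, int reps)
{
    assert(fn != NULL && reps > 0);
    const clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        fn(a, b, c, m, n);
    const clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC / reps;
}
```

Taking the kernel as a function pointer lets the same harness time the original routine and each alternative without changes, mirroring how greg_timer() is described as calling opcode_routine() in place of the target.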
