Apple HPC Tools

1
Apple HPC Tools
  • Chris Mueller
  • July 6, 2004

2
AltiVec
  • http://developer.apple.com/hardware/ve
  • SIMD, 32 x 128-bit registers, 160 opcodes
  • G5 has 2 units: vector permute and vector ALU,
    along with a streaming prefetch unit
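
To make the 128-bit register width concrete, here is a small portable C model of the lane layouts one such register can be viewed as. The type name vec128_model and the helper lanes_per_register are invented for this illustration and are not part of AltiVec; real code would use the compiler's own vector types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit SIMD register (not an AltiVec type). */
typedef union {
    uint8_t  u8[16];   /* 16 lanes of 8 bits           */
    uint16_t u16[8];   /*  8 lanes of 16 bits          */
    uint32_t u32[4];   /*  4 lanes of 32 bits          */
    float    f32[4];   /*  4 single-precision floats   */
} vec128_model;

/* Number of lanes of a given element size (in bytes) in one register. */
static size_t lanes_per_register(size_t element_bytes)
{
    return sizeof(vec128_model) / element_bytes;
}
```

One operation on such a register therefore touches 16 bytes, 8 halfwords, or 4 words at once.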

3
AltiVec Programming
  • Compiler flag: -faltivec
  • C programming model
  • Variables: vector <type> varname
  • Functions: vec_<op>(arg, ...)
  • Caveats:
  • Data must be 16-byte aligned
  • It's WYSIWYG: the compiler won't rearrange code
  • However, the compiler will add loads/stores as
    needed within vector function calls
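
The programming model and the alignment caveat can be sketched as follows. The AltiVec path is compiled only when the compiler defines __ALTIVEC__; otherwise a scalar stand-in with the same result is used, so the sketch builds on any machine. The function names is_aligned16 and add16 are made up for this example.

```c
#include <stdint.h>
#ifdef __ALTIVEC__
#include <altivec.h>   /* GCC with -maltivec */
#endif

/* Returns 1 if p sits on a 16-byte boundary, as vec_ld/vec_st expect. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 0x0f) == 0;
}

/* out[j] = a[j] + b[j] (mod 256) for 16 bytes.
   All three pointers are assumed to be 16-byte aligned. */
static void add16(const unsigned char *a, const unsigned char *b,
                  unsigned char *out)
{
#ifdef __ALTIVEC__
    vector unsigned char va = vec_ld(0, a);   /* vector <type> varname */
    vector unsigned char vb = vec_ld(0, b);
    vec_st(vec_add(va, vb), 0, out);          /* vec_<op>(arg, ...)    */
#else
    int j;
    for (j = 0; j < 16; j++)                  /* scalar stand-in */
        out[j] = (unsigned char)(a[j] + b[j]);
#endif
}
```

With GCC, a buffer can be placed on a 16-byte boundary by declaring it with __attribute__((aligned(16))); is_aligned16 can then confirm the alignment before a vector load or store.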

4
Example
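void VecAdd(unsigned char *data, long len,
            unsigned char *result)
{
    // Diagonal sum of all the values in data
    long i = 0;
    vector unsigned char score, score1, score2, vperm, newsum;

    newsum = vec_splat_u8(0);   // create a constant

    // Step one vector (16 bytes) at a time
    for (i = 0; i < len - 16; i += 16) {
        // Load each vector. Test the address, not just the index,
        // so that unaligned input buffers are handled too.
        if (((unsigned long)(data + i) & 0x0000000f) == 0) {
            // aligned case
            score = vec_ld(0, (data + i));
        } else {
            // unaligned case: load the two aligned blocks that
            // straddle the address and permute them into place
            score1 = vec_ld(0, (data + i));
            score2 = vec_ld(16, (data + i));
            vperm  = vec_lvsl(0, (data + i));
            score  = vec_perm(score1, score2, vperm);
        }
        newsum = vec_add(score, newsum);
    }

    vec_st(newsum, 0, result);  // aligned store: result must be 16-byte aligned
    return;
}
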
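
For checking a vectorized routine like the example above, a plain C reference is handy. The sketch below rests on assumptions that should be verified against the real code: each lane wraps modulo 256 (as vec_add does for unsigned char), the loop advances one 16-byte vector per iteration, and it keeps the example's bound i < len - 16, so the final vector is skipped when len is a multiple of 16. The names diag_sum_ref and load_unaligned_model are hypothetical. The second function models what the unaligned path does with vec_lvsl and vec_perm on big-endian AltiVec.

```c
#include <string.h>

/* Scalar reference for a diagonal (per-lane) sum: result[j] accumulates
   data[i + j] for i = 0, 16, 32, ... while i < len - 16. */
static void diag_sum_ref(const unsigned char *data, long len,
                         unsigned char result[16])
{
    long i;
    int j;
    memset(result, 0, 16);
    for (i = 0; i < len - 16; i += 16)
        for (j = 0; j < 16; j++)
            result[j] = (unsigned char)(result[j] + data[i + j]);
}

/* Scalar model of the unaligned load. aligned_base is assumed to be
   16-byte aligned and offset non-negative. block0 and block1 stand for
   the two aligned loads; the selection stands for the permute. */
static void load_unaligned_model(const unsigned char *aligned_base,
                                 long offset, unsigned char out[16])
{
    const unsigned char *block0 = aligned_base + (offset & ~15L);
    const unsigned char *block1 = block0 + 16;
    int shift = (int)(offset & 15);
    int j;
    for (j = 0; j < 16; j++)
        out[j] = (shift + j < 16) ? block0[shift + j]
                                  : block1[shift + j - 16];
}
```

Comparing the 16 bytes written by the vector routine against diag_sum_ref on the same input is a quick way to catch indexing or alignment mistakes.
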
5
Performance
  • Standard sum: adds up all values in an array
  • Vector sum: performs a diagonal sum of all
    vectors in an array
  • Register sum: loads one value into a register
    and adds it to itself repeatedly
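
The two scalar kernels on this slide might look roughly like the following. This is a guess at their shape, not the code that was measured: the function names are invented, and register_sum follows one reading of "adds it to itself repeatedly" (a single loaded value accumulated n times, so the loop itself touches no memory).

```c
#include <stddef.h>

/* Standard sum: one load and one add per element. */
static unsigned long standard_sum(const unsigned char *data, size_t len)
{
    unsigned long sum = 0;
    size_t i;
    for (i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Register sum: load one value once, then add it n times. */
static unsigned long register_sum(const unsigned char *data, size_t n)
{
    unsigned long value = data[0];
    unsigned long sum = 0;
    size_t i;
    for (i = 0; i < n; i++)
        sum += value;
    return sum;
}
```

An optimizing compiler may collapse register_sum into a single multiplication, so a real benchmark would need to guard against that, for example by inspecting the generated assembly.
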

6
Shark
  • /Developer/Applications/Performance Tools/CHUD/Shark
  • Apple's main (high-)performance measurement tool
  • Samples all running applications, reports
    hotspots, and offers optimization suggestions
  • Nifty source/asm viewer
  • Compile with: -g -DCOPY_PHASE_STRIP=NO

7
amber/simg5/scrollpv
  • Cycle accurate trace (amber), simulation (simg5)
    and visualization (scrollpv)
  • Usage:
      amber -I -x 5000 <program>
      simg5 thread_001.tt6e 5000 100 1 simg5 -p 1 -b 1 -e 5000
      scrollpv -pipe trace_001.pipe -config trace_001.config
  • Warning: amber turns off one CPU. Killing amber
    prematurely can leave you with only one CPU!