Stanford Streaming Supercomputer (SSS) Project Meeting

About This Presentation

Slides: 19

Provided by: william507

Learn more at: http://graphics.stanford.edu

Transcript and Presenter's Notes

1
Stanford Streaming Supercomputer (SSS) Project Meeting

Bill Dally, Pat Hanrahan, and Ron Fedkiw
Computer Systems Laboratory
Stanford University
October 2, 2001

2
Agenda

Introductions (now)
Vision: subset of ASCI review slides
Goals for the quarter
Schedule of meetings for the quarter

3
Computation is inexpensive and plentiful
NVIDIA GeForce3: 80 Gflops/sec, 800 Gops/sec
Velio VC3003: 1 Tb/s I/O BW
DRAM: < $0.20/MB
4
But supercomputers are very expensive

Cost more per GFLOPS, GUPS, and GByte than low-end machines
Hard to achieve a high fraction of peak performance on global problems
Based on clusters of CPUs that are scaling at only 20%/year vs. 50% historically

5
Microprocessors no longer realize the potential of VLSI
[Figure: growth-rate curves. Labels: 52%/year, 19%/year, 74%/year; gaps of 30:1, 1,000:1, and 30,000:1]
6
Streaming processors leverage emerging technology

Streaming supercomputer can achieve
  $20/GFLOPS, $2/M-GUPS
  Scalable to PFLOPS and 10^13 GUPS
Enabled by
  Stream architecture
    Exposes and exploits parallelism and locality
    High arithmetic intensity (ops/BW)
    Hides latency
  Efficient interconnection networks
    High global bandwidth
    Low latency
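
The "arithmetic intensity (ops/BW)" bullet can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the 3x3 convolution, and the word counts are assumptions chosen for the example, not figures from the slides. It shows how keeping a neighbourhood in local storage raises the ratio of operations to memory traffic.

```python
def arithmetic_intensity(ops: float, words_moved: float) -> float:
    """Arithmetic operations performed per word moved to or from memory."""
    if words_moved <= 0:
        raise ValueError("words_moved must be positive")
    return ops / words_moved


# Illustrative numbers (assumed, not from the slides): a 3x3 convolution
# costs 9 multiplies + 8 adds = 17 ops per output pixel.
OPS_PER_PIXEL = 9 + 8

# No local reuse: fetch all 9 neighbours and store 1 result per pixel.
no_reuse = arithmetic_intensity(OPS_PER_PIXEL, words_moved=9 + 1)

# Neighbours kept in local storage: 1 new word in, 1 result out per pixel.
with_reuse = arithmetic_intensity(OPS_PER_PIXEL, words_moved=1 + 1)

print(f"ops/word without reuse: {no_reuse:.2f}")
print(f"ops/word with reuse:    {with_reuse:.2f}")
```

Under these assumed counts the same kernel does five times as much arithmetic per word of bandwidth once its inputs are reused locally, which is the effect the stream architecture is meant to exploit.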

7
What is stream processing?
Operations within a kernel operate on local data
Kernels can be partitioned across chips to exploit control parallelism
[Figure: stream diagram. Image 0 -> convolve -> convolve and Image 1 -> convolve -> convolve both feed SAD, which produces the Depth Map]
Streams expose data parallelism
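
The pipeline in the figure (two convolve kernels per image, then SAD, then a depth map) might be sketched as chained Python generators. This is a minimal sketch under stated assumptions, not the project's implementation: images are lists of 1-D scanlines, the 3-tap filter is arbitrary, and the names `convolve`, `sad`, and `depth_map` are hypothetical. Each kernel touches only the record it is handed, and records flow between kernels as streams.

```python
from typing import Iterable, Iterator, List, Sequence

Row = List[float]


def convolve(rows: Iterable[Row], taps: Sequence[float]) -> Iterator[Row]:
    """Kernel: filter each scanline using only that scanline's data."""
    half = len(taps) // 2
    for row in rows:
        n = len(row)
        out = []
        for i in range(n):
            acc = 0.0
            for k, t in enumerate(taps):
                j = min(max(i + k - half, 0), n - 1)  # clamp at the edges
                acc += t * row[j]
            out.append(acc)
        yield out


def sad(left_rows: Iterable[Row], right_rows: Iterable[Row],
        max_disparity: int, window: int = 1) -> Iterator[List[int]]:
    """Kernel: per pixel, keep the disparity with the smallest
    sum of absolute differences over a small window."""
    for left, right in zip(left_rows, right_rows):
        n = len(left)
        depth = []
        for i in range(n):
            best_d, best_cost = 0, float("inf")
            for d in range(max_disparity + 1):
                cost = 0.0
                for w in range(-window, window + 1):
                    a = min(max(i + w, 0), n - 1)
                    b = min(max(i + w - d, 0), n - 1)
                    cost += abs(left[a] - right[b])
                if cost < best_cost:
                    best_d, best_cost = d, cost
            depth.append(best_d)
        yield depth


def depth_map(image0: List[Row], image1: List[Row],
              taps: Sequence[float] = (0.25, 0.5, 0.25),
              max_disparity: int = 3) -> List[List[int]]:
    """Wire the kernels together as in the slide's diagram."""
    s0 = convolve(convolve(image0, taps), taps)
    s1 = convolve(convolve(image1, taps), taps)
    return list(sad(s0, s1, max_disparity))


if __name__ == "__main__":
    right = [[0, 0, 1, 5, 1, 0, 0, 0, 0, 0]]
    left = [[0, 0, 0, 0, 1, 5, 1, 0, 0, 0]]  # same feature, shifted by 2
    print(depth_map(left, right))
```

Because every scanline is independent, the rows could be processed in parallel (data parallelism), and each generator stage could in principle run on a different chip (control parallelism).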
8
Why does it get good performance easily?
9
Architecture of a Streaming Supercomputer
10
Streaming processor
11
A layered software system simplifies stream programming
12
Domain-specific language example: Marble shader in RTSL

    // Sum four octaves of noise, halving the amplitude at each octave.
    float turbulence4_imagine_scalar (texref noise, float4 pos)
    {
        fragment float4 addr1 = pos;
        fragment float4 addr2 = pos * {2, 2, 2, 1};
        fragment float4 addr3 = pos * {4, 4, 4, 1};
        fragment float4 addr4 = pos * {8, 8, 8, 1};
        fragment float val;
        val = (0.5) * texture(noise, addr1)[0];
        val = val + (0.25) * texture(noise, addr2)[0];
        val = val + (0.125) * texture(noise, addr3)[0];
        val = val + (0.0625) * texture(noise, addr4)[0];
        return val;
    }

    // Map a scalar to an RGB marble color.
    float3 marble_color(float x)
    {
        float x2;
        x = sqrt(x + 1.0) * .7071;
        x2 = sqrt(x);
        return { .30 + .6 * x2, .30 + .8 * x, .60 + .4 * x2 };
    }

    // Surface shader: marble color modulated by diffuse light, plus specular.
    surface shader float4 shiny_marble_imagine (texref noise)
    {
        float4 Cd = lightmodel_diffuse({ 0.4, 0.4, 0.4, 1 }, { 0.5, 0.5, 0.5, 1 });
        float4 Cs = lightmodel_specular({ 0.35, 0.35, 0.35, 1 }, Zero, 20);
        fragment float y;
        fragment float4 pos = Pobj * {10, 10, 10, 1};
        y = pos[1] + 3.0 * turbulence4_imagine_scalar(noise, pos);
        y = sin(y * pi);
        return ({marble_color(y), 1.0f} * Cd + Cs);
    }
13
Stream-level application description example: SHARP Raytracer
[Figure: stream diagram of the raytracer. Labels: Camera, Grid, Triangles, Rays, VoxID, Hits, Pixels]

Computation expressed as streams of records passing through kernels
Similar to computation required for Monte-Carlo radiation transport
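
The idea of "streams of records passing through kernels" can be sketched as follows. This is a hedged illustration, not the SHARP raytracer: the record types `Ray` and `Hit`, the kernels `camera`, `intersect`, and `shade`, and the one-scanline orthographic camera are all hypothetical, and the Grid/VoxID traversal stage from the diagram is omitted (every ray is tested against every triangle).

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Sequence, Tuple

Vec = Tuple[float, float, float]
Triangle = Tuple[Vec, Vec, Vec]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


@dataclass(frozen=True)
class Ray:            # record type in the Rays stream
    pixel: int
    origin: Vec
    direction: Vec


@dataclass(frozen=True)
class Hit:            # record type in the Hits stream
    pixel: int
    triangle: Optional[int]   # None means the ray missed everything


def camera(width: int) -> Iterator[Ray]:
    """Kernel: emit one ray per pixel of a single scanline, looking down -z."""
    for px in range(width):
        x = (px + 0.5) / width * 2.0 - 1.0
        yield Ray(px, (x, 0.0, 1.0), (0.0, 0.0, -1.0))


def hit_distance(ray: Ray, tri: Triangle) -> Optional[float]:
    """Moller-Trumbore ray/triangle test; returns the hit distance or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(ray.direction, e2)
    det = dot(e1, p)
    if abs(det) < 1e-12:          # ray is parallel to the triangle
        return None
    t_vec = sub(ray.origin, v0)
    u = dot(t_vec, p) / det
    q = cross(t_vec, e1)
    v = dot(ray.direction, q) / det
    if u < 0 or v < 0 or u + v > 1:
        return None
    t = dot(e2, q) / det
    return t if t > 0 else None


def intersect(rays: Iterable[Ray], triangles: Sequence[Triangle]) -> Iterator[Hit]:
    """Kernel: turn each Ray record into a Hit record (nearest triangle)."""
    for ray in rays:
        best, best_t = None, float("inf")
        for i, tri in enumerate(triangles):
            t = hit_distance(ray, tri)
            if t is not None and t < best_t:
                best, best_t = i, t
        yield Hit(ray.pixel, best)


def shade(hits: Iterable[Hit]) -> Iterator[int]:
    """Kernel: turn each Hit record into a pixel value (white on hit)."""
    for h in hits:
        yield 0 if h.triangle is None else 255


if __name__ == "__main__":
    tri: Triangle = ((-2.0, -1.0, 0.0), (0.0, -1.0, 0.0), (-1.0, 1.0, 0.0))
    print(list(shade(intersect(camera(4), [tri]))))
```

Each kernel consumes one record at a time and needs no state from other records, so the same structure would apply to other particle-style workloads where independent records pass through a fixed sequence of steps.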

14
Expected application performance

Arithmetic-limited applications
  Includes applications where domain decomposition can be applied
    Like TFLO and LES
  Expected to achieve a large fraction of peak performance
Communication-limited applications
  Such as applications requiring matrix solution Ax = b
  At the very least will benefit from high global bandwidth
  We hope to find new methods to solve matrix equations using streaming

15
Conclusion

Computation is cheap yet supercomputing is expensive
Streams enable supercomputing to exploit advantages of emerging technology
  by exposing locality and concurrency
Order of magnitude cost/performance improvement for both arithmetic-limited and communication-limited codes
  $20/GFLOPS and $2/M-GUPS
  Scalable from desktop (1 TFLOPS) to machine room (1 PFLOPS)
A layered software system using domain-specific languages simplifies stream programming
  MCRT, ODEs, PDEs
Early results on graphics and image processing are encouraging
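
To give a sense of scale for the cost target, the arithmetic can be worked through in a few lines. This sketch reads the slide's 20/GFLOPS figure as 20 dollars per GFLOPS and assumes cost scales linearly with peak performance; both are assumptions made for illustration, and the function name is hypothetical.

```python
GFLOPS_PER_TFLOPS = 1_000
GFLOPS_PER_PFLOPS = 1_000_000


def system_cost(peak_gflops: float, dollars_per_gflops: float = 20.0) -> float:
    """Cost in dollars, assuming it grows linearly with peak GFLOPS."""
    return peak_gflops * dollars_per_gflops


desktop = system_cost(1 * GFLOPS_PER_TFLOPS)        # 1 TFLOPS desktop
machine_room = system_cost(1 * GFLOPS_PER_PFLOPS)   # 1 PFLOPS machine room

print(f"1 TFLOPS desktop:      ${desktop:,.0f}")
print(f"1 PFLOPS machine room: ${machine_room:,.0f}")
```

Under these assumptions the desktop end of the range comes to $20,000 and the machine-room end to $20 million.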

16
Plan for AY2001-2002
17
Project Goals for Fall Quarter AY2001-2002

Map two applications to the stream model
  Fluid flow (TFLO) and molecular dynamics are candidates
Define a high-level stream programming language
  Generalize stream access without destroying locality
Draft strawman SSS architecture and identify key issues

18
Meeting Schedule Fall Quarter AY2001-2002

Goal: shared knowledge base and vision across the project
10/9 TFLO (Juan)
10/16 RTSL (Bill M.)
10/23 Molecular Dynamics (Eric)
10/30 Imagine and its programming system (Ujval)
11/6 C, ZPL, etc.; SPL brainstorming (Ian)
11/13 Metacompilation (Ben C.)
11/20 Application followup (Ron/Heinz)
11/27 Strawman architecture (Ben S.)
12/4 Streams vs. CMP (Blue Gene/Light, etc.) (Bill D.)
