Title: A High-Performance Scalable Graphics Architecture
1. A High-Performance Scalable Graphics Architecture
- Daniel R. McLachlan
- Director, Advanced Graphics Engineering
- SGI
2. Growth in Model Sizes
Source: Gartner
Images courtesy of Parametric Technology Corporation, Photodisc, and Magic Earth, LLC
3. Problems Are Getting Increasingly Complex Over Time
4. The Complexity of the Simple
Potato Chips
Diapers
Images courtesy of Procter & Gamble
5. Graphics Cards Are Outpacing PC Architecture and Bandwidth
Graph based on relative scale.
6. Addressing Real Needs
Visualization
- Extreme resolution
- Absolute visual quality
- VAN
- Solving complex problems
- Dense data sets
Graphics Clusters
- Low cost
- Fast simple polygons
- Single screen image quality
Chart axes: Performance; 1992 to 2003
Visualization Breaks The Cognitive Barrier For
Better Decisions
Images courtesy of Advantage CFD; SCI Institute; NLM; Theoretical Biophysics Group of the Beckman Institute at UIUC; Laboratory for Atmospheres, NASA Goddard Space Flight Center; Donghoon Shin, Art Center College of Design; Nvidia Corporation; ATI Technologies, Inc.; and Nintendo Co., Ltd.
7. Cluster Comparison
- Cons
  - Cumbersome to program
  - High administration costs
  - Few applications for visualization
  - Difficult to scale for large problems
  - Difficult to dynamically load balance
  - Lack of software productivity tools
  - Often requires data replication
  - Reliability
  - Limited to 2 GB memory space
- Pros
  - Cheap
  - Industry standard
  - High display list performance
  - Good for embarrassingly parallel problems
  - Can potentially scale to 1000s of processors
8. The Benefits of Shared Memory
Diagram: Traditional Clusters vs. SGI NUMAflex
Diagram labels: fast NUMAflex interconnect; global shared memory; repeated node/OS boxes; 1-2 CPUs per node; < 64 CPUs per node
- What is shared memory?
  - All nodes operate on one large shared memory space, instead of each node having its own small memory space
- Shared memory is high-performance
  - All nodes can access one large memory space efficiently, so complex communication and data passing between nodes aren't needed
  - Big data sets fit entirely in memory; less disk I/O is needed
- Shared memory is cost-effective and easy to deploy
  - It requires less memory per node, because large problems can be solved in big shared memory
  - Simpler programming means lower tuning and maintenance costs
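The contrast between operating on one shared memory space and passing data between nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python threads stand in for nodes, and the function names (`shared_memory_sum`, `message_passing_sum`) are hypothetical, not part of any SGI or NUMAflex API.

```python
from concurrent.futures import ThreadPoolExecutor


def _bounds(n_items, workers, i):
    """Start and end index of the i-th of `workers` contiguous partitions."""
    return n_items * i // workers, n_items * (i + 1) // workers


def shared_memory_sum(data, workers=4):
    """Shared-memory style: every worker indexes the same list in place."""
    def partial(i):
        start, end = _bounds(len(data), workers, i)
        return sum(data[j] for j in range(start, end))  # reads, never copies

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial, range(workers)))


def message_passing_sum(data, workers=4):
    """Cluster style: each node first receives its own copy of a partition."""
    messages = []
    for i in range(workers):
        start, end = _bounds(len(data), workers, i)
        messages.append(list(data[start:end]))  # explicit copy = data passing
    items_copied = sum(len(m) for m in messages)
    total = sum(sum(m) for m in messages)
    return total, items_copied
```

Both functions return the same total, but the cluster-style version also reports how many items had to be copied before any work could start. That copy step is the data passing and replication the slide says shared memory avoids.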
9. How SGI Onyx Enables the Role: System at a Glance
Scalable Graphics I/O
Scalable Interaction
Appropriate Delivery
Scalable Data
SGI Onyx
Compositor/Network
Scalable Compute and Large Memory
Large Data Sets
Scalable Graphics
Scalable Disk I/O
Scalable Resolution
Scalable Rendering
10. Silicon Graphics Onyx4 UltimateVision: Changing the Application Paradigm
- Moving from a fixed rendering path (geometry) to a scalable and programmable rendering path (application accelerators)
Images courtesy of Pratt and Whitney Canada and
Magic Earth, LLC
11. Scaling: A Shift in Pipe Paradigm
1. Screen-based decomposition
2. Eye-based decomposition
3. Time-based decomposition
4. Data-based decomposition
Even more powerful in combination: all modes can be used separately or combined in any number of ways
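The four decomposition modes can each be read as a rule for dividing work among graphics pipes. The sketch below is a minimal illustration under that reading. The function names and the specific splitting rules (vertical strips, alternating eyes, round-robin frames, contiguous data slices) are assumptions chosen for clarity, not the Onyx4 implementation.

```python
def screen_tiles(width, height, pipes):
    """Screen-based: split the viewport into vertical strips, one per pipe.

    Returns (x, y, strip_width, strip_height) tuples.
    """
    edges = [width * i // pipes for i in range(pipes + 1)]
    return [(edges[i], 0, edges[i + 1] - edges[i], height) for i in range(pipes)]


def eye_assignment(pipes):
    """Eye-based: alternate pipes between the left and right stereo views."""
    return ["left" if i % 2 == 0 else "right" for i in range(pipes)]


def frame_owner(frame, pipes):
    """Time-based: successive frames are dealt round-robin to the pipes."""
    return frame % pipes


def data_slices(n_items, pipes):
    """Data-based: each pipe renders one contiguous slice of the data set."""
    edges = [n_items * i // pipes for i in range(pipes + 1)]
    return [range(edges[i], edges[i + 1]) for i in range(pipes)]
```

For example, `screen_tiles(1280, 1024, 4)` yields four 320-pixel-wide strips. Because each rule is independent of the others, they can be combined, for instance by applying `screen_tiles` within each eye's group of pipes.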
Visible Human public data set
Data courtesy of DaimlerChrysler; images courtesy of MAK
12. Compositor Flexibility
Multi-Tier Composition: Composite the output of multiple compositors (e.g., first layer does 2D composition, second layer does anti-aliasing)
Visual Serving: Composited output sent to workstations for viewing and/or editing
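The multi-tier example (a first layer doing 2D composition, a second doing anti-aliasing) can be sketched with plain nested lists standing in for images. This is a toy model under stated assumptions: `compose_2d` and `antialias` are hypothetical names, images are rows of grayscale values, and anti-aliasing is modeled as a per-pixel average of several full frames.

```python
def compose_2d(left, right):
    """Tier 1: 2D composition, joining two tiles side by side, row by row."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def antialias(frames):
    """Tier 2: average several same-sized full frames pixel by pixel."""
    n = len(frames)
    height = len(frames[0])
    width = len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]


# Four pipes: a/b render the two halves of sample 0, c/d those of sample 1.
a, b = [[0, 0]], [[4, 4]]
c, d = [[2, 2]], [[8, 8]]
tier1 = [compose_2d(a, b), compose_2d(c, d)]  # two full frames
final = antialias(tier1)                      # one anti-aliased frame
```

Here the output of two first-tier compositors feeds a single second-tier compositor, which is the layering the slide describes.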
13. Silicon Graphics Onyx4 UltimateVision System Architecture
Diagram labels: CPU (x4); Memory Controller; 8GB RAM; 2 Graphics Pipes; Standard I/O or 2 Graphics Pipes; Optional; SGI NUMA scalability
14. Conclusion
- Silicon Graphics Onyx4 UltimateVision
  - Solving bigger and more complex problems
- World's most scalable visualization system
  - Up to 32 GPUs in an SSI architecture
- World-leading computational capability
  - Up to 64 CPUs per node, scalable to 1024 processors
- Solves system bandwidth limitations of PCs and clusters
  - Up to 8 NUMAlink 3 connections to a single shared memory pool
- New-generation programmable graphics architecture
  - OpenGL Shading Language