Reality Engine - PowerPoint PPT Presentation

About This Presentation
Title:

Reality Engine

Description:

The University of North Carolina at Chapel Hill.

Provided by: anselmo9

Transcript and Presenter's Notes

Title: Reality Engine


1
Reality Engine
  • Anselmo Lastra
  • COMP290-052

2
Topics
  • Look at RE pipeline
  • Examine the data that flows between stages
  • Look at bandwidths

3
Reality Engine
  • First OpenGL machine
  • Change from proprietary IrisGL
  • Fastest commercial machine of its time
  • Generations
  • Akeley's definitions
  • Begins at flat-shaded polygons

4
Design Goals
  • ½ million textured tris/sec
  • Mip-mapped textures
  • Antialiasing
  • High fill rate
  • Work as well as VGX on 2nd gen work

5
Three types of Boards
  • Geometry board
  • 6, 8, 12 geometry processors
  • Raster memory board
  • 1, 2, 4 boards to increase fill rate
  • Or antialiasing capability
  • Display/video generator

6
Command Processor
  • Controls work sent to geometry engines
  • Broadcasts some state info
  • Sends tris to a particular GE
  • Round-robin assignment
  • Static load balancing
  • Sizes of primitives (t-strips) sent to GEs are
    important
  • Primitive ordering
  • FIFOs between stages
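
The round-robin assignment described on this slide can be sketched as below. This is an illustrative model only: the function name `assign_round_robin`, the list-based FIFOs, and the sequence numbers are assumptions made for the example, not the Reality Engine's actual interface.

```python
def assign_round_robin(primitives, num_engines):
    """Distribute primitives to geometry engines in strict rotation.

    Hypothetical model: each engine gets a FIFO (a list) that preserves
    submission order. The sequence number is kept so that primitive
    ordering could be restored downstream.
    """
    fifos = [[] for _ in range(num_engines)]
    for seq, prim in enumerate(primitives):
        fifos[seq % num_engines].append((seq, prim))
    return fifos


strips = ["strip%d" % i for i in range(10)]
fifos = assign_round_robin(strips, 6)
```

Because the assignment is static, the engines stay balanced only when the primitives cost about the same to process, which is one way to read the slide's remark that primitive sizes matter.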

7
Geometry Engines
  • Intel i860
  • RISC, pipelined FP
  • All polygons converted to tris
  • Single precision computation
  • Typical
  • Broadcasts data for rasterization
  • Note also path to load code into DRAM

8
Fragment Generators
  • Custom ASICs
  • Each a portion of frame buffer
  • Interleaved
  • Pipeline with several fragments in flight
  • Tasks: z and color for center, coverage mask,
    texture addresses, texture lookup, final color
    computation (blend, fog)
  • Care to keep color in triangle (not always
    center)
  • Talk about fragment generation / rasterization
    later

9
Subpixel mask
  • Fixed 8x8 grid
  • Select 4, 8, or 16 samples on grid
  • Compute coverage of samples
  • Only one depth and texture coord chosen
  • Depths expanded later from dX, dY
  • Color at center also
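
A coverage mask over samples chosen on the 8x8 grid can be sketched with half-plane edge tests. The sample positions in `SAMPLES_4`, the counter-clockwise vertex order, and the choice to sample at the center of each subpixel cell are all assumptions made for illustration; the slide does not give the hardware's actual sample pattern.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: positive when (px, py) is to the left of edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def coverage_mask(tri, pixel_x, pixel_y, samples):
    """Return a bitmask with bit i set when sample i lies inside the triangle.

    tri: three (x, y) vertices in counter-clockwise order, in pixel units.
    samples: (col, row) cells on the 8x8 subpixel grid (hypothetical layout).
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    mask = 0
    for i, (col, row) in enumerate(samples):
        # Sample at the center of the chosen subpixel cell.
        px = pixel_x + (col + 0.5) / 8.0
        py = pixel_y + (row + 0.5) / 8.0
        inside = (edge(x0, y0, x1, y1, px, py) >= 0 and
                  edge(x1, y1, x2, y2, px, py) >= 0 and
                  edge(x2, y2, x0, y0, px, py) >= 0)
        if inside:
            mask |= 1 << i
    return mask


SAMPLES_4 = [(1, 1), (5, 1), (1, 5), (6, 6)]  # assumed pattern, 4 of 64 cells
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]    # covers the lower-left half of pixel (0, 0)
mask = coverage_mask(tri, 0, 0, SAMPLES_4)
```

In this example the first three samples fall inside the triangle and the fourth does not, so `mask` has its three low bits set.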

10
Texturing
  • Texture replicated at each fragment processor
  • 5-20 times
  • Eight DRAM chips
  • One for each mipmap sample

11
Image Engine
  • 16 per fragment generator
  • One DRAM each
  • Each computes depth at subpixel covered by
    fragment
  • Bits/pixel depends on number of boards and display
    resolution (256, 512, 1024)
  • 12 bits / color component
  • 32 bit depth

12
Display Board
  • Display color computed by image engines every
    fragment
  • OpenGL has no explicit end-of-frame
  • 50MHz single bit paths to board from each image
    engine
  • Color maps, etc.

13
Antialiasing
  • Alpha
  • Coverage on 8x8 grid computed
  • Ordering must be observed

14
Multisample antialiasing
  • Point sampling
  • Including accurate edges
  • Not always good representation of actual area
  • Area sampling
  • Can produce artifacts
  • Screen-door transparency
  • Alpha to coverage
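
Alpha to coverage, the mechanism behind screen-door transparency, can be sketched as a conversion from alpha to a number of enabled samples. The function below is a minimal illustration under stated assumptions: it sets the low-order bits, whereas a practical implementation would likely scatter or dither the enabled samples to avoid visible patterns.

```python
def alpha_to_coverage(alpha, num_samples):
    """Turn an alpha in [0, 1] into a mask with about alpha * n bits set."""
    alpha = min(max(alpha, 0.0), 1.0)          # clamp out-of-range input
    covered = int(alpha * num_samples + 0.5)   # round to nearest sample count
    return (1 << covered) - 1                  # low bits set (no dithering here)


half = alpha_to_coverage(0.5, 8)
```

The resulting mask would be ANDed with the geometric coverage mask, so a half-transparent fragment updates only about half of the samples it covers.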

15
Texturing
  • Default is 16-bits / texel
  • Because of bandwidth issues
  • Can increase to 32 or 48 bits

16
Clipping
  • FIFOs even out load
  • MIMD better for clipping
  • SIMD must execute wasted cycles to compute both
    if and else
  • Far and near planes
  • Also less clipping because rasterizers scissor
    the primitives

17
Antialiasing
  • Single pass
  • Multisample
  • vs. A-Buffer
  • Makes case for utility of supersampling as
    opposed to multi-pass
  • Less overall hardware
  • Transform/texture only once

18
Triangle Bus
  • Argues that doing sort before fragment engines
    better than after
  • Compares to ES
  • Notes PixelFlow frame latency
  • Let's defer discussion until we talk about
    sorting classification of parallel rendering

19
Commodity DRAM
  • Many other machines used specialized video RAM
  • RE stole memory cycles to refresh display
  • Commodity parts lowered cost of RE

20
Data Flow of RE
  • What flows between stages?

21
Bandwidths
  • Compute bandwidths (at dots) required to render a
    million 50-pixel triangles
  • We'll make assumptions, such as all visible

22
Retained vs. Immediate
  • Retained
  • Potentially better performance when bandwidth to
    host a problem
  • Difficult if much of model is edited
  • Immediate
  • Add-ons to cache data on GPU
  • Often have scene graph anyway

23
Geometry
  • Often vertices treated independently
  • Primitives reassembled
  • Parallelism at fine or coarse scale
  • Vertex 20-30 bytes
  • About 50-100 Flops/vertex
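
The per-vertex figures above translate into a range of input bandwidth and arithmetic load for the million-triangle example. The vertices-per-triangle values are assumptions: roughly 1 for long triangle strips and 3 for independent triangles.

```python
def geometry_load(tris_per_sec, verts_per_tri, bytes_per_vert, flops_per_vert):
    """Return (bytes/s into the geometry stage, FLOP/s needed)."""
    verts = tris_per_sec * verts_per_tri
    return verts * bytes_per_vert, verts * flops_per_vert


low = geometry_load(1_000_000, 1, 20, 50)    # long strips, low end of both ranges
high = geometry_load(1_000_000, 3, 30, 100)  # independent tris, high end of both ranges
```

Under these assumptions the load runs from 20 MB/s and 50 MFLOPS up to 90 MB/s and 300 MFLOPS.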

24
Rasterization/Fragment
  • Output is pixels × triangles
  • Standard then - 50-100 pixel tris
  • Total about 50M fragments for our example

25
Texturing
  • Per fragment cost is eight 32-bit texels
  • About 1.6GB / million tris
  • Can use compression here
  • Modern systems use caching
  • How fast a memory would we need for 10 M tris?
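
The texture figure follows from the fragment count: one million 50-pixel triangles give 50M fragments, each fetching eight 4-byte texels. The sketch below reproduces that arithmetic and scales it to the 10M-triangle question, assuming no caching or compression.

```python
def texture_bandwidth(tris_per_sec, pixels_per_tri=50,
                      texels_per_fragment=8, bytes_per_texel=4):
    """Texture bytes fetched per second with no caching or compression."""
    fragments = tris_per_sec * pixels_per_tri
    return fragments * texels_per_fragment * bytes_per_texel


bw_1m = texture_bandwidth(1_000_000)    # 1.6e9 bytes/s
bw_10m = texture_bandwidth(10_000_000)  # 16e9 bytes/s
```

Under the same assumptions, ten million triangles per second would need about 16 GB/s of texture memory bandwidth, which shows why caching and compression matter.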

26
Frame Buffer Bandwidth
  • Assume z of 32 bits, color 4 bytes
  • Could go with z of 24 bits
  • Must read Z for every fragment
  • 200 MB/s for Z
  • Must write some fraction
  • Say ½
  • 400 MB/s for Z and color
  • Total 600 MB/s
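
A small model makes it easy to vary the assumptions on this slide. The function below is a sketch with hypothetical parameter names; it reports the Z-read term and the write term separately so that different write fractions can be compared with the figures above.

```python
def framebuffer_bandwidth(fragments_per_sec, z_bytes=4, color_bytes=4,
                          write_fraction=0.5):
    """Return (Z read bytes/s, Z+color write bytes/s)."""
    z_read = fragments_per_sec * z_bytes
    writes = fragments_per_sec * write_fraction * (z_bytes + color_bytes)
    return z_read, writes


z_read, writes_half = framebuffer_bandwidth(50_000_000)
_, writes_all = framebuffer_bandwidth(50_000_000, write_fraction=1.0)
```

With 50M fragments per second, the Z read term is 200 MB/s. In this model the write term is 200 MB/s at a write fraction of ½ and 400 MB/s when every fragment writes both Z and color.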

27
Frame Buffer Clear
  • Tricks to avoid clearing
  • Example: one indicator bit cleared at start of
    frame
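
One way such an indicator bit could work is sketched below. The slide does not describe the scheme in detail, so the per-tile granularity, the class name `LazyClearBuffer`, and the method names are all assumptions made for illustration.

```python
class LazyClearBuffer:
    """Depth buffer that clears one flag per tile instead of every pixel."""

    def __init__(self, num_tiles, tile_size, far=1.0):
        self.far = far
        self.tile_size = tile_size
        self.valid = [False] * num_tiles
        self.depth = [[far] * tile_size for _ in range(num_tiles)]

    def begin_frame(self):
        # Cheap clear: reset the flags only; stale depths stay in memory.
        self.valid = [False] * len(self.valid)

    def read(self, tile, i):
        # A tile not yet touched this frame reads as the clear value.
        return self.depth[tile][i] if self.valid[tile] else self.far

    def write(self, tile, i, z):
        if not self.valid[tile]:
            # First touch this frame: initialize the tile, then mark it valid.
            self.depth[tile] = [self.far] * self.tile_size
            self.valid[tile] = True
        self.depth[tile][i] = z
```

The cost of clearing moves from every pixel at the start of the frame to only those tiles that are actually drawn into.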

28
Scanout Bandwidth
  • 3 bytes/pixel
  • 1024x1024, approx. 1M pixels
  • 60 Hz
  • About 180 MB/s
  • 1280x1024, 72Hz, 280 MB/s
  • Can go over 500 MB/s for HDTV at 72Hz
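
The scanout figures are the product of resolution, bytes per pixel, and refresh rate. The sketch below reproduces the first two cases exactly; the slide rounds 1024x1024 down to one million pixels, so its numbers are slightly lower.

```python
def scanout_bandwidth(width, height, refresh_hz, bytes_per_pixel=3):
    """Bytes per second read from the frame buffer to refresh the display."""
    return width * height * bytes_per_pixel * refresh_hz


bw_a = scanout_bandwidth(1024, 1024, 60)  # about 189 million bytes/s
bw_b = scanout_bandwidth(1280, 1024, 72)  # about 283 million bytes/s
```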

29
Modeling
  • We have found spreadsheet performance model
    useful
  • Can explore what-if
  • Functional model in HLL
  • Gate-level model in HLL or Verilog

30
Additional Reading
  • The architecture of the pipeline
  • Mark Segal and Kurt Akeley, The design of the
    OpenGL graphics interface

31
Next Time
  • Rasterization
  • Pineda, "A parallel algorithm for polygon
    rasterization", SIGGRAPH 88, and
  • Olano, Marc and Trey Greer, "Triangle Scan
    Conversion Using 2D Homogeneous Coordinates",
    Proceedings of the 1997 SIGGRAPH/Eurographics
    Workshop on Graphics Hardware.
  • Supplementary reading: Joel McCormack, Robert
    McNamara, "Tiled polygon traversal using
    half-plane edge functions", Graphics Hardware
    2000.  The Pineda paper presents a method to
    decide whether a fragment is in a triangle, but
    only sketches ways to find the candidate
    fragments.  This paper describes a method to
    traverse triangles in tile order (good for
    caching).