Student - PowerPoint PPT Presentation

About This Presentation
Title:

Student

Slides: 27
Provided by: Pari155
Category:
Tags: student


Transcript and Presenter's Notes

1
Hardware Accelerator for Ray Tracing
Student: Parikshit Patidar
Supervisors: Prof. Anshul Kumar, Dr. Prem Kalra
PhD involved: Anup Gangwar
2
Contents
  • Introduction
  • Objective and Motivation
  • Scope
  • Ray tracing and RayLab
  • Hardware Accelerator and its performance
    estimation
  • Memory wrapper
  • Accelerator
  • Work done
  • Work to be done

3
Introduction
  • Ray-tracing is a rendering technique that
    calculates an image of a scene by simulating the
    way rays of light travel in the real world.
  • A hardware accelerator for ray tracing is a
    dedicated device that speeds up calculation of
    the final image by performing the
    computation-intensive tasks in parallel, and
    provides an interface for programs that use it.

4
Objective
  • The objective of the project is to design
    hardware that speeds up computation of the
    final image by implementing all the
    computation-intensive routines, and to provide
    an interface to it.

5
[Block diagram: Accelerator, Host CPU, PCI Interface, Memory]
6
Design
[Diagram: RayLab ray tracer, Interface (software interface, wrapper), Accelerator]
7
Motivation
  • Ray tracing is the most powerful photorealistic
    image generation technique
  • Simple and easy to implement in hardware
  • Can be easily extended to add effects such as
    shadows and texture mapping
  • Ray tracing is amenable to massive parallelism,
    given the proper architecture.

8
Scope
  • The project mainly deals with optimizing an
    existing ray tracer, RayLab. RayLab has the
    following capabilities:
  • Primitive objects: sphere, cone, cylinder,
    parallelepiped, torus, parallelogram, triangle,
    ellipsoid, etc.
  • Constructive solid geometry: builds arbitrary
    shapes by combining basic objects.
  • Light sources: area and point light sources.
  • Advanced effects: soft shadows, anti-aliasing,
    reflection, refraction, transparency, glossy
    effects, etc.

9
Scope (contd.)
  • The ADM-XRC supports high-performance PCI
    operation
  • It has the following features:
  • Two high-speed DMA controllers that can achieve
    rates of 120 Mbytes/sec
  • Physically conformant to the IEEE P1386 Common
    Mezzanine Card standard
  • User clock programmable between 0.5 MHz and
    100 MHz
  • Local bus speeds of up to 40 MHz

10
Ray Tracing
  • Ray tracing is a rendering method that generates
    an image of a given mathematical scene. A light
    ray is shot for each pixel on the view plane
    (monitor screen) towards the objects in the
    scene. The pixel's color is set to the color of
    the object that has the closest point of
    intersection.
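
The closest-intersection rule described above can be sketched in a few lines. This is a minimal illustration, not RayLab's code: the `Sphere` class and the `intersect_sphere` and `trace` functions are hypothetical names, and only spheres are handled.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    # Hypothetical scene object: centre, radius and a flat colour.
    center: tuple
    radius: float
    color: str


def intersect_sphere(origin, direction, sphere):
    """Return the smallest positive ray parameter t, or None on a miss."""
    # Solve |origin + t*direction - center|^2 = radius^2 for t.
    oc = tuple(o - c for o, c in zip(origin, sphere.center))
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - sphere.radius ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    # Roots are tried in increasing order, so the first positive one is nearest.
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None


def trace(origin, direction, scene, background="black"):
    """Colour of the object whose intersection point is closest to the origin."""
    closest_t, color = math.inf, background
    for obj in scene:
        t = intersect_sphere(origin, direction, obj)
        if t is not None and t < closest_t:
            closest_t, color = t, obj.color
    return color
```

Calling `trace` once per pixel, with a direction through that pixel on the view plane, gives the basic image; effects such as shadows and reflection would add further rays from each hit point.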

11
[Figure: ray tracing setup showing the light source, scene, intersection points, view plane (screen), and viewer's eye]
12
RayLab
  • Features:
  • Primitives: sphere, cone, cylinder, plane,
    ellipsoid, box, disc, etc.
  • Texture mapping
  • Transformations: scaling, rotation and
    displacement
  • Constructive solid geometry
  • Advanced effects: soft shadows, anti-aliasing,
    reflection, refraction, transparency, glossy
    effects, etc.

13
Computation intensive functions
14
Problems with RayLab
  • Performs one task at a time (dataflow)
  • Passes many parameters in the selected routines
  • Not designed to be ported onto hardware

15
Modifications
  • Threads are inserted so that multiple rays can be
    calculated in parallel.
  • All the objects are collected initially and sent
    to the local memory of the hardware when it is
    initialized.
  • Multiple sets of data are sent per call, so that
    the hardware is called fewer times.
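
The effect of sending multiple sets of data per call can be illustrated with a small sketch. `CountingAccelerator` is a hypothetical stand-in that only counts calls and does placeholder work; it does not model the real board or its driver.

```python
class CountingAccelerator:
    """Hypothetical stand-in for the hardware: counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def process(self, batch):
        # One call handles a whole batch of rays.
        self.calls += 1
        # Placeholder computation; the real routines are not modelled here.
        return [sum(ray) for ray in batch]


def run_batched(rays, accel, batch_size):
    """Send rays to the accelerator in groups of batch_size."""
    results = []
    for start in range(0, len(rays), batch_size):
        results.extend(accel.process(rays[start:start + batch_size]))
    return results
```

With ten rays, a batch size of 1 makes ten calls while a batch size of 4 makes three, and the results are identical. The per-call overhead is paid once per batch instead of once per ray.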

16
Hardware Accelerator
  • A hardware accelerator is a dedicated device that
    implements all the time-consuming routines.
  • It is an FPGA board with PCI Interface
  • It can do multiple tasks in parallel by having
    multiple execution units running independently
  • It has its own local memory to store the inputs
    and generated results

17
Memory Wrapper
  • It is developed for the ADM-XRC card
  • Provides an interface between the host machine's
    data bus and the FPGA's local memory
  • Supports both programmed I/O and DMA transfer
  • Provides an interface between the execution units
    on the FPGA and its local memory
  • Supports four memory banks for simultaneous use
  • Supports two modes: user mode and wrapper mode

18
Performance Estimation
  • Host: Pentium 4
  • Hardware accelerator: FPGA board with 4 execution
    units, 100 MHz clock speed, 120 Mbps (120/8 MBps)
    transfer speed
  • Input scene: 27 objects
  • Time taken by selected routines on host machine:
    2945 seconds
  • Initial data transfer time: (200 × 27)/(120/8) ×
    10^-6 = 360 × 10^-6 seconds
  • Input + generated result transfer time:
    28/(120/8) × 10^-6 + 36/(120/8) × 10^-6 = 4.27 ×
    10^-6 seconds
  • Total number of cycles needed by each computation
    unit: 300 × (27/4) = 2025, i.e. 20.25 × 10^-6
    seconds at 100 MHz
  • Hardware is called 26893845 times
  • Total time needed on hardware accelerator: (360 +
    26893845 × (1.87 + 2.4 + 20.25)) × 10^-6 =
    659.44 seconds
  • Gain: (2945 - 659.44)/2945 = 77.61%
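
The estimate above can be re-derived step by step. The sketch below takes the slide's figures as 200 × 27 for the initial transfer and 28 and 36 for the per-call input and result transfers. Reading these as byte counts, so that 120/8 MBps equals 15 bytes per microsecond, is an assumption. The per-call transfer times are rounded to two decimals, as on the slide.

```python
LINK_MBPS = 120                 # transfer speed from the slide
BYTES_PER_US = LINK_MBPS / 8    # 15 MBps, i.e. 15 bytes per microsecond (assumed unit)
CLOCK_MHZ = 100                 # accelerator clock, cycles per microsecond
EXEC_UNITS = 4
OBJECTS = 27
CALLS = 26893845                # number of hardware calls from the slide
HOST_SECONDS = 2945             # time of the selected routines on the host


def estimate():
    """Reproduce the slide's arithmetic; all times in microseconds until the end."""
    initial_us = 200 * OBJECTS / BYTES_PER_US            # 360
    input_us = round(28 / BYTES_PER_US, 2)               # 1.87
    result_us = round(36 / BYTES_PER_US, 2)              # 2.4
    cycles = 300 * (OBJECTS / EXEC_UNITS)                # 2025 cycles per unit
    compute_us = cycles / CLOCK_MHZ                      # 20.25
    total_s = (initial_us + CALLS * (input_us + result_us + compute_us)) * 1e-6
    gain_pct = (HOST_SECONDS - total_s) / HOST_SECONDS * 100
    return {
        "initial_us": initial_us,
        "per_call_us": input_us + result_us + compute_us,
        "total_s": total_s,
        "gain_pct": gain_pct,
    }
```

Almost all of the estimated time comes from the per-call term (about 24.52 microseconds times roughly 26.9 million calls), which is why reducing the number of calls and the data moved per call matters more than the one-off initial transfer.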

19
Accelerator
  • It incorporates all the computation-intensive
    routines of the ray tracer
  • There are multiple copies of the execution module,
    which can run simultaneously and independently
  • Each module has its own memory bank
  • Each module can run a different set of routines
    and operate on a different set of data
  • It has two main components: the controller and
    the sub-modules

20
Accelerator
[Block diagram: Controller; sub-modules BoundCheck, TransformLine, VectorOperation, IntersectSphere, PostTransform; Memory; Registers; FPU 1, FPU 2, FPU 3]
21
Controller
  • Initialize the memory to be used by modules
  • Generate control signals for various modules
  • Maintain mutual exclusion for FPUs and memory
  • Generate final results in local memory

22
Sub-modules
  • Each sub-module corresponds to a routine in the
    ray tracer
  • Take the input from the specified location in
    local memory
  • Perform computation on the input data
  • Generate output at specified location
  • Return control to controller
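
The controller and sub-module protocol listed above (read input from a given location, compute, write output to a given location, return control) can be modelled in software as follows. This is an illustrative sketch only: the `Controller` class, the flat list used as local memory, and the two sub-modules `vector_add` and `dot_product` are hypothetical and are not the routines of the actual design.

```python
def vector_add(memory, in_addr, out_addr):
    """Hypothetical sub-module: add two 3-vectors stored back to back."""
    a = memory[in_addr:in_addr + 3]
    b = memory[in_addr + 3:in_addr + 6]
    memory[out_addr:out_addr + 3] = [x + y for x, y in zip(a, b)]


def dot_product(memory, in_addr, out_addr):
    """Hypothetical sub-module: dot product of two 3-vectors."""
    a = memory[in_addr:in_addr + 3]
    b = memory[in_addr + 3:in_addr + 6]
    memory[out_addr] = sum(x * y for x, y in zip(a, b))


class Controller:
    def __init__(self, size):
        # Initialize the local memory used by the sub-modules.
        self.memory = [0.0] * size
        self.submodules = {"vector_add": vector_add, "dot_product": dot_product}

    def load(self, addr, values):
        """Place input data at a specified location in local memory."""
        self.memory[addr:addr + len(values)] = values

    def run(self, program):
        """Run (name, input address, output address) steps one at a time."""
        for name, in_addr, out_addr in program:
            # Control returns to the controller when the sub-module returns.
            self.submodules[name](self.memory, in_addr, out_addr)
        return self.memory
```

Running steps strictly one after another is the simplest way to keep accesses to shared memory mutually exclusive; a hardware controller that overlaps sub-modules would need explicit arbitration for the FPUs and memory instead.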

23
Work done in last semester
  • RayLab was modified so that multiple tasks can be
    generated and it can use the hardware efficiently
  • The accelerator was designed and all the internal
    modules were identified
  • The internal modules identified are:
  • Bound check
  • Generation of transformed ray
  • Processing of intermediate data
  • Generation of final results
  • A sub-module was written for the BoundCheck
    routine

24
Current Status
  • One more module, VectorOperation, has been added
    to the design
  • Code has been written for VectorOperation and
    IntersectSphere
  • The communication mechanism between the controller
    and the sub-modules has been changed
  • Three floating point units have been added to the
    accelerator
  • Integration of BoundCheck, IntersectSphere,
    Controller, Registers and memory is in progress

25
Work to be done
  • Implement the TransformLine and PostTransform
    sub-modules
  • Make the overall system run error-free
  • Synthesize the Accelerator
  • Minimize the number of clock cycles needed for
    overall task
  • Minimize the number of calls made to hardware,
    i.e. perform more work in one call
  • Minimize the data transfer between hardware and
    software

26
  • Thanks