PROJECT TEAM : - PowerPoint PPT Presentation

Slides: 22
Provided by: aram2
1
LOW POWER RECONFIGURABLE CORE
FOR 3D GRAPHICS SHADING AND TEXTURE MAPPING
PROJECT TEAM
JEONGSEON EUH
ARUNACHALAM RAMANATHAN JEEVAN CHITTAM KRISHNA
PRASAD NIRUPAMA RAMASWAMY
2
Integration Review: In this review
  • 1. Final Implementation
  • 2. Interfacing with ASOC Interconnect
  • 3. Simulation results of the Integrated Module
  • 4. Final Estimates
  • - Power
  • - Area
  • - Timing
  • 5. Summary of the Project
  • 6. Future work
  • 7. What we learnt in this course
  • 8. Acknowledgements

3
Final Implementation
[Block diagram: a Memory Controller connects to ASOC MEM 0 and ASOC MEM 1 over 32-bit buses and feeds the Decompression Unit. Core inputs: U10.6, V10.6, Level, Config. Point Filtering outputs I(r,g,b). Each of the two Bilinear Filtering units contains a Coordinate Generator, Addr. Genr., Wt. Genr., and RGB adder, and outputs I(r,g,b). Trilinear Filtering combines the two Bilinear Filtering units through the Trilinear MAC and outputs I(r,g,b).]
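The U10.6 and V10.6 labels suggest texture coordinates with 10 integer bits and 6 fractional bits. As a minimal sketch under that assumption, the Python model below shows how the three filters could relate: point filtering drops the fractional bits, bilinear filtering uses them as weights over a 2x2 texel neighborhood, and trilinear filtering blends two bilinear results. The function names, the edge clamping, the halving of coordinates for the coarser level, and the 6-bit level fraction are all illustrative assumptions, not details taken from the RTL.

```python
FRAC_BITS = 6                 # assumed from the U10.6 / V10.6 labels
ONE = 1 << FRAC_BITS          # fixed-point representation of 1.0


def point_filter(texture, u_fx, v_fx):
    """Nearest-texel lookup: keep only the integer part of each coordinate."""
    return texture[v_fx >> FRAC_BITS][u_fx >> FRAC_BITS]


def bilinear_filter(texture, u_fx, v_fx):
    """Blend the 2x2 neighborhood, weighting by the fractional bits.

    The caller is expected to keep (u, v) inside the texture.
    """
    height, width = len(texture), len(texture[0])
    u0, v0 = u_fx >> FRAC_BITS, v_fx >> FRAC_BITS
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)  # clamp at edges
    fu, fv = u_fx & (ONE - 1), v_fx & (ONE - 1)
    # The four weights always sum to ONE * ONE.
    weights = [(ONE - fu) * (ONE - fv), fu * (ONE - fv),
               (ONE - fu) * fv, fu * fv]
    texels = [texture[v0][u0], texture[v0][u1],
              texture[v1][u0], texture[v1][u1]]
    return tuple(
        sum(w * t[c] for w, t in zip(weights, texels)) >> (2 * FRAC_BITS)
        for c in range(3)
    )


def trilinear_filter(tex_fine, tex_coarse, u_fx, v_fx, level_frac):
    """Blend bilinear samples from two levels; level_frac is in [0, ONE]."""
    fine = bilinear_filter(tex_fine, u_fx, v_fx)
    # Assumption: the coarser level has half the resolution.
    coarse = bilinear_filter(tex_coarse, u_fx >> 1, v_fx >> 1)
    return tuple(
        ((ONE - level_frac) * f + level_frac * c) >> FRAC_BITS
        for f, c in zip(fine, coarse)
    )
```

In this model the trilinear stage is a single multiply-accumulate per color channel on top of two bilinear results, which matches the role the diagram gives to the Trilinear MAC.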
4
Simulated Waveforms for Bilinear Filtering
[Waveform figure: input and output traces]
Inputs: u 90 52, v 90 40, Level 30 2, Intensity 230 70 100 16, Height Width 100
Outputs: Ir70 83, Ig 139, Ib 22
Throughput: ¼
Latency: 8 clock cycles (1 - Coordinate Generation, 3 - Addr. Genr. and Weight Genr., 3 - RGB Multiplier, 1 - Accumulation)
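The latency breakdown can be checked with a few lines of arithmetic. The sketch below sums the per-stage cycle counts quoted on this slide. It also derives a pixel rate, assuming that a throughput of ¼ means one filtered pixel every four clock cycles (an interpretation, not stated on the slide) and using the 124.5 MHz bilinear clock from the estimates. The names are hypothetical.

```python
# Stage latencies in clock cycles, as listed on the slide.
STAGE_CYCLES = {
    "coordinate generation": 1,
    "address and weight generation": 3,
    "RGB multiplier": 3,
    "accumulation": 1,
}


def pipeline_latency(stages):
    """Total latency is the sum of the per-stage latencies."""
    return sum(stages.values())


def pixel_rate_mpixels(clock_mhz, pixels_per_cycle):
    """Pixels per second (in millions) for a given clock and throughput."""
    return clock_mhz * pixels_per_cycle


latency = pipeline_latency(STAGE_CYCLES)
# Assumption: throughput 1/4 = one pixel every four cycles.
rate = pixel_rate_mpixels(124.5, 1 / 4)
```

The stage counts add up to the quoted 8 cycles. Under the assumed reading of the throughput figure, the bilinear path would deliver about 31 Mpixels/sec.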
5
AREA, SPEED AND POWER ESTIMATES
AREA and TIMING

Functional Block        Quantity   Logic Elements   Flip Flops   Frequency (MHz)
Coordinate Generator    2          169              32           248.4
Point Filtering         1          166              94           220
Weight Generator        2          308              168          146.3
Address Generator       2          604              357          154.37
Bilinear Filtering      2          1117             649          124.5
Trilinear MAC           1          231              144          250
Trilinear Filtering     1          2465             1442         124.5

Point Filtering: 166 × 12 = 1,992 gates
Bilinear Filtering: 1117 × 12 + 174 = 13,578 gates
Trilinear Filtering: 1117 × 2 × 12 + 174 + 231 × 12 = 29,754 gates
Total number of gates: (169 + 166 + 1117 × 2 + 231) × 12 + 174 × 4 = 34,296
1 Altera Apex LE ≈ 12 gates (approx.)
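The gate counts follow from the logic-element counts once the multiplication and addition signs are read back into the expressions. The sketch below reproduces that arithmetic. The 12 gates per LE is the approximation quoted on the slide. The slide does not explain the constant 174, so it is carried through unchanged. The function and dictionary names are hypothetical.

```python
GATES_PER_LE = 12    # approximation quoted on the slide
EXTRA_GATES = 174    # constant from the slide's expressions; meaning not stated

# Logic-element counts per block, from the area table.
LOGIC_ELEMENTS = {
    "coordinate_generator": 169,
    "point_filtering": 166,
    "bilinear_filtering": 1117,
    "trilinear_mac": 231,
}


def point_gates():
    return LOGIC_ELEMENTS["point_filtering"] * GATES_PER_LE


def bilinear_gates():
    return LOGIC_ELEMENTS["bilinear_filtering"] * GATES_PER_LE + EXTRA_GATES


def trilinear_gates():
    # Two bilinear units plus the trilinear MAC.
    return (2 * LOGIC_ELEMENTS["bilinear_filtering"] * GATES_PER_LE
            + EXTRA_GATES
            + LOGIC_ELEMENTS["trilinear_mac"] * GATES_PER_LE)


def total_gates():
    le = LOGIC_ELEMENTS
    total_le = (le["coordinate_generator"] + le["point_filtering"]
                + 2 * le["bilinear_filtering"] + le["trilinear_mac"])
    return total_le * GATES_PER_LE + 4 * EXTRA_GATES
```

With the slide's figures, the bilinear, trilinear, and total expressions evaluate to 13,578, 29,754, and 34,296 gates.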
6
Power Dissipation
Courtesy: Apex Power Calculator (www.altera.com/products/devices/apex/apx-power_calc.html)
Performance: maximum possible clock frequency 124.5 MHz (Altera Apex 20K device)
Toggle rate: 12.5%
Core voltage: 1.8 V (Apex 20K device)
Technology: 0.25 micron
7
Summary
  • RTL implementation of the Texture Mapping core
    employing three filtering algorithms is completed.
  • All the modules are implemented separately and
    verified for their functionality.
  • Integrated core is also implemented and
    verified.
  • Behavioral model of the Memory Controller and
    the Decompression unit is completed.
  • Compression algorithm used: Color Compression
    Coding (4 bits/pixel implemented).
  • Specifications
  • - Total area of the core: 34,296 logic gates.
  • - Worst-case power dissipation: 397.09 mW
    (Trilinear Filtering).
  • - Frequency of operation: 124.5 MHz.
  • - For an Altera APEX 20K device.
  • Design tools used
  • - Cadence Verilog HDL (Verilog-XL)
  • - Altera Quartus(TM) V2000.09
  • - DAI Signal Scan Waveforms
  • - APEX Power Calculator
    (www.altera.com/products/devices/apex/apx-power_calc.html)

8
Further Improvements on this work
  • Implementation of Texture Cache for faster
    Operation.
  • Incorporating a scheme for Loading of texture
    data from either the External Memory or Internal
    Memory.
  • Dynamic Selection of Compression methodology
    based on the quality and performance constraints.
  • Devising a methodology to relinquish the unused
    memory blocks to other heterogeneous cores.

9
What have we learnt?
  • Application of VLSI and architecture principles
    in real life.
  • How to carry out a project in an orderly
    fashion.
  • RTL and behavioral modeling of circuits and
    systems using Verilog HDL.
  • Improved presentation skills and teamwork.

10
INTEGRATION REVIEW: SHADING CORE
  • 1. Normal Vector Interpolation (CORDIC in 3D)
  • - Architecture
  • - Implementation
  • 2. Shading Core Integration with ASOC
  • 3. Updated Estimates
  • - Area
  • - Speed
  • - Power
  • 4. Future Work

11
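Shading Core Basic Blocks
[Block diagram: the normal N feeds the Cordic Algorithm block (Normal Vector Interpolation, Phong only) and the Intensity Computation block, which takes Ia, Ili, S Ka, Kd, Ks. The Intensity Interpolation block is used for Gouraud only. Two G / P multiplexers select between the Gouraud (G) and Phong (P) paths. The output I (R,G,B) goes to the Blending Engine.]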
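The G / P multiplexers select between interpolating intensities (Gouraud) and interpolating normals before lighting (Phong). The sketch below contrasts the two along a single span. The lighting formula is the standard Phong reflection model, assumed here because the labels Ia, Ili, Ka, Kd, Ks match its usual terms. The shininess exponent, the vectors, and all function names are illustrative and not taken from the core.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lerp(a, b, t):
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def intensity(normal, light_dir, view_dir, Ia, Ili, Ka, Kd, Ks, shininess=8):
    """Ambient + diffuse + specular terms of the standard Phong model."""
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    n_dot_l = max(dot(n, l), 0.0)
    # Reflection of the light direction about the normal: r = 2(n.l)n - l
    r = tuple(2 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    r_dot_v = max(dot(r, v), 0.0) if n_dot_l > 0 else 0.0
    return Ia * Ka + Ili * (Kd * n_dot_l + Ks * r_dot_v ** shininess)


def gouraud_span(n0, n1, steps, **light):
    """Light only the two end points, then interpolate the intensity."""
    i0, i1 = intensity(n0, **light), intensity(n1, **light)
    return [i0 + (i1 - i0) * k / (steps - 1) for k in range(steps)]


def phong_span(n0, n1, steps, **light):
    """Interpolate the normal, then light every sample."""
    return [intensity(lerp(n0, n1, k / (steps - 1)), **light)
            for k in range(steps)]
```

For a span whose end normals tilt away from the light in opposite directions, `gouraud_span` misses the highlight in the middle while `phong_span` captures it. That gap in image quality is what the extra normal interpolation hardware pays for.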
12
Normal Vector Interpolator
CORDIC VECTORING MODE
CORDIC ROTATION MODE
Courtesy Jeongseon Euh
13
Cordic Architecture
Rotation Mode
Vectoring Mode
Courtesy www.andraka.com
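The two CORDIC modes named above differ only in which quantity the iteration drives to zero. The sketch below is a floating-point, two-dimensional model of the textbook algorithm. It does not model the fixed-point or 3D datapath of the core, and the function names and iteration count are illustrative.

```python
import math


def cordic_gain(iterations):
    """Accumulated scale factor of the micro-rotations."""
    return math.prod(math.sqrt(1 + 2.0 ** (-2 * i)) for i in range(iterations))


def cordic_rotate(x, y, angle, iterations=20):
    """Rotation mode: rotate (x, y) by `angle` by driving the residual angle to 0.

    Converges for |angle| up to roughly 1.74 rad.
    """
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    k = cordic_gain(iterations)
    return x / k, y / k


def cordic_vector(x, y, iterations=20):
    """Vectoring mode: drive y to 0; returns (magnitude, angle). Assumes x > 0."""
    z = 0.0
    for i in range(iterations):
        d = -1.0 if y >= 0 else 1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / cordic_gain(iterations), z
```

In hardware the multiplications by powers of two become shifts and the arctangent values come from a small lookup table, so each iteration needs only shifts and adds.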
14
ASOC INTERFACE RECONFIGURATION
[Block diagram: the SHADING CORE attached to the ASOC INTERCONNECT. Blocks shown: 500 MHz CLOCK, Clock Divider, Intensity Interpol. (Gouraud), Normal Interpol. Cordic (Phong), Lighting Computation, LCU, ReConfig. / Global Control, Reconfig Control, ASOC Read/Write Control, TAG LUT, READ PORT, and WRITE PORT. Clocks shown: G_Clk1, G_Clk2, P_Clk1. Signals shown: Reconfig. Data. Bus widths labeled 4, 8, and 32.]
Input Constraints: Dist. b/w Viewer and Object, Number of Light Sources, Image Quality, Object Speed, Power
G_Clk2 = G_Clk1 / N
a = Accuracy, G/P, No. of L
N = Arithmetic Accuracy
15
UPDATED AREA ESTIMATES (GATE COUNT)
Total Gates = Lighting (G + P) + Intensity Interpolation (Gouraud) + Normal Interpolation, 3D Cordic (Phong)
Lighting (G + P): (63,000, 28,416, 155,490) 276,066 gates; P-LUT: 200K bits
Intensity Interpolation (Gouraud): 28,860 gates (Edge Scan)
Normal Interpolation, Cordic (Phong): 25,704 (logic) + 230,400 (LUT, 300K bits) = 256,104 gates
Total: 561,030 gates (logic 31.2% of area, memory 68.8% of area)
1 Logic Element = 12 gates (Apex 20K Altera device)
1 LE (4 I/P LUT) = 16 bits
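The area figures can be cross-checked from the two conversion factors on this slide (1 LE counted as 12 gates, 1 LE holding 16 LUT bits). The sketch below assumes that K means 1024 and that the 155,490 figure in the lighting breakdown is its LUT (memory) share. Both are inferences from the numbers rather than statements on the slide, and the names are hypothetical.

```python
GATES_PER_LE = 12   # from the slide
BITS_PER_LE = 16    # one 4-input LUT per logic element


def lut_bits_to_gates(kbits):
    """Convert LUT storage in Kbits (assumed K = 1024) to equivalent gates."""
    return kbits * 1024 * GATES_PER_LE // BITS_PER_LE


# Gate counts quoted on the slide.
lighting = 276_066
intensity_interpolation = 28_860
cordic_logic = 25_704

cordic_lut = lut_bits_to_gates(300)
normal_interpolation = cordic_logic + cordic_lut
total = lighting + intensity_interpolation + normal_interpolation

# Assumption: 155,490 gates of the lighting block are LUT memory.
memory = 155_490 + cordic_lut
memory_pct = 100 * memory / total
logic_pct = 100 - memory_pct
```

Under those assumptions the 300K-bit LUT accounts for 230,400 gates, the three blocks sum to the quoted 561,030 gates, and the memory share rounds to the quoted 68.8%.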
16
UPDATED SPEED ESTIMATES
Gouraud
  • Lighting: 80 MHz (max.)
  • Intensity interpolation: 99.6 MHz (max.)
  • Max. pixel fill rate: 99.6 Mpixels/sec
  • Lighting (80 MHz, 80 / N MHz operation)
Phong
  • Lighting: 80 MHz (max.)
  • Normal interpolation (Cordic): 55.5 MHz (max.)
  • Lighting (55.5 MHz, 55.5 / N MHz operation)
Output BW depends on no. of pixels/triangle
17
UPDATED POWER ESTIMATES
Core voltage: 1.8 V (Apex 20K device)
Technology: 0.25 micron
Toggle rate: 12.5%
@ 80 / 8 MHz
@ 55.5 / 5.55 MHz
Assuming max. accuracy of 10 cycles
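The frequency pairs above are consistent with reading the interface legend as G_Clk2 = G_Clk1 / N, with N = 10 for the maximum accuracy of 10 cycles. A minimal sketch of that arithmetic follows. The function name is hypothetical and the relation is an interpretation of the slides.

```python
def divided_clock_mhz(g_clk1_mhz, n):
    """Divided clock, assuming G_Clk2 = G_Clk1 / N."""
    if n < 1:
        raise ValueError("N must be at least 1")
    return g_clk1_mhz / n


MAX_ACCURACY_CYCLES = 10  # "max. accuracy of 10 cycles" from the slide

gouraud_clk2 = divided_clock_mhz(80.0, MAX_ACCURACY_CYCLES)
phong_clk2 = divided_clock_mhz(55.5, MAX_ACCURACY_CYCLES)
```

With N = 10 this reproduces the 8 MHz and 5.55 MHz figures that accompany the 80 MHz and 55.5 MHz clocks.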
18
SUMMARY
  • 1. RTL implementation for Gouraud and Phong shading
    has been completed
  • 2. Complete core has been tested for
    functionality
  • 3. TOTAL ESTIMATES
  • - AREA: 561,030 gates
  • - MAX. POWER: 535.5 mW @ 55.5 MHz (Phong) ->
    (Excluding I/Os and Drivers)

FUTURE WORK
  • 1. Parallelizing Distributed/Serial Arithmetic ->
    Reconfigurable (Speed / Power)
  • 2. Unrolled/Pipelined CORDIC Implementation ->
    Reconfigurable (Speed / Power)
  • 3. Power Estimates/Savings -> Based on Dynamic
    Reconfiguration
  • 4. Control Block for Algorithm / Architecture
    Reconfiguration -> Based on Power / Image Quality
    Constraints

19
Acknowledgements
Prof. Burleson (Motivator)
Jeongseon Euh (Initiator)
Jian Liang Andy (aSOC)
Prashanth Jain (TA)
THANK YOU
20
Interfacing ASOC Interconnect
[Block diagram: the TEXTURE MAPPING CORE (Vdd, CLK) connected to the ASOC INTERCONNECT through the LCU. Control and Config. Data: Filtering select, Level, Size of texture, Reset, Config, Memory Select. Status or Control Info. Read side: Data In, with Tag, Data, and Valid bit. Write side: Data Out, with Tag, Data, and Success. Filter blocks shown: Point Filtering, Bilinear Filtering, Trilinear Filtering. Bus widths labeled 1, 5, 8, and 32.]
21
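Floor Plan
[Floor plan diagram: External Memory (32 bits); Memory Controller with Memory 1 and Memory 2 on 32-bit buses; Control, Decompressed Data, and Address signals (widths labeled 23 and 24 bits) between the Memory Controller and the Texture Mapping Core; U, V, Level and Config. Data supplied by the Geometric Engine / µProc.]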