Title: LOW POWER RECONFIGURABLE CORE FOR 3D GRAPHICS SHADING AND TEXTURE MAPPING
PROJECT TEAM
JEONGSEON EUH
ARUNACHALAM RAMANATHAN
JEEVAN CHITTAM
KRISHNA PRASAD
NIRUPAMA RAMASWAMY
2 Integration Review
In this review:
- 1. Final Implementation
- 2. Interfacing with ASOC Interconnect
- 3. Simulation results of the Integrated Module
- 4. Final Estimates
  - Power
  - Area
  - Timing
- 5. Summary of the Project
- 6. Future work
- 7. What we learnt in this course
- 8. Due Acknowledgements
3 Final Implementation
[Block diagram. Blocks: Memory Controller with 32-bit links to ASOC MEM 0 and ASOC MEM 1; two Decompression Units; Point Filtering; two Bilinear Filtering units, each with a Coordinate Generator, Addr. Genr., Wt. Genr. and RGB adder; Trilinear MAC producing the Trilinear Filtering output. Inputs: U10.6, V10.6, Level, Config. Each filtering output is I(r,g,b).]
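The Trilinear MAC stage can be pictured as a weighted blend of the two bilinear results. The sketch below is a minimal Python model, not the RTL: the function name, the fractional weight `level_frac` and its 6-bit width are assumptions made for illustration, since the slides only name a Level input and a Trilinear MAC block.

```python
def trilinear_mac(color_lo, color_hi, level_frac, frac_bits=6):
    """Blend two bilinear-filtered (r, g, b) colours from adjacent texture levels.

    level_frac is an assumed fixed-point weight in [0, 2**frac_bits]:
    0 selects color_lo, 2**frac_bits selects color_hi.
    """
    one = 1 << frac_bits
    if not 0 <= level_frac <= one:
        raise ValueError("level_frac out of range")
    # One multiply-accumulate per colour channel, then scale back down.
    return tuple(((one - level_frac) * lo + level_frac * hi) >> frac_bits
                 for lo, hi in zip(color_lo, color_hi))


if __name__ == "__main__":
    # Halfway between a black texel and a bright texel.
    print(trilinear_mac((0, 0, 0), (64, 128, 255), 32))
```

With `level_frac` at either end of its range the function returns one of the two inputs unchanged, which is the behaviour expected when the requested level coincides with a stored texture level.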
4 Simulated Waveforms for Bilinear Filtering
[Waveform figure: Input and Output traces]
Inputs: u 90 52, v 90 40, Level 30 2, Intensity 230 70 100 16, Height Width 100
Outputs: Ir70 83, Ig 139, Ib 22
Throughput: ¼
Latency: 8 clock cycles (1 - Coordinate Generation, 3 - Addr. Genr and Weight Genr, 3 - RGB Multiplier, 1 - Accumulation)
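The four pipeline stages listed above (coordinate generation, address and weight generation, RGB multiply, accumulation) can be sketched in software as follows. This is an illustrative Python model under stated assumptions, not the Verilog used in the project: U10.6 is read as 10 integer bits and 6 fractional bits, texture edges are clamped, and all function names are hypothetical.

```python
FRAC_BITS = 6            # assumed: U10.6 = 10 integer bits, 6 fractional bits
ONE = 1 << FRAC_BITS     # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a non-negative float coordinate to 16-bit U10.6 fixed point."""
    return int(round(x * ONE)) & 0xFFFF


def bilinear(texture, u_fx, v_fx):
    """Filter at fixed-point (u, v); texture[row][col] is an (r, g, b) tuple."""
    height, width = len(texture), len(texture[0])
    # Coordinate generation: integer texel index and fractional offset.
    u0, fu = u_fx >> FRAC_BITS, u_fx & (ONE - 1)
    v0, fv = v_fx >> FRAC_BITS, v_fx & (ONE - 1)
    # Address generation: the four neighbours, clamped at the texture edge.
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    u0, v0 = min(u0, width - 1), min(v0, height - 1)
    # Weight generation: the four weights always sum to ONE * ONE.
    w00 = (ONE - fu) * (ONE - fv)
    w10 = fu * (ONE - fv)
    w01 = (ONE - fu) * fv
    w11 = fu * fv
    taps = ((w00, texture[v0][u0]), (w10, texture[v0][u1]),
            (w01, texture[v1][u0]), (w11, texture[v1][u1]))
    # RGB multiply and accumulate, then drop the 2 * FRAC_BITS fraction bits.
    return tuple(sum(w * texel[c] for w, texel in taps) >> (2 * FRAC_BITS)
                 for c in range(3))


if __name__ == "__main__":
    tex = [[(0, 0, 0), (64, 128, 255)],
           [(0, 0, 0), (64, 128, 255)]]
    print(bilinear(tex, to_fixed(0.5), to_fixed(0.0)))
```

Keeping the weights as integers mirrors what a hardware weight generator would produce; the final shift truncates rather than rounds, which is one of several possible choices.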
5 AREA, SPEED AND POWER ESTIMATES
AREA and TIMING
Functional Block        Quantity   Logic Elements   Flip Flops   Frequency (MHz)
Coordinate Generator    2          169              32           248.4
Point Filtering         1          166              94           220
Weight Generator        2          308              168          146.3
Address Generator       2          604              357          154.37
Bilinear Filtering      2          1117             649          124.5
Trilinear MAC           1          231              144          250
Trilinear Filtering     1          2465             1442         124.5

Gate counts (1 Altera Apex LE = 12 gates, approx.):
Point Filtering: 166 × 12 = 1,992 gates
Bilinear Filtering: 1117 × 12 + 174 = 13,578 gates
Trilinear Filtering: 1117 × 2 × 12 + 174 + 231 × 12 = 29,754 gates
Total number of gates: (169 + 166 + 1117 × 2 + 231) × 12 + 174 × 4 = 34,296
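The gate-count arithmetic on this slide can be reproduced with a few lines of Python. The script below only restates the slide's own numbers; the variable names are hypothetical, and the 174-gate term is carried through as it appears in the formulas because the slide does not say what it represents.

```python
GATES_PER_LE = 12    # approximation quoted on the slide for one Altera Apex LE
EXTRA_GATES = 174    # additive term from the slide's formulas (meaning not stated)

# Logic-element counts taken from the area table.
LE = {
    "coordinate_generator": 169,
    "point_filtering": 166,
    "bilinear_filtering": 1117,
    "trilinear_mac": 231,
}

bilinear_gates = LE["bilinear_filtering"] * GATES_PER_LE + EXTRA_GATES
trilinear_gates = (2 * LE["bilinear_filtering"] * GATES_PER_LE
                   + EXTRA_GATES
                   + LE["trilinear_mac"] * GATES_PER_LE)
total_gates = ((LE["coordinate_generator"] + LE["point_filtering"]
                + 2 * LE["bilinear_filtering"] + LE["trilinear_mac"]) * GATES_PER_LE
               + 4 * EXTRA_GATES)

if __name__ == "__main__":
    print("Bilinear Filtering :", bilinear_gates)
    print("Trilinear Filtering:", trilinear_gates)
    print("Total              :", total_gates)
```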
6 Power Dissipation
Courtesy: Apex Power Calculator (www.altera.com/products/devices/apex/apx-power_calc.html)
Performance: Maximum possible clock frequency 124.5 MHz
Device: Apex 20K Altera Device
Toggle Rate: 12.5%
Core Voltage: 1.8 V (Apex20K Device)
Technology: 0.25 micron
7 Summary
- RTL implementation of the Texture Mapping core employing three filtering algorithms is completed.
- All the modules are implemented separately and verified for their functionality.
- Integrated core is also implemented and verified.
- Behavioral model of the Memory Controller and the De-compression unit is completed.
- Compression algorithm used: Color Compression Coding (4-bit/pixel implemented).
- Specifications
  - Total area of the core: 34,296 logic gates.
  - Worst-case power dissipation: 397.09 mW (Trilinear Filtering).
  - Frequency of operation: 124.5 MHz.
  - For an Altera APEX 20K device.
- Design tools used
  - Cadence Verilog HDL (Verilog XL)
  - Altera Quartus™ V2000.09
  - DAI Signal Scan Waveforms
  - APEX Power Calculator (www.altera.com/products/devices/apex/apx-power_calc.html)
8 Further Improvements on this work
- Implementation of a Texture Cache for faster operation.
- Incorporating a scheme for loading texture data from either the External Memory or the Internal Memory.
- Dynamic selection of the compression methodology based on the quality and performance constraints.
- Devising a methodology to relinquish the unused memory blocks to other heterogeneous cores.
9 What have we learnt?
- Application of VLSI and Architecture principles in real life.
- How to carry out a project in an orderly fashion.
- RTL and Behavioral modeling of Circuits and Systems using Verilog HDL.
- Improved Presentation Skills and Team Work.
10 INTEGRATION REVIEW SHADING CORE
- 1. Normal Vector Interpolation (Cordic in 3D)
  - Architecture
  - Implementation
- 2. Shading Core Integration with ASOC
- 3. Updated Estimates
  - Area
  - Speed
  - Power
- 4. Future Work
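11 Shading Core Basic Blocks
[Block diagram. Normal vectors (N) feed the Cordic Algorithm (Normal Vector Interpolation) block, marked Phong only; the Intensity Interpolation block is marked Gouraud only; the Intensity Computation block takes Ia, Ili, S, Ka, Kd, Ks. G / P multiplexers (MUX) select between the Gouraud (G) and Phong (P) paths, and the output I (R,G,B) goes to the Blending Engine.]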
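The symbols on the Intensity Computation block (Ia, Ili, S, Ka, Kd, Ks) match the inputs of the classic Phong illumination model. The slides do not spell out the equation, so the sketch below assumes the standard form I = Ka·Ia + Σ Ili·(Kd·(N·L) + Ks·(R·V)^S); the function and parameter names are hypothetical and floating point is used instead of the core's arithmetic.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_intensity(normal, view, lights, ia, ka, kd, ks, s):
    """Intensity for one colour channel under the assumed Phong model.

    lights is a list of (direction_to_light, ili) pairs.
    """
    n, v = normalize(normal), normalize(view)
    intensity = ka * ia                          # ambient term
    for direction, ili in lights:
        l = normalize(direction)
        n_dot_l = dot(n, l)
        if n_dot_l <= 0.0:
            continue                             # light is behind the surface
        # Mirror reflection of the light direction about the normal.
        r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
        r_dot_v = max(dot(r, v), 0.0)
        intensity += ili * (kd * n_dot_l + ks * r_dot_v ** s)
    return intensity


if __name__ == "__main__":
    head_on = [((0.0, 0.0, 1.0), 1.0)]
    print(phong_intensity((0, 0, 1), (0, 0, 1), head_on,
                          ia=1.0, ka=0.1, kd=0.5, ks=0.4, s=10))
```

In Gouraud shading a function like this would be evaluated at the triangle vertices and the resulting intensities interpolated; in Phong shading the normal is interpolated and the function is evaluated per pixel, which is why the Phong path needs the normal vector interpolator.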
12 Normal Vector Interpolator
[Figure: CORDIC VECTORING MODE and CORDIC ROTATION MODE. Courtesy Jeongseon Euh]
13 Cordic Architecture
[Figure: Rotation Mode and Vectoring Mode. Courtesy www.andraka.com]
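The two CORDIC modes named on these slides share one iteration and differ only in how the rotation direction is chosen. The sketch below is a generic 2D, floating-point CORDIC in Python, given as an illustration only: the project's core is described as a 3D CORDIC for normal vectors, and a hardware version would use shifts and adds on fixed-point values. The iteration count and all names are assumptions.

```python
import math

ITERATIONS = 16
# Elementary rotation angles atan(2^-i) and the accumulated CORDIC gain.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))


def cordic(x, y, z, vectoring):
    """Run the shared CORDIC iteration in rotation or vectoring mode."""
    for i, angle in enumerate(ANGLES):
        if vectoring:
            d = -1.0 if y > 0 else 1.0     # vectoring mode: drive y toward 0
        else:
            d = 1.0 if z >= 0 else -1.0    # rotation mode: drive z toward 0
        shift = 2.0 ** -i                  # a right shift by i in hardware
        x, y, z = x - d * y * shift, y + d * x * shift, z - d * angle
    return x, y, z


def rotate(x, y, theta):
    """Rotation mode: rotate (x, y) by theta radians (|theta| below about 1.74)."""
    xr, yr, _ = cordic(x, y, theta, vectoring=False)
    return xr / GAIN, yr / GAIN


def to_polar(x, y):
    """Vectoring mode: magnitude and angle of (x, y), for x > 0."""
    xr, _, z = cordic(x, y, 0.0, vectoring=True)
    return xr / GAIN, z


if __name__ == "__main__":
    print(rotate(1.0, 0.0, math.pi / 6))
    print(to_polar(3.0, 4.0))
```

Each extra iteration adds roughly one bit of angular precision, which is one way an accuracy setting can be traded against the number of clock cycles spent per vector.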
14 ASOC INTERFACE RECONFIGURATION
[Block diagram of the SHADING CORE attached to the ASOC INTERCONNECT. Clocking: a 500 MHz CLOCK and a Clock Divider, with clock signals G_Clk1, G_Clk2 and P_Clk1. Blocks: Intensity Interpol. (Gouraud), Normal Interpol. Cordic (Phong), Lighting Computation, ReConfig. / Global Control, Reconfig Control, LCU, TAG LUT, ASOC Read/Write Control, Reconfig. Data, a 32-bit READ PORT and a 32-bit WRITE PORT. Control links are labelled 4 and 8 bits wide.]
Input Constraints: Dist. b/w Viewer and Object, Number of Light Sources, Image Quality, Object Speed, Power
G_Clk2 = G_Clk1 / N
a = Accuracy, G/P, No. of L
N = Arithmetic Accuracy
15 UPDATED AREA ESTIMATES (GATE COUNT)
Total Gates = Lighting (G + P) + Intensity Interpolation (Gouraud) + Normal Interpolation (3D Cordic) (Phong)
- Lighting (G + P): (63,000; 28,416; 155,490) 276,066 gates; P-LUT 200K bits
- Intensity Interpolation (G): 28,860 gates (Edge Scan)
- Normal Interpolation (Cordic) (P): 25,704 (Logic) + 230,400 (LUT, 300K bits) = 256,104 gates
Total: 561,030 gates
Logic: 31.2% of area; Memory: 68.8% of area
1 Logic Element = 12 gates (Apex 20K Altera Device)
1 LE (4 I/P LUT) = 16 bits
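The conversion from LUT bits to equivalent gates follows from the two ratios on this slide (1 LE = 16 bits = 12 gates). The Python sketch below checks the CORDIC LUT figure and the total. Two points are assumptions rather than statements from the slides: "K" is taken as 1024, and the 155,490-gate lighting figure is treated as memory when computing the area split.

```python
GATES_PER_LE = 12
BITS_PER_LE = 16     # one 4-input LUT stores 16 bits
K = 1024             # assumed meaning of "K" in "300K bits"


def lut_bits_to_gates(bits):
    """Equivalent gate count of a LUT memory built out of logic elements."""
    return bits // BITS_PER_LE * GATES_PER_LE


cordic_lut_gates = lut_bits_to_gates(300 * K)
normal_interp_gates = 25_704 + cordic_lut_gates      # logic + LUT
total_gates = 276_066 + 28_860 + normal_interp_gates  # lighting + intensity + normal

# Assumption: the 155,490-gate lighting entry is the P-LUT memory.
memory_gates = 155_490 + cordic_lut_gates
memory_share = 100.0 * memory_gates / total_gates
logic_share = 100.0 - memory_share

if __name__ == "__main__":
    print("Cordic LUT gates    :", cordic_lut_gates)
    print("Normal interpolation:", normal_interp_gates)
    print("Total               :", total_gates)
    print("Memory share (%)    :", round(memory_share, 1))
    print("Logic share (%)     :", round(logic_share, 1))
```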
16 UPDATED SPEED ESTIMATES
Gouraud
- Light: 80 MHz (Max.)
- Intensity IPOL.: 99.6 MHz (Max.) -> Max. Pixel Fill Rate 99.6 Mpixels/sec
- Lighting (80 MHz -> 80 / N MHz Operation)
Phong
- Light: 80 MHz (Max.)
- Normal IPOL. (Cordic): 55.5 MHz (Max.)
- Lighting (55.5 MHz -> 55.5 / N MHz Operation)
Output BW depends on No. Pixels/Triangle
17 UPDATED POWER ESTIMATES
Core Voltage: 1.8 V (Apex20K Device)
Technology: 0.25 micron
Toggle Rate: 12.5%
[Power estimates shown @ 80 / 8 MHz and @ 55.5 / 5.55 MHz]
Assuming Max. Accuracy of 10 Cycles
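The frequency pairs on this slide are consistent with the divided-clock relation G_Clk2 = G_Clk1 / N from the interface slide, with N = 10 cycles. A minimal Python check, using a hypothetical helper name and treating N as the number of cycles per lighting result:

```python
def lighting_clock_mhz(g_clk1_mhz, n_cycles):
    """Divided clock for the lighting block: G_Clk2 = G_Clk1 / N."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    return g_clk1_mhz / n_cycles


if __name__ == "__main__":
    for g_clk1 in (80.0, 55.5):
        g_clk2 = lighting_clock_mhz(g_clk1, 10)
        print(f"G_Clk1 = {g_clk1} MHz -> G_Clk2 = {g_clk2:.2f} MHz")
```

A smaller N (lower arithmetic accuracy) raises the lighting clock, so the accuracy setting is one of the knobs that moves both throughput and power.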
18 SUMMARY
- 1. RTL Implementation for Gouraud and Phong Shading has been completed
- 2. Complete Core has been tested for functionality
- 3. TOTAL ESTIMATES
  - AREA: 561,030 Gates
  - MAX. POWER: 535.5 mW @ 55.5 MHz (Phong) -> (Excluding I/Os and Drivers)
FUTURE WORK
- 1. Parallelizing Distributed/Serial Arithmetic -> Reconfigurable (Speed / Power)
- 2. Unrolled/Pipelined CORDIC Implementation -> Reconfigurable (Speed / Power)
- 3. Power Estimates/Savings -> Based on Dynamic Reconfiguration
- 4. Control Block for Algorithm / Architecture Reconfiguration -> Based on Power / Image Quality Constraints
19 Acknowledgements
- Prof. Burleson (Motivator)
- Jeongseon Euh (Initiator)
- Jian Liang, Andy (aSOC)
- Prashanth Jain (TA)
THANK YOU
20 Interfacing ASOC Interconnect
[Block diagram: the TEXTURE MAPPING CORE connected to the ASOC INTERCONNECT through the LCU, with Vdd and CLK inputs and an 8-bit link for Status or Control Info. Read side: Data In with a 5-bit Tag and 32-bit Data, plus a 1-bit Valid bit. Write side: Data Out with a 5-bit Tag and 32-bit Data, plus a Success signal. Point Filtering, Bilinear Filtering and Trilinear Filtering are each shown with a 1-bit line.]
Control and Config. Data: Filtering select, Level, Size of texture, Reset, Config, Memory Select
21 Floor Plan
[Floor plan diagram. Blocks: External Memory (32 bits), Memory Controller, Memory 1, Memory 2, Texture mapping Core, Geometric Engine, µ Proc. Labelled signals: Control, Decompressed Data, Address, U,V,Level and Config. Data; bus widths shown are 32, 32, 23 and 24 bits.]