Title: Thermal-aware and Low Power Optimizations for VLSI Synthesis
- Min Ni (m-ni_at_northwestern.edu)
- PhD Advisor: Prof. Seda Ogrenci Memik
- Department of EECS, Northwestern University
Leakage Power Aware Clock Skew Scheduling
Thermal-induced Leakage Power Optimization
Introduction
- Synthesis targeting reconfigurable logic (e.g., FPGAs) must produce designs that fit within the resource and storage capacity of the target device
  - Synthesis tools optimize latency and/or throughput
  - This often pushes the utilization of the device to its capacity
  - It may lead to infeasible designs
- With the increasing popularity of portable devices, demand for streaming multimedia applications is growing
  - These applications are computationally intensive
  - Functional pipelining has been shown to be effective
- Early estimates of resource requirements are very useful for faster design closure and design space exploration
Problem Formulation
- Given: an unscheduled streaming DFG, G(V, E), and a set of resource constraints
- Find: the total number of registers needed at the outputs of all functional units (FUs)
- The estimation technique is evaluated on a set of industrial multimedia applications
  - Applications include video and image compression and filtering algorithms, as well as applications such as license plate recognition
  - Different resource constraints are chosen for each application
Iteration Interval Estimation
- A lower bound on the iteration interval is estimated from the resource constraints
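The poster does not state how the lower bound is computed. A minimal sketch, assuming the standard resource-constrained bound used in modulo scheduling (every operation occupies one functional unit of its type for one cycle per iteration), is shown below. The function name and the dictionary layout are hypothetical, chosen only for illustration.

```python
from math import ceil


def resource_min_ii(op_counts, resource_constraints):
    """Lower bound on the iteration interval (II).

    Assumption (not from the poster): each operation holds one FU of its
    type for exactly one cycle, so a type with `count` operations and
    `units` FUs needs at least ceil(count / units) cycles per iteration.
    """
    bound = 1
    for op_type, count in op_counts.items():
        if count == 0:
            continue
        units = resource_constraints.get(op_type, 0)
        if units <= 0:
            raise ValueError(f"no functional unit available for {op_type!r}")
        bound = max(bound, ceil(count / units))
    return bound


# Hypothetical example: 7 multiplications on 2 multipliers, 10 additions on 4 adders.
example_ii = resource_min_ii({"mul": 7, "add": 10}, {"mul": 2, "add": 4})
```

In this example the multipliers are the bottleneck: ceil(7 / 2) = 4 cycles, against ceil(10 / 4) = 3 for the adders.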
[Figure: application domains: Embedded Computing, Mobile Devices, Networking]
Determine Min. Queue Sizes of Edges
- The minimum queue size of each node is estimated from the ASAP and ALAP schedules
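As an illustration of the ASAP/ALAP step, the sketch below computes both schedules for a small DAG and derives a minimum storage time per edge. The edge rule is an assumption, not taken from the poster: the shortest lifetime of a value on edge (u, v) is taken to occur when u finishes as late as possible and v starts as early as possible. All function names are hypothetical.

```python
from collections import defaultdict


def asap_alap(nodes, edges, latency):
    """Return (asap, alap) start times for an acyclic data flow graph."""
    succs, preds = defaultdict(list), defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)
        indeg[v] += 1

    # Topological order (Kahn's algorithm).
    order, ready = [], [n for n in nodes if indeg[n] == 0]
    while ready:
        n = ready.pop()
        order.append(n)
        for s in succs[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # ASAP: start as soon as every predecessor has finished.
    asap = {}
    for n in order:
        asap[n] = max((asap[p] + latency[p] for p in preds[n]), default=0)

    # ALAP: start as late as possible without stretching the schedule length.
    length = max(asap[n] + latency[n] for n in nodes)
    alap = {}
    for n in reversed(order):
        alap[n] = min((alap[s] for s in succs[n]), default=length) - latency[n]
    return asap, alap


def min_edge_lifetime(edges, asap, alap, latency):
    """Assumed rule: cycles a value must wait when u is ALAP and v is ASAP."""
    return {(u, v): max(0, asap[v] - (alap[u] + latency[u])) for u, v in edges}


# Hypothetical four-node graph with unit latencies.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "d"), ("a", "d"), ("c", "d")]
latency = {n: 1 for n in nodes}
asap, alap = asap_alap(nodes, edges, latency)
lifetimes = min_edge_lifetime(edges, asap, alap, latency)
```

Here only the edge (a, d) needs storage in the best case, because a has no slack and d cannot start before b has finished.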
Refinement of Queue Sizes of Edges
Planning Clock Period during Register Binding
Self-heating Aware Optimal Wire Sizing
Application                RC Set 1                     RC Set 2
                                     Diff. (%)                    Diff. (%)
dctCol         85     103    152     47.6         36     44     22.2
dctRow         95     111    168     51.4         41     49     19.5
hpf_med_cc    157     404    347    -14.1         76     91     19.7
lpf_gc_rgb    221     697    508    -27.1        103    117     13.6
lpr            67     150    110    -26.7         58     67     15.5
open           30      71     47    -33.8         44     34    -22.7
Quant          10      13     13      0.0         13     13      0.0
RsvpLPR        67     150    110    -26.7         58     67     15.5
Average      92.7   212.4  181.9    -14.4       53.6   60.3     12.4
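In each table row above, the last figure of each resource-constraint group matches the relative difference between the two figures that precede it. The short check below reproduces that arithmetic for the RC Set 1 figures. The helper name is hypothetical, and reading the first figure of each pair as the reference value is an assumption.

```python
def percent_diff(reference, value):
    """Relative difference of `value` with respect to `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# (first figure, second figure, reported percentage) for RC Set 1.
rc_set_1 = {
    "dctCol": (103, 152, 47.6),
    "dctRow": (111, 168, 51.4),
    "hpf_med_cc": (404, 347, -14.1),
    "lpf_gc_rgb": (697, 508, -27.1),
    "lpr": (150, 110, -26.7),
    "open": (71, 47, -33.8),
    "Quant": (13, 13, 0.0),
    "RsvpLPR": (150, 110, -26.7),
}

# Largest gap between the recomputed and the reported percentage.
max_gap = max(abs(percent_diff(a, b) - p) for a, b, p in rc_set_1.values())
```

The recomputed values agree with the reported ones to within rounding to one decimal place.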
- Based on likelihood estimates of two facts
  - The source may produce data before its ALAP time
  - The sink may consume data after its ASAP time
- Probabilistic push-and-pull approach
  - The source node is estimated to be pulled up by ?i
  - The sink node is estimated to be pushed down by ?j
- Expected values of ?i and ?j are calculated based on
  - Contention for resources
  - Nodes more critical than the current node
  - Estimated iteration interval
- A probabilistic push-and-pull method is proposed to estimate data queuing cost for streaming accelerators
  - On average, estimates deviate by -14.4% (RC Set 1) and 12.4% (RC Set 2)
Queue Estimates of FUs using Resource Constraint Correction Factor
- Multiple nodes of the sDFG mapped to a single FU enable sharing of the output queues
  - With fewer resources, more nodes are mapped to each FU and queue sizes decrease
  - With more contention for resources, outputs of operations have to be stored longer before they are consumed
- Develop an integrated framework to estimate hardware cost for streaming accelerators implemented on reconfigurable fabric from unscheduled data flow graphs