Adaptive System on a Chip (aSoC) for Low-Power Signal Processing


1
Adaptive System on a Chip (aSoC) for Low-Power
Signal Processing
  • Andrew Laffely, Jian Liang, Prashant Jain, Ning
    Weng,
  • Wayne Burleson, Russell Tessier
  • Department of Electrical and Computer Engineering
  • University of Massachusetts, Amherst
  • {alaffely, jliang, pjain, nweng, burleson,
    tessier}@ecs.umass.edu

This material is based upon work supported by the
National Science Foundation under Grant No.
9988238. Any opinions, findings, and conclusions
or recommendations expressed in this material are
those of the author(s) and do not necessarily
reflect the views of the National Science
Foundation.
2
Overview
  • Motivation
  • Video Processing
  • Architecture
  • Dynamic Power Management
  • Core, Interconnect, and Clock

3
Problem
  • Wireless video processing requires:
  • High throughput
  • Low power
  • Flexibility

4
System on a Chip Solutions
  • Take advantage of parallelism
  • Possibly improved performance
  • Allow use and reuse of existing integrated
    components
  • If:
  • The application can be partitioned
  • The appropriate architecture is used

5
Proposed Architecture: aSoC
  • High throughput
  • Heterogeneous processor elements
  • Use the right tool for the job
  • Fast and predictable interconnect
  • Flexible
  • Runtime reconfiguration of cores and interconnect
  • Power consumption
  • Implement power saving features in both cores and
    interconnect
  • Use reconfiguration to dynamically control power
    consumption

6
aSoC: adaptive System on a Chip
  • Tiled SoC architecture

7
aSoC: adaptive System on a Chip
  • Tiled SoC architecture
  • Supports the use of independently developed
    heterogeneous cores
  • Pick and place cores which best perform the given
    application
  • Increase performance
  • Save power
  • Cores may be any number of tiles in size

8
aSoC: adaptive System on a Chip
  • Tiled SoC architecture
  • Supports the use of independently developed
    heterogeneous cores
  • Connected with an interconnect mesh
  • Restricted to near neighbor communications
  • Creates pipeline
  • Decreases cycle time

9
aSoC: adaptive System on a Chip
  • Tiled SoC architecture
  • Supports the use of independently developed
    heterogeneous cores
  • Connected with a fixed interconnect mesh
  • Using a communication interface (CI) to manage
    data
  • Network port (Coreport) for each core
  • Each CI uses a memory and FSM to repetitively
    process a predefined schedule of communications
  • Crossbar

10
Stream Control
  • Instruction memory
  • Holds the predetermined schedule of
    communications
  • PC
  • Selects and synchronizes the communications
  • Decoder
  • Sets crossbar
  • Controller
  • Sets PC
  • Interprets incoming configuration commands
  • Crossbar
  • Any input to any set of outputs

[Figure: communication interface block diagram. Crossbar inputs and
outputs: Core, North, South, East, West. Control blocks: Local Config.,
Decoder/Controller, PC, Instruction Memory.]
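
The stream control blocks listed on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `CommunicationInterface`, the dictionary encoding of a crossbar setting, and the example schedule are hypothetical and do not come from the slides. The sketch shows an instruction memory holding a fixed schedule, a PC that selects the current entry and wraps around, and a crossbar that connects any input to any set of outputs.

```python
from dataclasses import dataclass


@dataclass
class CommunicationInterface:
    # Instruction memory: one crossbar setting per schedule slot.
    # Each setting maps an input port to the set of output ports it drives.
    # (Hypothetical encoding, chosen for readability.)
    schedule: list
    pc: int = 0

    def step(self, inputs):
        """Route one cycle of data, then advance the PC with wrap-around."""
        setting = self.schedule[self.pc]        # fetch the current instruction
        outputs = {}
        for src, dests in setting.items():      # decode: configure the crossbar
            if src in inputs:
                for dest in dests:              # one input may fan out to many outputs
                    outputs[dest] = inputs[src]
        self.pc = (self.pc + 1) % len(self.schedule)
        return outputs


# A made-up three-slot schedule; the last slot is idle.
ci = CommunicationInterface(schedule=[
    {"core": {"east"}},
    {"west": {"east", "core"}},
    {},
])
out0 = ci.step({"core": 7})
out1 = ci.step({"west": 9})
out2 = ci.step({"west": 1})
```

After the third call the PC has wrapped to slot 0, so the same schedule repeats without any run-time arbitration.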
11
Example Communication
  • Stream A-D
  • A given application requires periodic
    communications from Core A to Core C
  • aSoC uses a prescheduled communication STREAM
  • Core A places the data in a dedicated STREAM
    between the two tiles
  • Core C pulls the data from that STREAM
  • The tile to tile communication uses 3 cycles

12
Example Stream
  • 1: Core to East
13
Example Stream
  • Stream A-D

  • 2: West to East
14
Example Stream
  • 3: West to Core
15
Example Stream
  • Stream A-D

  • 1: Core to East
  • Loop back
  • 2: West to East
  • 3: West to Core
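
The three steps of the example stream can be traced with a small simulation. This is a sketch under assumptions: it models a hypothetical row of three tiles A, B, C connected west to east, treats each numbered step as one cycle, and uses made-up names (`SCHEDULE`, `EAST_NEIGHBOR`, `send`). It is not the authors' simulator.

```python
# Hypothetical three-tile row: A -> B -> C, connected west to east.
SCHEDULE = [
    ("A", "core", "east"),   # step 1: Core to East
    ("B", "west", "east"),   # step 2: West to East
    ("C", "west", "core"),   # step 3: West to Core
]
EAST_NEIGHBOR = {"A": "B", "B": "C"}


def send(word):
    """Carry one word along the stream; return (cycles, final ports, trace)."""
    ports = {("A", "core"): word}    # Core A has placed the word in its coreport
    trace = []
    for cycle, (tile, src, dst) in enumerate(SCHEDULE, start=1):
        value = ports.pop((tile, src))
        if dst == "east":
            # An east output is the west input of the neighbouring tile.
            ports[(EAST_NEIGHBOR[tile], "west")] = value
        else:
            ports[(tile, dst)] = value
        trace.append((cycle, tile, f"{src} to {dst}"))
    return cycle, ports, trace


cycles, ports, trace = send(0x2A)
```

In this model the word reaches Core C's coreport after three steps, matching the 3-cycle tile-to-tile figure quoted on the Example Communication slide.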
16
Static Scheduled Communications
  • Creates system scalability by eliminating
    network congestion
  • Many interconnect segments managed with time
    division multiplexing
  • Lots of bandwidth
  • Improves SoC performance by up to a factor of 8
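
Because every communication is scheduled ahead of time, congestion can be ruled out before the chip runs by checking that no interconnect segment is claimed twice in the same time slot. The check below is a minimal sketch of that idea. The function name `find_conflicts`, the link labels, and the example streams are hypothetical, and the slides do not describe the actual scheduling tool.

```python
from collections import defaultdict


def find_conflicts(streams, schedule_length):
    """Return {(slot, link): [stream names]} for every doubly booked slot.

    `streams` maps a stream name to a list of (cycle, link) hops.
    Cycles are folded into the repeating schedule with a modulo.
    """
    usage = defaultdict(list)
    for name, hops in streams.items():
        for cycle, link in hops:
            usage[(cycle % schedule_length, link)].append(name)
    return {slot: names for slot, names in usage.items() if len(names) > 1}


# Two made-up streams sharing the same links in different time slots.
streams = {
    "A": [(0, "t0.east"), (1, "t1.east")],
    "B": [(1, "t0.east"), (2, "t1.east")],
}
ok = find_conflicts(streams, 4)

# A third stream that lands on slot 1 of link t1.east collides with stream A.
bad = find_conflicts({**streams, "C": [(5, "t1.east")]}, 4)
```

An empty result means the links are time-division multiplexed without contention, so no run-time arbitration is needed for these streams.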

17
Power Consumption?
  • Provide reconfiguration methods for cores and CI
  • Develop programmable clocking systems at each tile

18
Power Aware Core
  • Custom motion estimation core
  • Choose search method
  • Full search
  • 960-600 mW (bit width and pel sub-sampling)
  • Spiral search
  • 76 mW
  • Three step search
  • 25 mW
  • Data taken with Synopsys Power Compiler at the
    RTL level
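
One way to use these figures at run time is to pick a search method that fits the current power budget. The sketch below assumes a simple policy that is not stated in the slides: prefer the more exhaustive search when it fits, and fall back to cheaper ones otherwise. The table name, the function `choose_search`, and the use of the low end of the full-search range are all illustrative choices.

```python
# Power figures from the slide, in mW, ordered from most to least exhaustive.
# For full search the low end of the reported 960-600 mW range is used here;
# that choice is an assumption of this sketch.
SEARCH_POWER_MW = [
    ("full", 600.0),
    ("spiral", 76.0),
    ("three_step", 25.0),
]


def choose_search(budget_mw):
    """Return the first search mode whose power fits the budget, else None."""
    for mode, power in SEARCH_POWER_MW:
        if power <= budget_mw:
            return mode
    return None


plenty = choose_search(1000.0)
medium = choose_search(100.0)
tight = choose_search(30.0)
too_low = choose_search(10.0)
```

The selected mode would then be delivered to the core as a configuration word, which is the role the following slides give to the configuration stream.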

19
aSoC Support
  • Multiple streams in and out through dedicated
    coreports
  • Easy to manage on both sides of the port
  • Schedule configuration streams in with the data
  • Stream A: Input Frame
  • Stream B: Configuration (choose search mode and
    size)
  • Stream C: Motion Vectors

[Figure: Motion Estimation Core with coreports in1, in2, out1, and out2
carrying Stream A, Stream B, and Stream C.]
20
Reconfigurable Interconnect
  • P-frame
  • I-frame

[Figure: P-frame data path: Input Frame, ME, MC, Σ (subtract), DCT.
I-frame data path: Input Frame, DCT.]
21
aSoC Support
[Figure: Motion Estimation and Compensation core, DCT core.]
  • ME, MC, and summation are lumped into one double core

22
aSoC Support: P-Frame
[Figure: Motion Estimation and Compensation core and DCT core.
Streams: Input Frame (Stream A), Difference Frame (Stream B).]
23
aSoC Support: Schedule Change
[Figure: Motion Estimation and Compensation core and DCT core.
Streams: Input Frame (Stream A), Difference Frame (Stream B),
Configuration Streams (C, D).]
24
aSoC Support: Schedule Change
[Figure: Motion Estimation and Compensation core and DCT core.
Streams: Input Frame (Stream A), Difference Frame (Stream B),
Configuration (Stream C). The PC selects between Schedule 1 and
Schedule 2.]
25
aSoC Support: Schedule Change
[Figure: Motion Estimation and Compensation core and DCT core.
Streams: Input Frame (Stream A), Difference Frame (Stream B),
Configuration (Stream C). The PC selects between Schedule 1 and
Schedule 2.]
26
aSoC Support: Schedule Change
[Figure: Motion Estimation and Compensation core and DCT core.
Streams: Input Frame (Stream A), Configuration (Stream D). The PC
selects between Schedule 1 and Schedule 2.]
27
aSoC Support: Schedule Change
[Figure: Motion Estimation and Compensation core and DCT core.
Streams: Input Frame (Stream A), Configuration (Stream D). The PC
selects between Schedule 1 and Schedule 2.]
28
aSoC Support: I-Frame
[Figure: Motion Estimation and Compensation core is OFF. Input Frame
(Stream A) goes to the DCT core.]
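
The schedule change shown across these slides can be sketched as two schedules stored back to back in one instruction memory, with a configuration command moving the PC from one to the other. The class `TileController`, its method names, and the instruction strings are hypothetical; the slides show only that the PC selects between Schedule 1 and Schedule 2.

```python
class TileController:
    """Holds several schedules in one instruction memory and loops over one."""

    def __init__(self, schedules):
        self.memory = []     # flat instruction memory
        self.base = {}       # start address of each schedule
        self.length = {}     # number of instructions in each schedule
        for name, instrs in schedules.items():
            self.base[name] = len(self.memory)
            self.length[name] = len(instrs)
            self.memory.extend(instrs)
        self.active = next(iter(schedules))   # start on the first schedule
        self.pc = self.base[self.active]

    def configure(self, name):
        """Configuration command: jump the PC to another schedule."""
        self.active = name
        self.pc = self.base[name]

    def step(self):
        """Return the current instruction and advance within the active loop."""
        instr = self.memory[self.pc]
        start = self.base[self.active]
        self.pc = start + (self.pc - start + 1) % self.length[self.active]
        return instr


# Made-up instruction labels for a P-frame and an I-frame schedule.
ctrl = TileController({
    "p_frame": ["A: west to core", "B: core to east"],
    "i_frame": ["A: west to east"],
})
first = [ctrl.step() for _ in range(3)]
ctrl.configure("i_frame")
second = [ctrl.step() for _ in range(2)]
```

Switching schedules costs a single PC update in this model, since both schedules are already resident in the instruction memory.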
29
Operating Frequency?
  • Interconnect synchronized
  • H-tree clock distribution
  • Core frequencies depend on critical path
  • Tile provides clock reference
  • Coreport provides asynchronous boundary
  • Dynamic core configuration requires dynamic clock
    configuration
  • aSoC clock reference provides multiples of the
    interconnect clock (4x, 2x, 1x, 0.5x, 0.25x, ...)
  • Configured through the tile controller
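
Given a set of clock multiples, a tile controller could pick the slowest one that still lets a core finish before its deadline. The sketch below assumes that policy and uses only the five multiples named on the slide. The function name `slowest_clock` and its parameters are hypothetical.

```python
# Clock multiples listed on the slide, slowest first. The slide's list may
# continue beyond these values; only the named ones are used here.
MULTIPLES = (0.25, 0.5, 1.0, 2.0, 4.0)


def slowest_clock(core_cycles, deadline_interconnect_cycles):
    """Return the smallest multiple that finishes the work in time, else None.

    A core clocked at multiple m completes m core cycles per interconnect
    cycle, so it can do m * deadline cycles of work before the deadline.
    """
    for m in MULTIPLES:
        if core_cycles <= m * deadline_interconnect_cycles:
            return m
    return None


light = slowest_clock(300, 1000)
heavy = slowest_clock(3000, 1000)
impossible = slowest_clock(5000, 1000)
```

A result of None means no listed multiple meets the deadline, in which case the core configuration (for example the search mode) would have to change instead.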

30
Mixed vs. Fixed Core Frequencies
  • Cores not designed with clock gating
  • Core power from Synopsys RTL simulation
  • Interconnect from SPICE
  • Assumes 10 cycle schedule, 4 pixels/word

31
Current Density and Clocking
  • Red: fixed worst-case clocking
  • Short spikes of high current
  • Green: optimal independent clocking
  • Slow and low
  • Optimal clocking eliminates current spikes
    (improved battery life)

[Figure: current versus time, from process start to deadline, for ME Full
Search, ME Spiral Search, ME Three Step Search, and DCT.]
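
The contrast between short spikes and "slow and low" operation can be illustrated with a first-order model. The model is an assumption of this sketch, not a result from the slides: at a fixed supply voltage, dynamic current is taken to scale linearly with the clock multiple. The function name and the unit current are hypothetical.

```python
def current_profile(work_cycles, clock_multiple, deadline, amps_at_1x=1.0):
    """Return (busy time, peak current, total charge, meets deadline).

    Times are in interconnect cycles. Current is assumed proportional to
    the clock multiple, which ignores leakage and any voltage scaling.
    """
    busy = work_cycles / clock_multiple       # time spent computing
    peak = amps_at_1x * clock_multiple        # current drawn while busy
    charge = peak * busy                      # area under the current curve
    return busy, peak, charge, busy <= deadline


# Same work and deadline, clocked at worst-case 4x versus a matched 1x.
fixed = current_profile(1000, 4.0, deadline=1000)
matched = current_profile(1000, 1.0, deadline=1000)
```

Under this model both runs draw the same total charge, but the matched clock draws a quarter of the peak current and spreads it over the whole interval up to the deadline.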
32
Configuration Overhead
  • Configuration adds up to 2 streams per tile
  • Only 2 required for data
  • Total BW = 5 x T x N
  • 5 streams per cycle per tile
  • T tiles
  • N cycles in schedule
  • A single tile can support up to 50 different
    streams in a 10-cycle schedule

[Figure: DCT core with Input Frame (Stream B), Transform Frame (Stream D),
and Configuration Streams.]
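
The bandwidth formula on this slide is easy to check numerically. The helper below is a small sketch; the function name and the 3 x 3 array in the second example are hypothetical, while the constant 5 and the single-tile case come from the slide.

```python
STREAMS_PER_CYCLE_PER_TILE = 5   # from the slide: 5 streams/(cycle, tile)


def total_stream_slots(tiles, schedule_cycles):
    """Total BW = 5 x T x N, counted in stream slots per schedule."""
    return STREAMS_PER_CYCLE_PER_TILE * tiles * schedule_cycles


single_tile = total_stream_slots(tiles=1, schedule_cycles=10)
three_by_three = total_stream_slots(tiles=9, schedule_cycles=10)
```

The single-tile case reproduces the 50 streams quoted on the slide for a 10-cycle schedule.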
33
Configuration Power Overhead
  • Configuration streams used infrequently
  • Once/Macro block or Once/Frame
  • Architecture disables unused streams
  • Data valid bit already used for flow control
  • Only 4-9% of interconnect power is due to
    configuration streams

34
Conclusion
  • aSoC supports dynamic power management with
    Reconfiguration
  • Cores
  • Interconnect
  • Clocks
  • Low configuration overhead in both
  • Communication Bandwidth
  • Power

35
Future Work
  • Add reconfigurable voltage supplies at each tile
  • Finish test chip
  • Import larger applications

36
Questions
37
aSoC: adaptive System on a Chip
[Figure: aSoC layout labeled with Tile, Motion Estimation and Compensation
Cores, Interconnect, and Interface.]
38
Example Stream
  • Stream A-D

39
Partitioning
  • Automated partitioning is a nontrivial problem
  • For small signal processing systems, user-defined
    partitioning may be possible
  • Key: perfectly partitioning the system may not be
    possible
  • How can the SoC mitigate the penalty?