Blackfin Compute Unit - PowerPoint PPT Presentation

About This Presentation
Title:

Blackfin Compute Unit

Description:

A comparison of DSP architectures: Blackfin ADSP-BFXXX compute unit. Based on an ENEL619.23 white paper prepared by Darrell Anklovitch.

Slides: 21
Provided by: DarrellAn2

Transcript and Presenter's Notes



1
A comparison of DSP Architectures BlackFin
ADSP-BFXXX Compute Unit
  • Based on an ENEL619.23 white paper prepared by
    Darrell Anklovitch

2
Overview
  • Architecture Overview
  • Register Map
  • ALU features and sample instructions
  • Multiplier features and sample instructions
  • Shifter features and sample instructions

3
References
  • ADSP-BF535 Blackfin Processor Hardware Reference,
    Rev 2, April 2004, Analog Devices. Section 2
  • Blackfin Processor Instruction Set Reference, Rev
    2, May 2003, Analog Devices. Sections 8 10,
    14 15
  • A number of the figures in this presentation are
    based on figures found in the ADSP-BF535 Blackfin
    Processor Hardware Reference.

4
ADSP-2106x Core Architecture
5
Register File and COMPUTE Units
  • Key issues
  • 5 data paths FROM COMPUTE units
  • 5 data paths TO COMPUTE units
  • Highly parallel operations UNDER THE RIGHT
    CONDITIONS

6
BF533 Memory Accesses
Under the right conditions, 4 memory accesses at the same time
  • 64 bit instruction fetch
  • 2 x 32 bit data loads
  • 32 bit data store
PLUS up to 2 ALU (32 bit) and 2 MAC (16 bit) operations at the same time
PLUS background DMA activity
7
Compute Unit Architecture
2 Multipliers
Register File
1 set of Video ALUs
1 Shifter
2 ALUs
8
Register File
  • DATA REGISTER SYNTAX
  • R0, R1 etc refer to 32 bit registers
  • R0.L refers to the low 16 bits of the R0 32 bit
    reg
  • R0.H refers to the high 16 bits of the R0
    register
  • ACCUMULATOR SYNTAX
  • A0.L -> low 16 bits
  • A0.H -> next 16 bits
  • A0.W -> least significant 32 bit word
  • A0.X -> MS 8 bit extension

SHARC: 16 32-bit data registers, integer and
float. There is a pair of SHARC accumulator
registers too.
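The register and accumulator syntax above can be illustrated with a small Python model. This is only a sketch of the bit fields named on the slide; the function names are hypothetical and not part of any Blackfin tool.

```python
def reg_low(r):
    """R.L: low 16 bits of a 32-bit data register."""
    return r & 0xFFFF


def reg_high(r):
    """R.H: high 16 bits of a 32-bit data register."""
    return (r >> 16) & 0xFFFF


def acc_fields(a):
    """Split a 40-bit accumulator value into the parts named on the slide."""
    a &= (1 << 40) - 1          # keep only 40 bits
    return {
        "L": a & 0xFFFF,        # A0.L: low 16 bits
        "H": (a >> 16) & 0xFFFF,  # A0.H: next 16 bits
        "W": a & 0xFFFFFFFF,    # A0.W: least significant 32-bit word
        "X": (a >> 32) & 0xFF,  # A0.X: most significant 8-bit extension
    }
```

For example, `acc_fields(0xAB12345678)` gives X = 0xAB, W = 0x12345678, H = 0x1234 and L = 0x5678.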
9
ALU Data Flow
2 x 32 bit paths to dual Multiplier/ALU units
2 x 32 bit paths back to register file
10
Sample instructions
Blackfin
  R0 = R1 + R2
  R0.L = R1.L + R2.H
  R0 = R1 +|- R2
    Means R0.L = R1.L - R2.L in parallel with R0.H = R1.H + R2.H
SHARC
  R0 = R1 + R2
  Closest: R0 = R1 + R2, R4 = R1 - R2
68K
  MOVE.L R2, R0
  ADD.L R1, R0

  MOVE.W R2, R0
  ADD.W R1, R0

  MOVE.L R2, R0
  ASR.L #16, R0
  MOVE.L R1, R3
  ASR.L #16, R3
  ADD.W R3, R0
  ASL.L #16, R0
  MOVE.W R2, R0
  ADD.W R1, R0
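As a rough illustration of the dual 16 bit form, the Python sketch below models a vector add/subtract in which the high halves are added and the low halves are subtracted (written `+|-` in Blackfin assembly). It assumes results wrap on overflow; the hardware also offers a saturating option that is not modelled here. The function name is hypothetical.

```python
def add_sub_16(r1, r2):
    """Model R0 = R1 +|- R2 with wrap-around (no saturation).

    High halves: R0.H = R1.H + R2.H
    Low halves:  R0.L = R1.L - R2.L
    """
    hi = ((r1 >> 16) + (r2 >> 16)) & 0xFFFF
    lo = ((r1 & 0xFFFF) - (r2 & 0xFFFF)) & 0xFFFF
    return (hi << 16) | lo
```

With R1 = 0x00050009 and R2 = 0x00030004 the result is 0x00080005: one instruction does the work of the longer 68K sequence.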
11
ALU Features
[Figure: 32 bit register layouts (bit 31 at left) of Rm, Rp and Rn for single 16 bit ops, dual 16 bit ops with cross option, and single 32 bit ops]
12
ALU Sample Instructions
Single 16 bit ops
Dual 16 bit ops
Single 32 bit ops
Does not work in parallel
Must have this option
Operator order is important: + must come before -
  • A and B registers must stay on the same side of the
    operator for both instructions
  • For dual and quad 16 bit operations the (CO)
    option causes the destination registers to cross
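A minimal sketch of what "cross" means for a packed result, assuming the (CO) option simply swaps the two 16 bit halves of the destination word. The helper name is hypothetical.

```python
def cross_halves(result):
    """Swap the high and low 16-bit halves of a 32-bit result word."""
    return ((result & 0xFFFF) << 16) | ((result >> 16) & 0xFFFF)
```

So a dual 16 bit result of 0x00080005 would be written to the destination as 0x00050008 when the halves are crossed.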

13
Multiply Data Flow
2 x 32 bit paths to dual Multiplier/ALU units
Multipliers share the same operand/result buses as
the ALUs
2 x 40 bit accumulator
2 x 32 bit paths back to register file
14
Multiply Features
  • Multiplies are signed fractional by default
  • Signed fractional multiply result is
    automatically left shifted 1 bit
  • Signed fractional multiply != signed integer
    multiply
  • Rounding available on fractional number
    multiplies and as a special option on integer
    number multiplies
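The difference between the two multiply modes can be sketched in Python. This is an illustrative model only: it treats the 16 bit operands as signed values, applies the 1 bit left shift for the fractional case, and does not handle the -1 x -1 corner case (0x8000 * 0x8000), where real hardware would have to saturate. Function names are hypothetical.

```python
def s16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mult_int(a, b):
    """Signed integer multiply: plain 16 x 16 -> 32 bit product."""
    return s16(a) * s16(b)


def mult_frac(a, b):
    """Signed fractional (1.15) multiply: product shifted left 1 bit to give 1.31."""
    return (s16(a) * s16(b)) << 1
```

With both operands 0x4000 (0.5 as a 1.15 fraction), `mult_int` gives 0x10000000 while `mult_frac` gives 0x20000000, which is 0.25 in 1.31 format.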

15
Rounding
2 cases
Rounding adds 0x8000 to the 32 bit multiplier
result or accumulator value before extracting a
16 bit value to the destination register
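The rounding step described above can be sketched as follows. The sketch follows the slide's wording (add 0x8000, then take the upper 16 bits) and ignores overflow of the 32 bit value, which real hardware would need to saturate. The function name is hypothetical.

```python
def round_extract_high(value32):
    """Add 0x8000 to a 32-bit value, then keep the upper 16 bits."""
    return ((value32 + 0x8000) >> 16) & 0xFFFF
```

For example, 0x12348000 rounds up to 0x1235, while 0x12347FFF stays at 0x1234.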
16
Fractional Multiply
Fractional Multiply != Integer Multiply
  • When extracting a 16 bit fractional value from an
    accumulator the high 16 bits are taken
  • Where in the destination register it goes depends
    on which accumulator is being extracted from

17
Integer Multiply
Fractional Multiply != Integer Multiply
  • When extracting a 16 bit integer value from an
    accumulator the low 16 bits are taken
  • Where in the destination register the 16 bit
    value goes depends on which accumulator is being
    extracted from
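The extraction rules for fractional and integer results can be modelled together. The sketch assumes that accumulator 0 writes the low half of the destination register and accumulator 1 writes the high half, and it leaves out rounding and saturation. The function and parameter names are hypothetical.

```python
def extract16(acc, dest, which_acc, fractional=True):
    """Write a 16-bit value from a 40-bit accumulator into one half of dest.

    fractional=True takes accumulator bits 31..16 (the high 16 bits),
    fractional=False takes bits 15..0 (the low 16 bits).
    Assumption: A0 fills dest.L and A1 fills dest.H.
    """
    value = (acc >> 16) & 0xFFFF if fractional else acc & 0xFFFF
    if which_acc == 0:
        return (dest & 0xFFFF0000) | value      # keep dest.H, replace dest.L
    return (dest & 0x0000FFFF) | (value << 16)  # keep dest.L, replace dest.H
```

With the accumulator holding 0x0012345678 and the destination holding 0xAAAABBBB, a fractional extraction from A0 gives 0xAAAA1234 and an integer extraction from A1 gives 0x5678BBBB.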

18
Multiply Sample Instructions
16 bit extraction from ACC 0
16 bit extraction from ACC 1
Multi-issue MAC Instruction Examples
32 bit extraction
  • A1 = R1.H * R2.L, A0 = R1.L * R2.L
  • R3.H = (A1 = R1.H * R2.L), R3.L = (A0 = R1.L * R2.L)
    Any combination of .H and .L in the 2 operands is
    allowed
  • R3 = (A1 = R1.H * R2.L), R2 = (A0 = R1.L * R2.L)
    Where destination registers must be paired as
    follows: R1:0, R3:2, R5:4 and R7:6
  • R3.H = (A1 = R1.H * R2.L), A0 = R1.L * R2.L
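A dual multiply with 16 bit extraction can be sketched in Python. The extraction stripped the operators from this slide, so the sketch assumes the instruction reads `R3.H = (A1 = R1.H * R2.L), R3.L = (A0 = R1.L * R2.L)` in the default signed fractional mode. It truncates rather than rounds and does not saturate, purely to keep the example short. Function names are hypothetical.

```python
def s16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def dual_mult_extract(r1, r2):
    """Return (a1, a0, r3) for the assumed dual fractional multiply."""
    a1 = (s16(r1 >> 16) * s16(r2)) << 1  # A1 = R1.H * R2.L
    a0 = (s16(r1) * s16(r2)) << 1        # A0 = R1.L * R2.L
    r3_h = (a1 >> 16) & 0xFFFF           # high 16 bits of A1 -> R3.H
    r3_l = (a0 >> 16) & 0xFFFF           # high 16 bits of A0 -> R3.L
    return a1, a0, (r3_h << 16) | r3_l
```

With R1 = 0x40002000 (0.5 and 0.25) and R2.L = 0x4000 (0.5), the packed result is R3 = 0x20001000, that is 0.25 in the high half and 0.125 in the low half.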
19
Shifter Sample Instructions
20
Parallel Instruction Examples
  • In general there are 16 and 32 bit versions of
    the arithmetic instructions
  • Most of the 32 bit instructions can be executed
    in parallel with 2 x 16 bit memory/index
    operations
  • Exceptions are DIVS, DIVQ and MULTIPLY with 32
    bit operands
  • || means parallel
  • Examples
  • A1 = R2.L * R1.L, A0 = R2.H * R1.H || R2.H = W[I2++]
    || [I3++] = R3
  • R2 = R2 +|+ R4, R4 = R2 -|- R4 || I0 += M0 || R1 = [I0]