Understanding the TigerSHARC ALU pipeline

Slides: 33
Provided by: MichaelR223
Transcript and Presenter's Notes

1
Understanding the TigerSHARC ALU pipeline
  • Determining the speed of one stage of IIR filter
    Part 2: Understanding the pipeline

2
Understanding the TigerSHARC ALU pipeline
  • TigerSHARC has many pipelines
  • If these pipelines stall then the processor
    speed goes down
  • Need to understand how the ALU pipeline works
  • Learn to use the pipeline viewer
  • Understanding what the pipeline viewer tells us in
    detail
  • Avoiding having to use the pipeline viewer
  • Improving code efficiency
  • Excel and Project (Gantt charts) are useful tools

3
Register File and COMPUTE Units
4
Simple Example: IIR -- Biquad
State variables: S0, S1, S2
  • For (Stages = 0 to 3) Do
  • S0 = Xin * H5 + S2 * H3 + S1 * H4
  • Yout = S0 * H0 + S1 * H1 + S2 * H2
  • S2 = S1
  • S1 = S0
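The biquad loop above can be sketched in Python as a reference model for checking the assembly version. This is a minimal sketch, not the presenter's code: it assumes the operators lost from the slide are multiplies and adds, that each stage's output feeds the next stage's input, and that each stage keeps its own S1, S2 state. The function name and argument layout are hypothetical.

```python
def biquad_cascade(x_in, coeffs, states):
    """Run one input sample through a cascade of biquad stages.

    coeffs[i] = (H0, H1, H2, H3, H4, H5) for stage i
    states[i] = [S1, S2] delay-line values for stage i (updated in place)
    """
    x = x_in
    for h, s in zip(coeffs, states):
        h0, h1, h2, h3, h4, h5 = h
        s1, s2 = s
        s0 = x * h5 + s2 * h3 + s1 * h4   # S0 = Xin*H5 + S2*H3 + S1*H4 (assumed operators)
        y = s0 * h0 + s1 * h1 + s2 * h2   # Yout = S0*H0 + S1*H1 + S2*H2 (assumed operators)
        s[1] = s1                         # S2 = S1
        s[0] = s0                         # S1 = S0
        x = y                             # assumption: stage output is the next stage's input
    return x


# Four pass-through stages (H0 = H5 = 1, all other coefficients 0)
coeffs = [(1.0, 0.0, 0.0, 0.0, 0.0, 1.0)] * 4
states = [[0.0, 0.0] for _ in range(4)]
print(biquad_cascade(2.0, coeffs, states))
```

A model like this gives known-good outputs to compare against when the assembly code is later re-ordered for speed.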

5
Code returns a float when using the XR8 register. NOTE:
NOT XFR8
6
Step 2: Using C code as comments, set up the
coefficients
XFR0 = 0.0 does not exist. XR0 = 0.0 DOES
EXIST. Bit patterns require integer registers.
Leave what you wanted to do behind as comments.
7
Expect to take 8 cycles to execute
8
PIPELINE STAGES: See page 8-34 of the processor manual
  • 10 pipeline stages, but may be completely
    desynchronized (happen semi-independently)
  • Instruction fetch: F1, F2, F3 and F4
  • Integer ALU: PreDecode, Decode, Integer, Access
  • Compute Block: EX1 and EX2

9
Pipeline Viewer Result
XR0 = 1.0 enters PD stage at cycle 39825, enters
EX2 stage at cycle 39830, and is
stored into XR0 at cycle 39831 -- 7 cycles
execution time
10
Pipeline Viewer Result
XR6 = 5.5 enters PD stage at cycle 39832,
enters EX2 stage at cycle 39837, and
is stored into XR6 at cycle 39838
-- 7 cycles execution time
Each instruction takes 7 cycles, but there is one new result each
cycle. Result: once the pipeline is filled, 8 cycles = 8
register transfer operations
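The difference between latency (7 cycles per instruction) and throughput (one result per cycle) can be checked with a few lines of arithmetic. This is a minimal sketch that assumes one instruction enters the pipeline per cycle and nothing stalls. Cycle numbers are relative rather than taken from the pipeline viewer, and the function name is hypothetical.

```python
LATENCY = 7  # cycles from entering the pipeline to the register write, as read off the viewer


def completion_cycles(n_instructions, latency=LATENCY):
    """Cycle (counting from 1) in which each back-to-back instruction completes,
    assuming one instruction enters per cycle and none stall."""
    return [start + latency - 1 for start in range(1, n_instructions + 1)]


done = completion_cycles(8)
print(done)                    # [7, 8, 9, 10, 11, 12, 13, 14]
print(done[-1] - done[0] + 1)  # 8: once the pipeline is full, 8 results take 8 cycles
```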
11
Doing filter operations generates different
results
XR8 = XR6 enters PD at 39833, enters EX2 at 39838, stored 39839: 7 cycles
XFR23 = R9 * R4 enters PD at 39834, enters EX2 at 39839, stored 39840: 7 cycles
XFR0 = R0 + R23 enters PD at 39835, enters EX2 at 39841, stored 39842: 8 cycles. WHY?
FIND OUT WITH MOUSE CLICK ON S MARKER THEN CONTROL
12
Instruction 0x17e, XFR8 = R8 + R23, is STALLED
(waiting) for 0x17d, XFR23 = R8 * R4, to complete.
Bubble B means that the pipeline is doing
nothing, meaning that the instruction shown is a
placeholder (garbage)
13
Information on Window Event Icons
14
Result of Analysis
  • Can't use a float result immediately after
    calculation
  • Writing
      XFR23 = R8 * R4
      XFR8 = R8 + R23   // MUST WAIT FOR XFR23 calculation to be completed
    is the same as coding
      XFR23 = R8 * R4
      NOP               // Note: DOUBLE -- extra cycle because of stall
      XFR8 = R8 + R23
  • Proof: write the code with the stalls shown in
    it
  • Writing this way means we don't have to use the
    pipeline viewer all the time
  • Pipeline viewer is only available with (slow)
    simulator
  • #define SHOW_ALU_STALL nop
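The claim that a stalled dependent instruction costs the same as an explicit NOP can be checked with a small cycle-counting model. This is a sketch under stated assumptions, not a model of the real TigerSHARC pipeline: instructions issue in order, one per cycle, and a float compute result has one delay slot (`result_delay = 1`) while a simple register load has none. The tuple format and function name are hypothetical.

```python
def count_cycles(program):
    """program: list of (dest, sources, result_delay) tuples in issue order.

    result_delay is the number of extra cycles a consumer must wait if it
    immediately follows the producer (assumed: 1 for a float compute op,
    0 for a simple register load). Returns total issue cycles including stalls.
    """
    ready = {}  # register -> first cycle in which its new value can be used
    cycle = 0
    for dest, sources, result_delay in program:
        cycle += 1  # normal issue slot
        earliest = max((ready.get(r, 0) for r in sources), default=0)
        if earliest > cycle:
            cycle = earliest  # stall until all operands are ready
        ready[dest] = cycle + 1 + result_delay
    return cycle


# Dependent pair: the second instruction needs R23 from the first
stalled = [("R23", ["R8", "R4"], 1), ("R8", ["R8", "R23"], 1)]
# Same pair with the stall written out as an explicit nop
with_nop = [("R23", ["R8", "R4"], 1), ("nop", [], 0), ("R8", ["R8", "R23"], 1)]

print(count_cycles(stalled), count_cycles(with_nop))  # 3 3
```

Both versions take 3 cycles in this model, which is why writing the stall into the source as a visible nop does not change the timing.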

15
Code with stalls shown
  • 8 code lines
  • 5 expected stalls
  • Expect 13 cycles to complete if theory is correct

16
Analysis approach IS correct
17
Process for coding for improved speed: code
re-organization
  • Make a copy of the code so can test iirASM( ) and
    iirASM_Optimized( ) to make sure get correct
    result
  • Make a table of code showing ALU resource usage
    (paper, EXCEL, Project (Gantt chart) )
  • Identify data dependencies
  • Make all temp operations use different register
  • Move instructions forward to fill delay slots,
    BUT don't break data dependencies
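The "identify data dependencies" step can be sketched as a scan for read-after-write pairs. This is an illustration only: the (dest, sources) tuple format, the function name, and the register choices in the example are assumptions, not the presenter's code.

```python
def raw_dependencies(program):
    """program: list of (dest, sources) tuples in program order.

    Returns (producer_index, consumer_index, register) triples for every
    read-after-write dependency: the consumer reads a register that the
    producer was the most recent instruction to write.
    """
    last_writer = {}  # register -> index of the instruction that last wrote it
    deps = []
    for i, (dest, sources) in enumerate(program):
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[dest] = i
    return deps


# Hypothetical multiply-accumulate fragment
program = [
    ("R23", ["R8", "R4"]),   # 0: product
    ("R8",  ["R8", "R23"]),  # 1: accumulate, needs R23 from 0
    ("R24", ["R9", "R5"]),   # 2: independent product
    ("R8",  ["R8", "R24"]),  # 3: accumulate, needs R8 from 1 and R24 from 2
]
print(raw_dependencies(program))
# [(0, 1, 'R23'), (1, 3, 'R8'), (2, 3, 'R24')]
```

Instruction 2 has no producer in the list, so it is a candidate to move forward into a delay slot. Giving each temporary its own register keeps reuse of a name from creating extra ordering constraints that this scan does not report.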

18
Copy and paste to make IIRASM_Optimized( )
19
Need to re-order instructions to fill delay slots
with useful instructions
  • After refactoring code to fill delay slots, must
    run tests to ensure that still have the correct
    result
  • Change and check
  • NOT EASY
  • MUST HAVE A PLAN
  • I USE EXCEL

20
Show resource usage and data dependencies
21
Change all temporary registers to use different
register names.
Then check code produces correct
answer
22
Move instructions forward, without breaking data
dependencies
What appears possible! DO one thing at a time
and then check that code still works
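Moving instructions forward without breaking data dependencies can be sketched as a greedy scheduler. This is a toy model, not the Excel or Project workflow used in the slides. It assumes one instruction issues per cycle, a compute result has one delay slot, and temporaries have already been given distinct registers so that only read-after-write dependencies matter. All names are hypothetical.

```python
def schedule(program, reorder=True):
    """program: list of (dest, sources, result_delay) tuples in program order.

    Each cycle, issue the earliest pending instruction whose producers have
    usable results; if none qualifies, the cycle is a stall. With
    reorder=False only the next instruction in program order is considered.
    Returns (issue_order, total_cycles).
    """
    # For every source, record which earlier instruction produces it (None = input)
    last_writer, producers = {}, []
    for i, (dest, sources, _) in enumerate(program):
        producers.append([last_writer.get(r) for r in sources])
        last_writer[dest] = i

    usable = {}  # instruction index -> first cycle its result can be used
    pending = list(range(len(program)))
    order, cycle = [], 0
    while pending:
        cycle += 1
        candidates = pending if reorder else pending[:1]
        for i in candidates:
            if all(p is None or usable.get(p, float("inf")) <= cycle
                   for p in producers[i]):
                usable[i] = cycle + 1 + program[i][2]
                order.append(i)
                pending.remove(i)
                break  # one instruction per cycle
    return order, cycle


program = [
    ("t1", ["a", "b"], 1),    # 0: t1 = a * b
    ("s1", ["x", "t1"], 1),   # 1: s1 = x + t1
    ("t2", ["c", "d"], 1),    # 2: t2 = c * d
    ("s2", ["s1", "t2"], 1),  # 3: s2 = s1 + t2
]
print(schedule(program, reorder=False))  # ([0, 1, 2, 3], 6)
print(schedule(program))                 # ([0, 2, 1, 3], 5)
```

In this model, moving the second multiply up into the delay slot of the first saves one cycle, and one stall remains because the final add still waits on s1. The re-ordered code should still be run against the tests after each move.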
23
Check that code still operates: 1 cycle saved
24
Move next multiplication up. NOTE: certain stalls
remain, although the reason for the STALL changes
25
Move up the R10 and R9 assignment operations --
check
4 cycle improvement?
26
CHECK THE PIPELINE AFTER TESTING
27
Are there still more improvements possible? (I can
see 4 more moves)
28
Problems with approach
  • Identifying all the data dependencies
  • Keep track of how the data dependencies change as
    you move the code around
  • Handling all of this automatically
  • I started the following design tool as something
    that might work, but it actually turned out very
    useful. M. R. Smith and J. Miller,
    "Microprocessor Scheduling -- the irony of using
    Microsoft Project", "Don't say CAN'T do it - Say
    Gantt it! The irony of organizing
    microprocessors with a big business tool",
    Circuit Cellar magazine, Vol. 184, pp 26 - 35,
    November 2005.

29
Using Microsoft Project Step 1
30
Add dependencies and resource usage then
activate level
31
Microsoft Project as a microprocessor design tool
  • Will look at this in more detail when we start
    using memory operations to fill the coefficient
    and state arrays

32
Understanding the TigerSHARC ALU pipeline
  • TigerSHARC has many pipelines
  • If these pipelines stall then the processor
    speed goes down
  • Need to understand how the ALU pipeline works
  • Learn to use the pipeline viewer
  • Understanding what the pipeline viewer tells us in
    detail
  • Avoiding having to use the pipeline viewer
  • Improving code efficiency
  • Excel and Project (Gantt charts) are useful tools