A Configurable Simulator for OOO Speculative Execution

1
A Configurable Simulator for OOO Speculative Execution
  • Design and Implementation

By Mustafa Imran Ali, ID 230203
2
Architecture Modeled
  • Fetch logic
      • Trace-driven execution; branch outcomes are
        explicitly specified in the trace
  • Issue logic
      • Issue width configurable
  • Functional units and reservation stations (RS)
      • RS count configurable
      • Execution units modeled after the MIPS R4000 pipeline
        (Hennessy & Patterson, Computer Architecture, 3rd Ed.)
      • No. of pipeline stages configurable
  • Common Data Buses (CDBs)
      • No. of CDBs configurable
  • ROB and commit logic
      • ROB size and commit capacity configurable
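
A pipelined execution unit with a configurable number of stages can be pictured as a shift register of instruction tags. The sketch below is only an illustration under that assumption; the names PipelinedFu, fu_init and fu_step are hypothetical and do not come from the simulator source.

```c
#define MAX_STAGES 8   /* hypothetical upper bound on pipeline depth */

typedef struct {
    int stages;             /* configured depth, 1..MAX_STAGES */
    int slot[MAX_STAGES];   /* instruction tag held in each stage, -1 if empty */
} PipelinedFu;

/* Empty the pipeline and set its depth (clamped to a legal range). */
void fu_init(PipelinedFu *fu, int stages)
{
    int i;
    if (stages < 1) stages = 1;
    if (stages > MAX_STAGES) stages = MAX_STAGES;
    fu->stages = stages;
    for (i = 0; i < MAX_STAGES; i++)
        fu->slot[i] = -1;
}

/* Advance one cycle: return the tag leaving the last stage (or -1),
   shift every stage forward, and accept new_tag (or -1) into stage 0. */
int fu_step(PipelinedFu *fu, int new_tag)
{
    int i;
    int out = fu->slot[fu->stages - 1];
    for (i = fu->stages - 1; i > 0; i--)
        fu->slot[i] = fu->slot[i - 1];
    fu->slot[0] = new_tag;
    return out;
}
```

With this model a tag accepted by one call to fu_step is returned by the call made `stages` cycles later, and a new instruction can enter every cycle.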

3
Simulation Methodology
  • A program trace file written in comma-separated
    values (CSV) format
  • A configuration file to specify values of
    configurable parameters
  • The trace file and the configuration file are the
    inputs to the simulator
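
The slides do not show the layout of a trace record, so the following is a sketch under an assumed layout of "OPCODE,DEST,SRC1,SRC2" per line (for example "ADD,0,1,2"), with a bare "EXIT" line ending the trace. TraceEntry and parse_trace_line are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical trace record; the field layout is an assumption. */
typedef struct {
    char op[8];   /* mnemonic, e.g. "ADD" */
    int  dest;    /* destination register number, -1 if absent */
    int  src1;    /* first source register number, -1 if absent */
    int  src2;    /* second source register number, -1 if absent */
} TraceEntry;

/* Parse one line of the assumed form "ADD,0,1,2" or "EXIT".
   Returns 1 on success and 0 if the line does not match. */
int parse_trace_line(const char *line, TraceEntry *out)
{
    int n;
    out->dest = out->src1 = out->src2 = -1;
    n = sscanf(line, " %7[^,\n],%d,%d,%d",
               out->op, &out->dest, &out->src1, &out->src2);
    if (n == 4)
        return 1;
    return n == 1 && strcmp(out->op, "EXIT") == 0;
}
```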

4
Architectural Assumptions
  • Only load misses are modeled; stores commit
    in a single cycle
  • Stores use a direct bus to transfer the
    calculated effective address (EA) into the ROB
  • Branch outcomes are written to the ROB over the CDB
  • A branch mispredict is handled when the branch
    instruction reaches the head of the ROB
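
Handling a mispredict only at the head of the ROB might look like the sketch below. It assumes a circular-buffer ROB and assumes that the mispredicted branch itself commits while every younger entry is squashed; the slides do not spell out either detail, and Rob, RobEntry and this commit signature are hypothetical.

```c
#define ROB_SIZE 8   /* stands in for the configurable ROB SIZE */

typedef struct {
    int ready;         /* result (or branch outcome) has been written */
    int is_branch;
    int mispredicted;  /* meaningful only for branches */
} RobEntry;

typedef struct {
    RobEntry e[ROB_SIZE];
    int head;    /* index of the oldest entry */
    int count;   /* number of valid entries */
} Rob;

/* Commit up to commit_size ready entries from the head, in order.
   A mispredicted branch is acted on only once it is at the head:
   it commits and all younger (speculative) entries are squashed.
   Returns the number of instructions committed this cycle. */
int commit(Rob *rob, int commit_size)
{
    int done = 0;
    while (done < commit_size && rob->count > 0) {
        RobEntry *h = &rob->e[rob->head];
        if (!h->ready)
            break;                      /* in-order commit: stop at first unready entry */
        rob->head = (rob->head + 1) % ROB_SIZE;
        rob->count--;
        done++;
        if (h->is_branch && h->mispredicted) {
            rob->count = 0;             /* assumed recovery: discard younger entries */
            break;
        }
    }
    return done;
}
```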

5
Architectural Assumptions (cont.)
  • Dynamic memory disambiguation is implemented
    using a store EA cache
      • A load is allowed to proceed only if there are no
        pending stores with the same effective address
  • Reservation stations issue the first ready
    instruction detected
      • Not necessarily the oldest instruction
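
Both rules on this slide reduce to a linear scan. The sketch below assumes the store EA cache is a small array of valid/address pairs and that a reservation-station entry is ready when both source operands are available; the type and function names are hypothetical.

```c
typedef struct {
    int valid;          /* entry holds a pending (uncommitted) store */
    unsigned long ea;   /* effective address of that store */
} StoreEaEntry;

/* A load may proceed only if no pending store has the same EA. */
int load_may_proceed(const StoreEaEntry cache[], int n, unsigned long load_ea)
{
    int i;
    for (i = 0; i < n; i++)
        if (cache[i].valid && cache[i].ea == load_ea)
            return 0;
    return 1;
}

typedef struct {
    int busy;         /* entry holds an instruction */
    int src1_ready;
    int src2_ready;
} RsEntry;

/* Return the index of the first ready entry found by a linear scan,
   or -1 if none is ready. Scan position, not age, decides the winner. */
int select_first_ready(const RsEntry rs[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (rs[i].busy && rs[i].src1_ready && rs[i].src2_ready)
            return i;
    return -1;
}
```

Selecting by scan position is cheaper than tracking age, at the cost that an older ready instruction in a later slot can be passed over.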

6
Architectural Assumptions (cont.)
  • Access to the available CDBs is arbitrated
  • When requests for a CDB arrive, the following
    priority order is used to grant them
      • Branch FU
      • Div FU
      • LD/ST
      • MULT FU
      • FPADD FU
      • INT ALU FU
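
A fixed-priority arbiter over this list could be sketched as follows. It assumes the list is ordered highest priority first and that each FU type raises at most one request per cycle; neither assumption is confirmed by the slides, and cdb_arbitrate is a hypothetical name.

```c
/* FU kinds in the order listed on the slide (assumed highest priority first). */
enum { FU_BRANCH, FU_DIV, FU_LDST, FU_MULT, FU_FPADD, FU_INTALU, FU_COUNT };

/* request[k] is nonzero if FU kind k wants a bus this cycle.
   grant[k] is set to 1 for each kind that gets one, else 0.
   At most num_cdb requests are granted; returns how many were. */
int cdb_arbitrate(const int request[FU_COUNT], int num_cdb, int grant[FU_COUNT])
{
    int k, granted = 0;
    for (k = 0; k < FU_COUNT; k++) {
        grant[k] = 0;
        if (request[k] && granted < num_cdb) {
            grant[k] = 1;
            granted++;
        }
    }
    return granted;
}
```

A requester that is not granted a bus would have to hold its result and request again in the next cycle.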

7
List of Configurable Parameters
  • ISSUE SIZE
      • The maximum number of instructions examined for
        parallel issue
  • COMMIT SIZE
      • The maximum number of instructions examined in
        the ROB for commit
  • ROB SIZE
      • The number of entries in the reorder buffer
  • NUM CDB
      • Number of Common Data Buses
  • LSQ SIZE
      • Number of entries in the load/store buffer
  • STORE CACHE SIZE
      • Number of entries in the store EA lookup table

8
List of Configurable Parameters (cont.)
  • NUMRSBU
  • NUMRSINTALU
  • NUMRSMULT
  • MULTSTAGES
  • NUMRSDIV

9
List of Configurable Parameters (cont.)
  • DIVCYCLES
  • NUMRSFPADD
  • FPADDSTAGES
  • MISSPROB
  • MPPROB
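
The configuration file syntax is not shown in the slides. The sketch below assumes one "KEY VALUE" pair per line and assumes underscore spellings such as ROB_SIZE for the parameter names; SimConfig and apply_config_line are hypothetical, and only four of the parameters are covered.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    int issue_size;
    int commit_size;
    int rob_size;
    int num_cdb;
} SimConfig;

/* Apply one "KEY VALUE" line to cfg (assumed syntax).
   Returns 1 if the key was recognised and a value was read, else 0. */
int apply_config_line(const char *line, SimConfig *cfg)
{
    char key[32];
    int value;
    if (sscanf(line, "%31s %d", key, &value) != 2)
        return 0;
    if (strcmp(key, "ISSUE_SIZE") == 0)
        cfg->issue_size = value;
    else if (strcmp(key, "COMMIT_SIZE") == 0)
        cfg->commit_size = value;
    else if (strcmp(key, "ROB_SIZE") == 0)
        cfg->rob_size = value;
    else if (strcmp(key, "NUM_CDB") == 0)
        cfg->num_cdb = value;
    else
        return 0;
    return 1;
}
```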

10
Simulator Structure
  • main()
      • readtracefile()
      • readconfigfile()
      • while(NOT EXIT)
          • commit()
          • ROB_update()
          • RS_update()
          • CDB_Arbiter()
          • writeback()
          • execute()
          • issue()
          • fetch()
      • printStatistics()
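
The per-cycle loop can be sketched with stub stages that only record the order in which they run. Calling the stages from commit back to fetch is a common way to let each stage see what the stage before it produced in the previous cycle; whether that is the reason for the order here is not stated. The stubs, the exit condition and run_simulation are assumptions for illustration.

```c
#include <string.h>

static char order[16];   /* stage order observed during the first cycle */
static int  remaining;   /* instructions left in the stub trace */
static int  exit_flag;   /* set once the stub trace is exhausted */

/* Record a stage's letter, for the first cycle only. */
static void note(char c)
{
    size_t n = strlen(order);
    if (n < 8)
        order[n] = c;
}

/* Stub stages: the real ones would update the ROB, RS, CDBs, etc. */
static void commit(void)      { note('C'); }
static void ROB_update(void)  { note('R'); }
static void RS_update(void)   { note('S'); }
static void CDB_Arbiter(void) { note('A'); }
static void writeback(void)   { note('W'); }
static void execute(void)     { note('E'); }
static void issue(void)       { note('I'); }
static void fetch(void)       { note('F'); if (--remaining <= 0) exit_flag = 1; }

/* Run the loop until the stub trace is consumed; returns cycles simulated. */
int run_simulation(int trace_len)
{
    int cycles = 0;
    memset(order, 0, sizeof order);
    remaining = trace_len;
    exit_flag = 0;
    while (!exit_flag) {
        commit();
        ROB_update();
        RS_update();
        CDB_Arbiter();
        writeback();
        execute();
        issue();
        fetch();
        cycles++;
    }
    return cycles;
}
```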

11
Block Diagram
Components shown: Issue Unit, Trace, INT ALU RS, BR UNIT RS, LSQ, DIV UNIT RS, MULT UNIT RS, ROB, Arbiter, CDB, RF
12
Metrics Measured
  • Cycles to complete
  • Issue stall cycles
      • Cycles when no instructions can be issued to an RS
  • FU utilization (for each FU)
      • No. of instructions of that FU type / total cycles
  • CDB utilization (for each CDB)
      • No. of broadcasts / total cycles
  • Cycles per instruction (CPI)
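
The ratios above are simple quotients of event counters. A minimal sketch, assuming one set of counters for a single FU type and a single CDB (the Counters struct and function names are hypothetical):

```c
typedef struct {
    long cycles;          /* cycles to complete */
    long committed;       /* instructions committed */
    long fu_instrs;       /* instructions executed on one FU type */
    long cdb_broadcasts;  /* broadcasts on one CDB */
} Counters;

/* Cycles per instruction; 0.0 if nothing committed. */
double cpi(const Counters *c)
{
    return c->committed ? (double)c->cycles / (double)c->committed : 0.0;
}

/* FU utilization = instructions of that FU type / total cycles. */
double fu_utilization(const Counters *c)
{
    return c->cycles ? (double)c->fu_instrs / (double)c->cycles : 0.0;
}

/* CDB utilization = broadcasts on that bus / total cycles. */
double cdb_utilization(const Counters *c)
{
    return c->cycles ? (double)c->cdb_broadcasts / (double)c->cycles : 0.0;
}
```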

13
Metrics Measured (cont.)
  • Frequency of each issue count over all
    execution cycles
  • Frequency of each commit count over all
    execution cycles
  • RS occupancy frequency over all cycles
  • ROB occupancy frequency over all cycles
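
Each of these frequency metrics is a histogram updated once per cycle. The sketch below shows the idea for issue counts, with a hypothetical MAX_ISSUE bound and hypothetical function names; the same structure would serve for commit counts and occupancies.

```c
#define MAX_ISSUE 8   /* hypothetical upper bound on instructions issued per cycle */

typedef struct {
    long freq[MAX_ISSUE + 1];   /* freq[n] = cycles in which n instructions issued */
} Histogram;

/* Call once per cycle with that cycle's issue count; out-of-range values are ignored. */
void hist_record(Histogram *h, int n)
{
    if (n >= 0 && n <= MAX_ISSUE)
        h->freq[n]++;
}

/* Mean issue count per recorded cycle; 0.0 if nothing was recorded. */
double hist_mean(const Histogram *h)
{
    long cycles = 0, total = 0;
    int n;
    for (n = 0; n <= MAX_ISSUE; n++) {
        cycles += h->freq[n];
        total  += (long)n * h->freq[n];
    }
    return cycles ? (double)total / (double)cycles : 0.0;
}
```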

14
Simulator Design
  • Coded in C
  • Compiled using MS VC 6.0

15
Execution Demonstration
Register state initializations: REGS[1].valid=1, REGS[2].valid=1, REGS[3].valid=1, REGS[8].valid=1, REGS[9].valid=1, REGS[11].valid=1, REGS[12].valid=1, REGS[15].valid=1, REGS[16].valid=1, REGS[17].valid=1
  • Sample Program
      • ADD R0,R1,R2
      • ADD R4,R0,R3
      • ADD R7,R4,R0
      • ADD R10,R11,R12
      • ADD R13,R10,R15
      • ADD R13,R16,R17
      • ADD R15,R11,R12
      • ADD R17,R15,R12
      • EXIT

Dependences annotated on the slide: RAW, RAW, RAW, WAR, WAW, RAW
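
The register dependences in the sample program can be counted mechanically. The sketch below assumes the operand order ADD Rdest,Rsrc1,Rsrc2 and counts a RAW for each source whose register has an earlier writer, a WAR for each earlier read of the destination since its last write, and a WAW when the destination has an earlier writer. Under that definition it finds 5 RAW, 2 WAR and 1 WAW, which need not match the number of annotations drawn on the slide. All names are hypothetical.

```c
#define NREGS 32

typedef struct { int dest, src1, src2; } Instr;
typedef struct { int raw, war, waw; } HazardCount;

/* The eight ADDs of the sample program (EXIT omitted). */
static const Instr sample_program[8] = {
    { 0,  1,  2}, { 4,  0,  3}, { 7,  4,  0}, {10, 11, 12},
    {13, 10, 15}, {13, 16, 17}, {15, 11, 12}, {17, 15, 12}
};

HazardCount count_hazards(const Instr *prog, int n)
{
    int last_writer[NREGS];   /* index of the latest writer, -1 if none */
    int readers[NREGS];       /* reads of the register since its latest write */
    HazardCount h = {0, 0, 0};
    int i, r;

    for (r = 0; r < NREGS; r++) {
        last_writer[r] = -1;
        readers[r] = 0;
    }
    for (i = 0; i < n; i++) {
        const Instr *p = &prog[i];
        /* RAW: a source register was produced by an earlier instruction. */
        if (last_writer[p->src1] >= 0) h.raw++;
        if (p->src2 != p->src1 && last_writer[p->src2] >= 0) h.raw++;
        /* WAR: earlier instructions read dest since it was last written. */
        h.war += readers[p->dest];
        /* WAW: dest already has an earlier writer. */
        if (last_writer[p->dest] >= 0) h.waw++;
        /* Record this instruction's reads, then its write. */
        readers[p->src1]++;
        if (p->src2 != p->src1) readers[p->src2]++;
        readers[p->dest] = 0;
        last_writer[p->dest] = i;
    }
    return h;
}
```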
16
Results: Cycles
17
Present Implementation
  • Completely Configurable Simulator
  • INT ALU in working state

18
Immediate Extension
  • Branch Unit Completion
  • Pipelined Multiplier Completion
  • LD/STORE Unit Completion