CS 300 - PowerPoint PPT Presentation

Provided by: johnpe2
Learn more at: http://wiki.western.edu

1
CS 300 Lecture 24
  • Intro to Computer Architecture
  • / Assembly Language
  • The LAST Lecture!

2
Final Exam
  • Tuesday, 8am.
  • The exam will be comprehensive
  • I'll post a worksheet (practice exam) Thursday
    evening that covers topics since the last exam.

3
Homework
  • The last homework is due Friday.
  • "darcs pull" is your friend: minor bugs have
    been found (and fixed) in the assembler.
  • Extra credit: if you want it, come see me after
    class. All EC stuff is due Friday of finals
    week.

4
Speeding Up An Instruction
  • Typically, an instruction is executed in stages
  • Fetch (brings the instruction in from cache /
    memory)
  • Decode (figure out what the instruction will
    do)
  • Execute (the actual operation, like add or
    multiply)
  • Store (place results in registers / memory)
  • This varies a lot from processor to processor, but
    the idea is always the same: break execution
    into smaller chunks that overlap.
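The payoff of the staged overlap above can be sketched with some simple cycle counting (a minimal sketch, not any specific CPU; it assumes each of the four named stages takes one clock cycle):

```python
# Comparing sequential vs. pipelined execution time for a stream of
# instructions, assuming one clock cycle per stage and no hazards.

STAGES = ["Fetch", "Decode", "Execute", "Store"]

def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    # Without pipelining, each instruction occupies all stages in turn.
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    # With an ideal pipeline, a new instruction enters every cycle
    # once the pipeline is full.
    return n_stages + (n_instructions - 1)

print(sequential_cycles(10))  # 40 cycles
print(pipelined_cycles(10))   # 13 cycles
```

In steady state the pipeline approaches one instruction per cycle, regardless of how many stages each instruction passes through.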

5
Other Speedup Strategies
  • Vector instructions: explicit parallelism in
    the instruction set to feed sequences of data to
    a functional unit (Cray-1 and successors)
  • Multiple instructions at once: pack instruction
    words with lots of independent operations that
    execute at the same time (VLIW)
  • Replicated CPUs, one instruction stream (SIMD)
  • Multicore machines (MIMD): various coupling
    strategies

6
Pipeline Hazards
  • Structural: lack of computational resources to
    perform operations in parallel
  • Data: dependencies among instructions
    (write-read)
  • Control: conditional branching prevents you from
    knowing which instruction comes next

7
Memory Hazards
  • It's "obvious" which registers an instruction
    uses
  • It's much harder to figure out which memory
    locations an instruction touches and how memory
    accesses interact.
  • What can we do?
  • Reorder reads without worry
  • Reads and writes can't be switched unless we
    know that they don't interfere
  • Separate "memory banks" (instruction / data)
    reduce hazards
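The reordering rules above can be sketched as a tiny checker (illustrative only; the `(kind, address)` tuple format is invented, and real hardware compares address ranges, not single addresses):

```python
# Two memory operations may be swapped only if neither is a write to
# an address the other touches.

def can_reorder(op1, op2):
    kind1, addr1 = op1
    kind2, addr2 = op2
    if kind1 == "read" and kind2 == "read":
        return True          # reads never interfere with each other
    return addr1 != addr2    # a write is involved: addresses must differ

print(can_reorder(("read", 100), ("read", 100)))    # True
print(can_reorder(("read", 100), ("write", 100)))   # False
print(can_reorder(("write", 100), ("write", 200)))  # True
```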

8
Compiler / Programmer Help
  • The compiler and programmer have access to
    information that the CPU doesn't. We can often
    determine whether or not aliasing is possible,
    and aliasing is what makes it hard to reorganize
    memory accesses.
  • A CPU can't prove that two pointers must point
    to different things, for example.
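Why aliasing blocks reordering can be shown in a few lines (a sketch using Python lists as stand-ins for pointers):

```python
# If p and q refer to the same storage, moving the read above the
# write changes the result; a compiler (or programmer) who knows the
# pointers differ is free to reorder the two accesses.

def write_then_read(p, q):
    p[0] = 42      # write through p
    return q[0]    # read through q

shared = [0]
print(write_then_read(shared, shared))  # 42: p and q alias
a, b = [0], [0]
print(write_then_read(a, b))            # 0: no aliasing, read is independent
```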

9
Delayed Branching
  • One interesting feature of the MIPS was the
    delayed branch.
  • The instruction after a branch is always executed;
    the idea is that you want at least one
    instruction in the pipeline before you have to
    wait to see where the branch goes.
  • You usually can't see this: the assembler
    fills branch delay slots for you behind your back.
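The delay-slot behavior can be mimicked with a toy interpreter (the program format and the always-taken branch are invented for illustration; real MIPS evaluates a condition):

```python
# The instruction in the "delay slot" right after a taken branch
# still executes before control transfers to the branch target.

def run(program):
    trace, pc = [], 0
    while pc < len(program):
        op = program[pc]
        if op[0] == "beq_taken":           # pretend the branch is taken
            trace.append(program[pc + 1])  # delay slot executes anyway
            pc = op[1]                     # then control transfers
        else:
            trace.append(op)
            pc += 1
    return trace

prog = [("add",), ("beq_taken", 4), ("slot",), ("skipped",), ("done",)]
print(run(prog))  # [('add',), ('slot',), ('done',)]
```

Note that `("skipped",)` never runs: the branch jumps over it, but only after the slot instruction has gone down the pipeline.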

10
User-Level Pipelining
  • One of the big ideas in software is to pipeline
    at the application level
  • Anticipate future I/O
  • Use non-blocking reads (threading)
  • Understand whether writers may alter
    information that is prefetched
  • Note that web browsers pipeline web page display!
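The "anticipate future I/O" idea above can be sketched with a worker thread that prefetches the next item while the current one is processed (`fetch` is a hypothetical stand-in for a slow, blocking read):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(i):
    return f"page-{i}"          # imagine a blocking network read here

def process_all(n):
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch, 0)       # anticipate the first read
        for i in range(n):
            item = future.result()           # wait only if not ready yet
            if i + 1 < n:
                future = pool.submit(fetch, i + 1)  # prefetch the next
            results.append(item.upper())     # "process" the current item
    return results

print(process_all(3))  # ['PAGE-0', 'PAGE-1', 'PAGE-2']
```

Just as in a hardware pipeline, the fetch of item i+1 overlaps the processing of item i, and a writer that could alter prefetched data would be a hazard.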

11
Pipeline Metaphors
  • The idea of keeping lots of workers busy at once
    is pervasive. Managers have to deal with similar
    issues when scheduling work
  • Not all workers can perform the same task
  • Workflow requires that things be passed from
    one worker to another
  • Speculative work is often done (like a
    restaurant preparing food that may or may not be
    ordered)

12
Examples
  • The following slides are stolen from
  • Mary Jane Irwin (www.cse.psu.edu/mji )
  • www.cse.psu.edu/cg431

13
Single Cycle vs. Multiple Cycle Timing
14
How Can We Make It Even Faster?
  • Split the multiple instruction cycle into smaller
    and smaller steps
  • There is a point of diminishing returns where as
    much time is spent loading the state registers as
    doing the work
  • Start fetching and executing the next instruction
    before the current one has completed
  • Pipelining: (all?) modern processors are
    pipelined for performance
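The diminishing-returns point above can be made concrete with back-of-the-envelope arithmetic (the delay numbers are invented for illustration):

```python
# Each extra pipeline stage shortens the work per stage but adds a
# fixed state-register (latch) overhead to every cycle.

WORK_NS = 8.0    # total logic delay of one instruction, hypothetical
LATCH_NS = 0.5   # overhead to load the pipeline state registers

def cycle_time(stages):
    return WORK_NS / stages + LATCH_NS

def speedup(stages):
    # Ideal throughput speedup over the unpipelined machine.
    return (WORK_NS + LATCH_NS) / cycle_time(stages)

for s in (1, 4, 16, 64):
    print(s, round(cycle_time(s), 3), round(speedup(s), 2))
```

With 64 stages the cycle time is dominated by the latch overhead, so the speedup flattens far below 64: past some point, splitting stages further buys almost nothing.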

15
A Pipelined MIPS Processor
  • Start the next instruction before the current one
    has completed
  • improves throughput - total amount of work done
    in a given time
  • instruction latency (execution time, delay time,
    response time - time from the start of an
    instruction to its completion) is not reduced

[Figure: pipeline timing diagram over clock cycles 1-8, showing lw, sw, and an R-type instruction overlapping, each reaching its Dec stage one cycle after the previous instruction]
  • clock cycle (pipeline stage time) is limited by
    the slowest stage
  • for some instructions, some stages are wasted
    cycles
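The throughput-versus-latency distinction above works out numerically as follows (a sketch with made-up numbers: a 5-stage pipeline and a 2 ns cycle):

```python
# A pipeline does not shorten any one instruction, but in steady
# state it completes nearly one instruction per cycle.

STAGES, CYCLE_NS, N = 5, 2, 1000

latency_ns = STAGES * CYCLE_NS              # per-instruction, unchanged
total_ns = (STAGES + (N - 1)) * CYCLE_NS    # pipelined total time
throughput = N / total_ns                   # instructions per ns

print(latency_ns)             # 10 ns per instruction, same as unpipelined
print(total_ns)               # 2008 ns for 1000 instructions
print(round(throughput, 3))   # ~0.498 instructions per ns
```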

16
Single Cycle, Multiple Cycle, vs. Pipeline
Multiple Cycle Implementation
17
Pipelining the MIPS ISA
  • What makes it easy
  • all instructions are the same length (32 bits)
  • can fetch in the 1st stage and decode in the 2nd
    stage
  • few instruction formats (three) with symmetry
    across formats
  • can begin reading register file in 2nd stage
  • memory operations can occur only in loads and
    stores
  • can use the execute stage to calculate memory
    addresses
  • each MIPS instruction writes at most one result
    (i.e., changes the machine state) and does so
    near the end of the pipeline (MEM and WB)
  • What makes it hard
  • structural hazards: what if we had only one
    memory?
  • control hazards: what about branches?
  • data hazards: what if an instruction's input
    operands depend on the output of a previous
    instruction?
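The "all instructions are the same length, few formats" point is what makes decoding simple bit slicing. As a sketch, here are the R-type fields of `add $8, $9, $10` (opcode 0, funct 0x20, the standard MIPS encoding):

```python
# Every MIPS instruction is exactly 32 bits, so fields live at fixed
# bit positions and can be extracted with shifts and masks.

def decode_rtype(word):
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs":     (word >> 21) & 0x1F,
        "rt":     (word >> 16) & 0x1F,
        "rd":     (word >> 11) & 0x1F,
        "shamt":  (word >> 6)  & 0x1F,
        "funct":  word & 0x3F,
    }

# add $8, $9, $10  ->  000000 01001 01010 01000 00000 100000
word = (0 << 26) | (9 << 21) | (10 << 16) | (8 << 11) | (0 << 6) | 0x20
print(decode_rtype(word))
```

Because the register fields sit in the same place across formats, the hardware can start reading the register file in the second stage before it even knows which instruction it has.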

18
Can Pipelining Get Us Into Trouble?
  • Yes: pipeline hazards
  • structural hazards: attempt to use the same
    resource by two different instructions at the
    same time
  • data hazards: attempt to use data before it is
    ready
  • An instruction's source operand(s) are produced
    by a prior instruction still in the pipeline
  • control hazards: attempt to make a decision about
    program control flow before the condition has
    been evaluated and the new PC target address
    calculated
  • branch instructions
  • Can always resolve hazards by waiting
  • pipeline control must detect the hazard
  • and take action to resolve it
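The "resolve hazards by waiting" strategy can be sketched as stall insertion (a hedged toy: instructions are hypothetical `(dest, sources)` tuples, no forwarding is assumed, and the two-cycle gap is illustrative):

```python
# Scan straight-line code and insert bubbles (None) whenever an
# instruction reads a register written too recently to be ready.

def insert_stalls(program, gap=2):
    scheduled = []
    for dest, srcs in program:
        for back in range(1, gap + 1):
            if back <= len(scheduled):
                prev = scheduled[-back]
                if prev is not None and prev[0] in srcs:
                    # Pad until the writer is `gap` slots behind.
                    scheduled.extend([None] * (gap - back + 1))
                    break
        scheduled.append((dest, srcs))
    return scheduled

prog = [("$1", ["$2", "$3"]),      # add $1, $2, $3
        ("$4", ["$1", "$5"])]      # sub $4, $1, $5  (reads $1 too soon)
print(insert_stalls(prog))
```

Detecting the hazard and stalling always works; forwarding and scheduling exist to avoid paying these wasted cycles.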

19
A Single Memory Would Be a Structural Hazard
[Figure: pipeline diagram (time in clock cycles across, instruction order down) showing lw followed by Inst 1-4 contending for a single memory]
  • Fix with separate instr and data memories (I and
    D)

20
Register Usage Can Cause Data Hazards
  • Dependencies backward in time cause hazards

In instruction order:
add $1, ...
sub $4, $1, $5
and $6, $1, $7
or  $8, $1, $9
xor $4, $1, $5
  • Read-before-write data hazard
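The dependence pattern on this slide can be checked mechanically (a sketch: instructions as hypothetical `(dest, sources)` tuples, with the add's own sources elided as on the slide, and a write-back assumed to land three instructions later):

```python
# Every instruction that reads $1 before the add's write-back has a
# read-before-write (RAW) hazard on $1.

insts = [("$1", []),               # add $1, ...
         ("$4", ["$1", "$5"]),     # sub $4, $1, $5
         ("$6", ["$1", "$7"]),     # and $6, $1, $7
         ("$8", ["$1", "$9"]),     # or  $8, $1, $9
         ("$4", ["$1", "$5"])]     # xor $4, $1, $5

def raw_hazards(insts, reg, writeback_distance=3):
    # Readers within `writeback_distance` instructions of the write
    # see the old value of reg unless the pipeline stalls or forwards.
    return [i for i in range(1, writeback_distance + 1)
            if i < len(insts) and reg in insts[i][1]]

print(raw_hazards(insts, "$1"))  # [1, 2, 3]: sub, and, or are hazards
```

The xor also reads $1, but by then the add has written back, so it gets the new value without stalling.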

21
To Know for the Final
  • How caching, virtual memory, and pipelines work
    in broad terms
  • What problems are being solved
  • Basic approaches to these problems
  • What is done in software and hardware