Thread-level Speculation

Provided by: rong9

Learn more at: https://cs.login.cmu.edu

Transcript and Presenter's Notes

Title: Thread-level Speculation

1
Thread-level Speculation

Presenters: Minglong Shao, Rong Yan

2
Outline

Motivation of Thread-level Speculation
Design Issues
Case study
Summary

3
Outline

Motivation of Thread-level Speculation
Design Issues
Case study
Summary

4
Thread Level Parallelism

Break up computation into threads
Assign the threads to processors

Program: Thread 1, Thread 2, Thread 3
5
How to deal with dependencies

No data dependency
Original: i=1; j=2; a=3; b=c; ...
Thread i: i=1; j=2
Thread i+1: a=3; b=c

Vs.

With data dependency
Original: i=1; j=2; a=3; b=j; ...
Thread i: i=1; j=2
Thread i+1: a=3; b=j (reads j written by thread i, so it must run later in time)
6
False Data Dependency
Sometimes the compiler cannot fully exploit potential parallelism

Example pseudo-code
while (continue_condition) {
    x = hash[index1];
    hash[index2] = y;
}

Time
Thread i: hash[3], hash[10], ...
Thread i+1: hash[19], hash[21], ...
Do you have any ideas for obtaining run-time parallelism in this case?
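
The ambiguity in the loop above can be made concrete with a small Python sketch. It is not from the slides: the function name `conflicting_iterations` and the convention that each iteration reads one slot and writes one slot are assumptions for illustration. It shows that whether two iterations depend on each other is only known once the index values exist, that is, at run time.

```python
def conflicting_iterations(read_idx, write_idx):
    """Return pairs (i, j), i < j, where iteration j reads the slot iteration i wrote.

    read_idx[k] and write_idx[k] are the hash slots read and written by
    iteration k (hypothetical run-time values, unknown to the compiler).
    """
    conflicts = []
    for j, r in enumerate(read_idx):
        for i in range(j):
            if write_idx[i] == r:
                conflicts.append((i, j))
    return conflicts


# Index values taken from the slide example, read as (read slot, write slot):
# iteration 0: hash[3], hash[10]; iteration 1: hash[19], hash[21];
# iteration 2: hash[10], hash[21]
reads = [3, 19, 10]
writes = [10, 21, 21]
print(conflicting_iterations(reads, writes))  # [(0, 2)]
```

Only iterations 0 and 2 truly depend on each other; a compiler that cannot see the index values must conservatively serialize all of them.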
7
Thread-Level Speculation

What is thread-level speculation?
An approach that enables the compiler to create parallel threads despite the existence of ambiguous data dependences.

Compile time: parallelize without detecting dependences
Run time: detect dependences dynamically
8
Thread-Level Speculation (Cont.)
Dynamically detect dependences at run time
Time
Processor 0, Processor 1, Processor 2
Thread 1: hash[3], hash[10], ...
Thread 2: hash[19], hash[21], ...
Thread 3: hash[10], hash[21], ... Violation!
9
Thread-Level Speculation (Cont.)
Dynamically detect dependences at run time
Time
Processor 0, Processor 1, Processor 2
Thread 1: hash[3], hash[10], ...
Thread 2: hash[19], hash[21], ...
Thread 3: hash[10], hash[21], ... Violation!
Thread 4: hash[30], hash[40], ...
Thread 5: hash[50], hash[60], ...
Redo Thread 3: hash[10], hash[21], ...
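
The detect-and-redo behavior on these slides can be mimicked in software. The sketch below is a simplified model, not the hardware mechanism: the names `run_sequential` and `run_speculative`, the loop body `hash[w] = hash[r] + 1`, and the initial table values are all assumptions. Every thread first runs against the same snapshot with its write buffered; threads then commit in program order, and a thread that read a slot written by an earlier thread is re-executed.

```python
def run_sequential(hash_table, tasks):
    """Reference result: execute tasks one after another in program order."""
    table = dict(hash_table)
    for r, w in tasks:
        table[w] = table.get(r, 0) + 1
    return table


def run_speculative(hash_table, tasks):
    """Run all tasks against one snapshot, then commit in order with redo."""
    table = dict(hash_table)
    # Speculative phase: every thread reads the pre-loop snapshot and
    # buffers its result instead of writing memory.
    buffered = [(r, w, table.get(r, 0) + 1) for r, w in tasks]
    redone = []
    written = set()  # slots written by logically earlier threads
    # Commit phase: threads commit in original program order.
    for tid, (r, w, value) in enumerate(buffered):
        if r in written:
            # Violation: the thread consumed a stale value. Discard the
            # buffered result and re-execute against committed state.
            value = table.get(r, 0) + 1
            redone.append(tid)
        table[w] = value
        written.add(w)
    return table, redone


initial = {3: 1, 19: 2, 10: 0, 21: 0}
tasks = [(3, 10), (19, 21), (10, 21)]  # threads 1, 2, 3 from the slide
table, redone = run_speculative(initial, tasks)
print(redone)                                   # [2], i.e. Thread 3 is redone
print(table == run_sequential(initial, tasks))  # True
```

Threads 1 and 2 commit their speculative results unchanged; only Thread 3 pays the cost of re-execution, and the final state matches sequential execution.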
10
Outline

Motivation of Thread-level Speculation
Design Issues
Case study
Summary

11
Design Issue

Hardware/software must provide methods for
Detecting true memory dependences
Backing up and re-executing instructions
Buffering any data written during the speculative region, for later commit or discard
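
The three requirements above can be sketched together as a small software analogue. This is an illustration under stated assumptions, not any of the hardware designs in the case studies: the class `SpeculativeBuffer` and its method names are hypothetical. Stores are buffered, loads are tracked so that a later dependence check is possible, and the buffer is either committed to memory or discarded before re-execution.

```python
class SpeculativeBuffer:
    """Per-thread buffer for speculative state (hypothetical software model)."""

    def __init__(self, memory):
        self.memory = memory  # shared, committed state (a dict)
        self.writes = {}      # buffered speculative stores
        self.reads = set()    # addresses loaded from committed memory

    def load(self, addr):
        if addr in self.writes:  # forward from this thread's own store
            return self.writes[addr]
        self.reads.add(addr)     # remember the exposed read
        return self.memory[addr]

    def store(self, addr, value):
        self.writes[addr] = value  # memory is untouched until commit

    def violated_by(self, addr):
        """True if a store to addr by an earlier thread breaks this thread."""
        return addr in self.reads

    def commit(self):
        self.memory.update(self.writes)
        self._reset()

    def discard(self):
        self._reset()  # back up: drop all speculative state before redo

    def _reset(self):
        self.writes.clear()
        self.reads.clear()


memory = {"x": 1, "y": 2}
buf = SpeculativeBuffer(memory)
buf.store("x", 5)
print(memory["x"], buf.load("x"))  # 1 5
buf.discard()
buf.store("x", 7)
buf.commit()
print(memory["x"])                 # 7
```

Keeping the read set separate from the write buffer is what makes dependence detection possible: only reads that came from committed memory can be invalidated by an earlier thread's store.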

12
Outline

Motivation of Thread-level Speculation
Design Issues
Case study
Summary

13
Case study

Multiscalar architecture: all dynamic control is performed by hardware at run time
TLDS architecture: all thread control is handled by software routines
Hydra architecture (CMP): speculative write buffers with a write-through coherence scheme
Scalable speculation approach: applies to all kinds of architectures, built on writeback invalidation-based cache coherence

14
Hydra Architecture

Chip multiprocessor (CMP) with 4 MIPS processors
Speculation coprocessor: executes software exception handlers
L1 data cache with a write-through, invalidation-based policy
L2 cache with speculative write buffers

For more info, please refer to the paper "Data Speculation Support for a Chip Multiprocessor"
15
Data Cache Modification
For more info, please refer to the paper "Data Speculation Support for a Chip Multiprocessor"
16
Downside of Hydra

Only reasonable within a single chip
Not scalable to larger systems
-- Write-through scheme
-- Write buffers are snooped on every store
17
Scalable Thread-level Speculation

Built on writeback, invalidation-based cache coherence
Scales to architectures of arbitrary size

For more info, refer to the paper "A Scalable Approach to Thread-Level Speculation"
18
Example
Time
Processor 1, Epoch 5: STORE *q = 2
Processor 2, Epoch 6: become_speculative(), LOAD a = *p, attempt_commit()
p = q = &x
L1 Cache (Epoch 5): Violation? FALSE
L1 Cache (Epoch 6): Violation? FALSE

19
Example
Time
Processor 1, Epoch 5: STORE *q = 2
Processor 2, Epoch 6: become_speculative(), LOAD a = *p, attempt_commit()
p = q = &x
L1 Cache (Epoch 5): Violation? FALSE
L1 Cache (Epoch 6): Violation? FALSE

20
Example
Time
Processor 1, Epoch 5: STORE *q = 2
Processor 2, Epoch 6: become_speculative(), LOAD a = *p, attempt_commit()
p = q = &x
L1 Cache (Epoch 5)
L1 Cache (Epoch 6)
Violation? FALSE
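
The epoch example can be walked through in a few lines of Python. This is a rough sketch of the idea only: the class `EpochCache`, its fields, and the rule "an invalidation from a logically earlier epoch for a speculatively loaded address sets the violation flag" are a simplified reading of the scheme, and epoch 5 is assumed to store non-speculatively. The method names `become_speculative` and `attempt_commit` come from the slide; everything else is hypothetical.

```python
class EpochCache:
    """One processor's cache state for a single epoch (simplified model)."""

    def __init__(self, epoch, memory):
        self.epoch = epoch
        self.memory = memory       # shared memory, modeled as a dict
        self.speculative = False
        self.spec_loaded = set()   # addresses loaded while speculative
        self.violation = False

    def become_speculative(self):
        self.speculative = True

    def load(self, addr):
        if self.speculative:
            self.spec_loaded.add(addr)
        return self.memory[addr]

    def store(self, addr, value, others):
        # Assumption: the storing epoch is non-speculative, so it writes
        # memory directly and sends invalidations tagged with its epoch.
        self.memory[addr] = value
        for cache in others:
            cache.receive_invalidation(addr, self.epoch)

    def receive_invalidation(self, addr, sender_epoch):
        # A logically earlier epoch wrote something we already read.
        if sender_epoch < self.epoch and addr in self.spec_loaded:
            self.violation = True

    def attempt_commit(self):
        return not self.violation


memory = {"x": 1}
p = q = "x"                    # p = q = &x on the slide
epoch5 = EpochCache(5, memory)
epoch6 = EpochCache(6, memory)

epoch6.become_speculative()
a = epoch6.load(p)             # speculative load happens first in time
epoch5.store(q, 2, [epoch6])   # earlier epoch stores to the same address
print(a, epoch6.violation, epoch6.attempt_commit())  # 1 True False
```

Epoch 6 loaded the stale value 1, so its commit attempt fails and it must be re-executed; had `p` and `q` pointed to different addresses, both violation flags would have stayed FALSE.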

21
When is TLS not desirable?

Not desirable when dependences are frequent, e.g. on a scalar variable
Solutions
-- Accommodate the dependence through synchronization
-- Turn off speculation support when necessary
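
A quick way to see why frequent dependences defeat speculation is to count how many threads would be violated. The sketch below uses a hypothetical helper, `count_violations`, and a simplified batch model in which a thread is violated if it reads anything a logically earlier thread in the same batch wrote; both are assumptions for illustration.

```python
def count_violations(tasks):
    """Count threads that read a location written by an earlier thread.

    tasks is a list of (reads, writes) pairs in program order, where
    reads and writes are iterables of location names.
    """
    written = set()
    violations = 0
    for reads, writes in tasks:
        if written & set(reads):
            violations += 1
        written |= set(writes)
    return violations


# A scalar updated in every iteration (e.g. sum += ...): all but the
# first thread are violated, so speculation only adds overhead.
scalar_loop = [(["sum"], ["sum"])] * 4
print(count_violations(scalar_loop))  # 3

# The hash-table loop from the earlier slides: a single violation.
hash_loop = [([3], [10]), ([19], [21]), ([10], [21])]
print(count_violations(hash_loop))    # 1
```

When nearly every thread is violated, as with the scalar, explicitly synchronizing on that variable or disabling speculation for the loop avoids the repeated re-execution.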

22
Outline

Motivation of Thread-level Speculation
Design Issues
Case study
Summary

23
Summary

TLS enables the compiler to create parallel threads despite uncertainty about actual dependences
Dependences are detected at run time
Several implementations: single-chip or scalable, write-through or writeback
Never rely on TLS alone: combine it with synchronization and superscalar mechanisms

24
( Thanks )
