Thread-level Speculation

Provided by: rong9
Learn more at: https://cs.login.cmu.edu
Transcript and Presenter's Notes

1
Thread-level Speculation
  • Presenter: Minglong Shao
  • Rong Yan

2
Outline
  • Motivation of Thread-level Speculation
  • Design Issues
  • Case study
  • Summary

3
Outline
  • Motivation of Thread-level Speculation
  • Design Issues
  • Case study
  • Summary

4
Thread Level Parallelism
  • Break up computation into threads
  • Assign the threads to processors

[Figure: a program split into Thread1, Thread2, and Thread3]
5
How to deal with dependency
  • No data dependency
      Original:   i=1; j=2; a=3; b=c; ...
      Thread i:   i=1; j=2
      Thread i+1: a=3; b=c
  • With data dependency
      Original:   i=1; j=2; a=3; b=j; ...
      Thread i:   i=1; j=2
      Thread i+1: a=3; b=j

[Figure: the two thread pairs side by side along a time axis]
6
False Data Dependency
Sometimes the compiler cannot fully exploit potential
parallelism
  • Example pseudo-code
      while (continue_condition) {
        x = hash[index1];
        hash[index2] = y;
      }

[Figure: two threads along a time axis]
Thread i:   x = hash[3];  hash[10] = y; ...
Thread i+1: x = hash[19]; hash[21] = y; ...
Do you have any ideas for obtaining run-time
parallelism in this case?
7
Thread-Level Speculation
  • What is thread-level speculation?
  • An approach that
  • Enables the compiler to create parallel threads
    despite the existence of ambiguous data
    dependences.

[Figure: compile time vs. run time]
Compile time: parallelize without detecting dependences
8
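Thread-Level Speculation (Cont.)
Dynamically detect dependences at run time
[Figure: one thread per processor along a time axis]
Processor 0, Thread 1: x = hash[3];  hash[10] = y; ...
Processor 1, Thread 2: x = hash[19]; hash[21] = y; ...
Processor 2, Thread 3: x = hash[10]; hash[21] = y; ...
Violation! Thread 3 read hash[10] before Thread 1 wrote it.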
9
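Thread-Level Speculation (Cont.)
Dynamically detect dependences at run time
[Figure: threads on Processors 0, 1, and 2 along a time axis]
Processor 0, Thread 1: x = hash[3];  hash[10] = y; ...
Processor 1, Thread 2: x = hash[19]; hash[21] = y; ...
Processor 2, Thread 3: x = hash[10]; hash[21] = y; ...
Violation! Thread 3 read hash[10] before Thread 1 wrote it.
Thread 4: x = hash[30]; hash[40] = y; ...
Thread 5: x = hash[50]; hash[60] = y; ...
Redo Thread 3: x = hash[10]; hash[21] = y; ...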
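
The detect-and-redo behaviour on this slide can be sketched as a small sequential simulation. This is only an illustration under stated assumptions, not how any of the architectures in this talk implement it: the function names, the loop body `hash[w] = hash[r] + 1`, and the round-based model (up to three threads start from the same memory snapshot and commit in program order) are all hypothetical.

```python
def run_sequential(table, iterations):
    """Reference result: execute the loop iterations one after another."""
    for r, w in iterations:
        table[w] = table.get(r, 0) + 1
    return table


def run_speculative(table, iterations, n_procs=3):
    """Toy TLS model. Each iteration is a (read_index, write_index) pair.

    Returns the iteration numbers that were squashed and re-executed.
    """
    redone = []
    start = 0
    while start < len(iterations):
        end = min(start + n_procs, len(iterations))
        # Threads in one round run concurrently, so each one reads the
        # memory state from before any thread of the round has committed.
        snapshot = dict(table)
        written = set()      # locations written by earlier threads this round
        committed = start
        for i in range(start, end):
            r, w = iterations[i]
            if r in written:
                # True dependence: thread i read a stale value.
                # Squash it (and later threads) and redo from iteration i.
                redone.append(i)
                break
            table[w] = snapshot.get(r, 0) + 1   # commit the buffered write
            written.add(w)
            committed = i + 1
        start = committed
    return redone
```

With the indices from the slide, `run_speculative({}, [(3, 10), (19, 21), (10, 21), (30, 40), (50, 60)])` squashes only the third thread (iteration number 2), and the final table matches what `run_sequential` produces for the same input.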
10
Outline
  • Motivation of Thread-level Speculation
  • Design Issues
  • Case study
  • Summary

11
Design Issue
  • Hardware/Software must provide the methods for
  • Detecting the true memory dependencies
  • Backing up and re-executing instructions
  • Buffering any data written during the
    speculative region, for later committing /
    discarding
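
The three requirements above can be sketched in software as one small class. This is a minimal sketch, assuming a shared memory modelled as a Python dict; the class `SpeculativeThread` and its method names are hypothetical and do not correspond to any hardware interface described in this talk.

```python
class SpeculativeThread:
    """Toy model of the state one speculative thread needs."""

    def __init__(self, memory):
        self.memory = memory      # shared dict: address -> value
        self.write_buffer = {}    # speculative stores, invisible to others
        self.read_set = set()     # addresses loaded from shared memory

    def load(self, addr):
        if addr in self.write_buffer:
            # Forward the thread's own speculative store.
            return self.write_buffer[addr]
        self.read_set.add(addr)   # remember the load for dependence checks
        return self.memory.get(addr, 0)

    def store(self, addr, value):
        # Buffer the write instead of updating shared memory.
        self.write_buffer[addr] = value

    def violated_by(self, addr):
        """Detection: a store by an earlier thread to an address this
        thread has already loaded is a true dependence violation."""
        return addr in self.read_set

    def commit(self):
        # Speculation succeeded: make the buffered writes visible.
        self.memory.update(self.write_buffer)
        self._reset()

    def discard(self):
        # Speculation failed: drop the buffered writes before re-executing.
        self._reset()

    def _reset(self):
        self.write_buffer = {}
        self.read_set = set()
```

A thread that computes `t.store("y", t.load("x") + 1)` leaves shared memory untouched until `t.commit()`. If an earlier thread stores to `"x"` first, `t.violated_by("x")` is true, and the thread calls `t.discard()` and re-executes.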

12
Outline
  • Motivation of Thread-level Speculation
  • Design Issues
  • Case study
  • Summary

13
Case study
  • Multiscalar architecture: all dynamic control
    is performed by hardware at run time
  • TLDS architecture: all thread control is handled
    by software routines
  • Hydra architecture (CMP): speculative write
    buffers with a write-through coherence scheme
  • Scalable speculation approach: applies to all kinds
    of architectures; built on writeback
    invalidation-based cache coherence

14
Hydra Architecture
  • Chip multiprocessor (CMP) with 4 MIPS processors
  • Speculation coprocessor: executes software
    exception handlers
  • L1 data cache with a write-through
    invalidation-based policy
  • L2 cache with speculation write buffers

FOR MORE INFO...
Please refer to the paper "Data Speculation Support
for a Chip Multiprocessor"
15
Data Cache Modification
FOR MORE INFO...
Please refer to the paper "Data Speculation Support
for a Chip Multiprocessor"
16
Downside of Hydra
  • Only reasonable within a single chip
  • Not scalable to larger systems, because of
      - the write-through scheme
      - snooping the write buffers on every store

17
Scalable Thread-level Speculation
  • Built on writeback invalidation-based cache
    coherence
  • Scales to architectures of arbitrary size

FOR MORE INFO...
Refer to the paper "A Scalable Approach to
Thread-Level Speculation"
18
Example
[Figure: two processors and their L1 caches along a time axis]
Processor 2, Epoch 6: become_speculative() ... LOAD a = *p ... attempt_commit()
Processor 1, Epoch 5: ... STORE *q = 2 ...
where p = q = &x
L1 cache, Epoch 5: Violation? FALSE
L1 cache, Epoch 6: Violation? FALSE


19
Example
[Figure: two processors and their L1 caches along a time axis]
Processor 2, Epoch 6: become_speculative() ... LOAD a = *p ... attempt_commit()
Processor 1, Epoch 5: ... STORE *q = 2 ...
where p = q = &x
L1 cache, Epoch 5: Violation? FALSE
L1 cache, Epoch 6: Violation? FALSE


20
Example
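[Figure: two processors and their L1 caches along a time axis]
Processor 2, Epoch 6: become_speculative() ... LOAD a = *p ... attempt_commit()
Processor 1, Epoch 5: ... STORE *q = 2 ...
where p = q = &x
L1 caches tagged Epoch 5 and Epoch 6; one flag shown: Violation? FALSE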
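
The example on these slides can be mimicked with a toy model of two L1 caches. This is a sketch under assumptions rather than the actual protocol: it assumes that a store sends an invalidation carrying the writer's epoch number to the other caches, and that a cache flags a violation when the sender's epoch is logically earlier and the line was loaded speculatively. The class `SpeculativeCache` and the helper `store` are hypothetical names; `become_speculative` and `attempt_commit` are borrowed from the slide.

```python
class SpeculativeCache:
    """Toy per-processor L1 cache state for one epoch."""

    def __init__(self, epoch):
        self.epoch = epoch
        self.speculative = False
        self.spec_loaded = set()   # addresses loaded while speculative
        self.violation = False

    def become_speculative(self):
        self.speculative = True

    def load(self, memory, addr):
        if self.speculative:
            self.spec_loaded.add(addr)   # track speculative loads
        return memory[addr]

    def receive_invalidation(self, addr, sender_epoch):
        # A store from a logically earlier epoch to a line this epoch
        # loaded speculatively means a stale value was consumed.
        if sender_epoch < self.epoch and addr in self.spec_loaded:
            self.violation = True

    def attempt_commit(self):
        # Commit succeeds only if no violation was recorded.
        return not self.violation


def store(memory, caches, writer, addr, value):
    """Write to memory and invalidate the line in all other caches."""
    memory[addr] = value
    for cache in caches:
        if cache is not writer:
            cache.receive_invalidation(addr, writer.epoch)
```

With `p` and `q` both naming address `"x"`: if Epoch 6 calls `become_speculative()` and loads `"x"` before Epoch 5 stores 2 to it, Epoch 6's `violation` flag becomes true and `attempt_commit()` returns `False`. If the store happens first, the load sees 2 and the commit succeeds.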


21
When is TLS not desirable?
  • Not desirable when dependences are frequent,
    e.g. on a scalar variable
  • Solutions: accommodate the dependence through
    synchronization, or turn off speculation
    support when necessary
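
Accommodating a frequent dependence through synchronization can be sketched as follows. This is a minimal sketch, assuming a loop in which every epoch updates one running scalar: instead of speculating on that scalar, each epoch waits for its predecessor to forward the value. The function `run_with_forwarding` and the use of Python threads with `threading.Event` are illustrative assumptions, not a mechanism described in the talk.

```python
import threading


def run_with_forwarding(n_epochs):
    """Each epoch adds i*i to a running total, a dependence that occurs
    in every epoch. Epoch i waits for the value forwarded by epoch i-1."""
    ready = [threading.Event() for _ in range(n_epochs + 1)]
    forwarded = [0] * (n_epochs + 1)
    ready[0].set()                     # the initial value is available

    def epoch(i):
        independent = i * i            # work that can overlap across epochs
        ready[i].wait()                # synchronize: wait for the predecessor
        forwarded[i + 1] = forwarded[i] + independent
        ready[i + 1].set()             # signal the successor

    threads = [threading.Thread(target=epoch, args=(i,))
               for i in range(n_epochs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return forwarded[n_epochs]
```

Only the update of the scalar is serialized; the independent part of each epoch can still run in parallel, and no epoch is ever squashed because of the scalar.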

22
Outline
  • Motivation of Thread-level Speculation
  • Design Issues
  • Case study
  • Summary

23
Summary
  • TLS enables the compiler to create parallel threads
    despite uncertainty about actual dependences
  • Dependences are detected at run time
  • Several implementations: single-chip vs. scalable,
    write-through vs. writeback
  • Never rely on TLS alone: combine it with
    synchronization and superscalar mechanisms

24
( Thanks )