Cache Coherence Schemes for Multiprocessors

Provided by: OPE169

Learn more at: https://lass.cs.umass.edu

1
Cache Coherence Schemes for Multiprocessors
Sivakumar M, Osman Unsal
2

Consistency
Different Directory Schemes
Comparison of Directory Schemes
Hierarchical Directory Scheme (in detail)
Referred Papers
Directory-Based Cache Coherence in Large-Scale Multiprocessors, David Chaiken, Craig Fields, Kiyoshi Kurihara and Anant Agarwal
A Survey of Cache Coherence Schemes for Multiprocessors, Per Stenström
Cache Consistency and Sequential Consistency, James R. Goodman
LimitLESS Directories: A Scalable Cache Coherence Scheme, David Chaiken, John Kubiatowicz and Anant Agarwal
A Hierarchical Directory Scheme for Large-Scale Cache-Coherent Multiprocessors, a dissertation by Yeong-Chang Maa

3
CONSISTENCY

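Strict Consistency
Any read to memory location X returns the value stored by the most recent write operation to X.
(a) P1: W(x)1                (b) P1: W(x)1
    P2:       R(x)1              P2:       R(x)0 R(x)1
With time running left to right and P2's reads issued after P1's write, (a) is strictly consistent and (b) is not, because the first read in (b) returns the stale value 0.
Sequential Consistency = Program order + Memory coherence
The result of any execution is the same as if the operations of all processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.
(a) P1: W(x)1                (b) P1: W(x)1
    P2:       R(x)0 R(x)1        P2:       R(x)1 R(x)1
Both (a) and (b) are sequentially consistent: each result matches some interleaving that respects program order.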
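The definition of sequential consistency above can be tested mechanically for small histories. The following is a minimal sketch, not taken from the slides: it assumes every location starts at 0, encodes each operation as a hypothetical `(kind, variable, value)` tuple, and uses a brute-force search that is only practical for slide-sized examples. The function name `is_sequentially_consistent` and the `backwards` history are illustrative.

```python
def is_sequentially_consistent(histories, initial=0):
    """Return True if one interleaving of all operations explains every read.

    histories: one list per processor, in program order. Each operation is
    a tuple (kind, variable, value) with kind "W" (write) or "R" (read).
    """
    def search(positions, memory):
        if all(pos == len(h) for pos, h in zip(positions, histories)):
            return True  # every operation has been placed in the order
        for p, history in enumerate(histories):
            pos = positions[p]
            if pos == len(history):
                continue  # processor p has no operations left
            kind, var, value = history[pos]
            if kind == "R":
                if memory.get(var, initial) != value:
                    continue  # this read cannot be the next operation
                next_memory = memory
            else:
                next_memory = {**memory, var: value}
            next_positions = positions[:p] + (pos + 1,) + positions[p + 1:]
            if search(next_positions, next_memory):
                return True
        return False

    return search(tuple(0 for _ in histories), {})


# The two histories shown under Sequential Consistency.
left = [[("W", "x", 1)], [("R", "x", 0), ("R", "x", 1)]]
right = [[("W", "x", 1)], [("R", "x", 1), ("R", "x", 1)]]
# Hypothetical counterexample: P2 sees the new value, then the old one.
backwards = [[("W", "x", 1)], [("R", "x", 1), ("R", "x", 0)]]

print(is_sequentially_consistent(left))       # True
print(is_sequentially_consistent(right))      # True
print(is_sequentially_consistent(backwards))  # False
```

Both slide histories pass because some interleaving explains them. The `backwards` history fails because no single order lets x return to 0 after the only write.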

4
CONSISTENCY

Causal Consistency
Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.
P1: W(x)1 W(x)3
P2: R(x)1 W(x)2
P3: R(x)1 R(x)3 R(x)2
P4: R(x)1 R(x)2 R(x)3
Here W(x)2 and W(x)3 are concurrent, so P3 and P4 may see them in different orders.
PRAM Consistency
Writes done by a single process are received by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.
Processor Consistency
For every memory location X, there should be a
global agreement about the order of writes to X
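The PRAM rule on this slide can also be checked by brute force. This is a sketch under stated assumptions, not the presenters' method: locations start at 0, operations are hypothetical `(kind, variable, value)` tuples, and each processor's view consists of its own operations plus the writes of every other processor, each kept in issue order. The names `has_legal_order`, `is_pram_consistent`, `slide_history` and `reordered` are illustrative.

```python
def has_legal_order(sequences, initial=0):
    """True if the sequences can be merged, each kept in its own order,
    so that every read returns the most recently written value."""
    def search(positions, memory):
        if all(pos == len(s) for pos, s in zip(positions, sequences)):
            return True
        for i, seq in enumerate(sequences):
            pos = positions[i]
            if pos == len(seq):
                continue
            kind, var, value = seq[pos]
            if kind == "R" and memory.get(var, initial) != value:
                continue  # the read would return the wrong value here
            next_memory = memory if kind == "R" else {**memory, var: value}
            next_positions = positions[:i] + (pos + 1,) + positions[i + 1:]
            if search(next_positions, next_memory):
                return True
        return False

    return search(tuple(0 for _ in sequences), {})


def is_pram_consistent(histories):
    """Check each processor's view independently: its own operations plus
    every other processor's writes, with each writer's order preserved."""
    for p, own in enumerate(histories):
        others_writes = [[op for op in h if op[0] == "W"]
                         for q, h in enumerate(histories) if q != p]
        if not has_legal_order([own] + others_writes):
            return False
    return True


# The four-processor history from the Causal Consistency example.
slide_history = [
    [("W", "x", 1), ("W", "x", 3)],
    [("R", "x", 1), ("W", "x", 2)],
    [("R", "x", 1), ("R", "x", 3), ("R", "x", 2)],
    [("R", "x", 1), ("R", "x", 2), ("R", "x", 3)],
]
# Hypothetical violation: P2 sees P1's two writes in reverse order.
reordered = [
    [("W", "x", 1), ("W", "x", 2)],
    [("R", "x", 2), ("R", "x", 1)],
]

print(is_pram_consistent(slide_history))  # True
print(is_pram_consistent(reordered))      # False
```

The slide history passes because P3 and P4 each see the writes of P1 in issue order, and they disagree only about where the write by P2 falls.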

5
CONSISTENCY

Weak Consistency
Accesses to synchronization variables are sequentially consistent.
No access to a synchronization variable is allowed until all previous writes have completed everywhere.
No data access is allowed until all previous accesses to synchronization variables have been performed.
Release Consistency
Barrier synchronization; Acquire and Release
Acquire and Release accesses should be processor consistent
Lazy and eager release consistency
Entry Consistency
Locks for each shared variable or element

6
Directory based cache coherence

7
Directory Schemes

8
Directory Schemes

Comparison of full-mapped, limited, and chained schemes
Metric: processor utilization
Utilization depends on the frequency of memory references and the latency of the memory system.
Latency depends on topology, speed, number of processors, memory access latency, and the frequency and size of messages.
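The dependence of utilization on reference frequency and latency can be illustrated with a simple blocking-processor model. This model is an assumption made for illustration and is not taken from the slides: the processor does useful work for 1/m cycles between memory requests and then stalls for T cycles, which gives U = 1 / (1 + m T). The parameter names are hypothetical.

```python
def processor_utilization(requests_per_cycle, latency_cycles):
    """Utilization of a processor that stalls on every memory request.

    Assumed model: 1/m busy cycles followed by T stall cycles per request,
    so U = (1/m) / (1/m + T) = 1 / (1 + m * T).
    """
    return 1.0 / (1.0 + requests_per_cycle * latency_cycles)


# Same request rate, increasing memory-system latency.
for latency in (10, 50, 200):
    u = processor_utilization(0.02, latency)
    print(f"latency={latency:3d} cycles  utilization={u:.2f}")
```

Under this model, anything that raises latency (more processors, larger messages, a slower network) lowers utilization unless the request rate drops to compensate.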

9
Directory Schemes

10
Directory Schemes

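Coarse Vector (Dir_i CV_r)
Initially behaves as a limited directory with i pointers
On pointer overflow, switches to a coarse bit vector in which each bit stands for a group of r processors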
Dir0B
2 status bits encode 4 states:
Absent
Present1: present and clean in only one cache
Present*: present and clean in more than one cache
PresentM: present and dirty in only one cache
LimitLESS Directory Scheme
Combination of hardware and software techniques
Realizes the performance of a full-map directory
Keeps the memory overhead of a limited directory
Sectored Directory (Dir_N/L)
L sub-blocks share the directory
Overhead is MN/L
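The storage cost of the schemes on this slide can be compared with a rough per-block estimate. This is a sketch under stated assumptions, not figures from the slides: a full-map entry is taken as N presence bits, a limited entry as i pointers of ceil(log2 N) bits each, Dir0B as its 2 status bits, and a sectored entry as N bits shared by L sub-blocks. Extra state bits are ignored, and the function and scheme names are hypothetical.

```python
def pointer_width(n_procs):
    """Bits needed to name one of n_procs processors (at least 1)."""
    return max(1, (n_procs - 1).bit_length())


def directory_bits_per_block(scheme, n_procs, pointers=1, sub_blocks=1):
    """Approximate directory bits charged to one memory block."""
    if scheme == "full_map":
        return n_procs                       # one presence bit per processor
    if scheme == "limited":
        return pointers * pointer_width(n_procs)
    if scheme == "dir0b":
        return 2                             # status bits only, no pointers
    if scheme == "sectored":
        return n_procs / sub_blocks          # N bits shared by L sub-blocks
    raise ValueError(f"unknown scheme: {scheme}")


n = 64
print(directory_bits_per_block("full_map", n))                # 64
print(directory_bits_per_block("limited", n, pointers=4))     # 24
print(directory_bits_per_block("dir0b", n))                   # 2
print(directory_bits_per_block("sectored", n, sub_blocks=4))  # 16.0
```

If M is read as the number of memory blocks, multiplying the sectored figure by M reproduces the MN/L overhead quoted above. The slides do not define M, so that reading is an assumption.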