Title: CS 7103 Advanced Operating Systems Louisiana State University Rajgopal Kannan
1 Distributed Shared Memory
CS 7103 Advanced Operating Systems
Louisiana State University
Rajgopal Kannan
2 Classification of Multiprocessor Memory
1. Uniform Memory Access (UMA)
[Figure: UMA organization. Labels: PE, Cache, Network, MM]
2. Non-Uniform Memory Access (NUMA)
[Figure: NUMA organization. Labels: PE, Cache, MM, Network]
3 Mapping Virtual Memory Space
[Figure: a virtual memory space mapped onto several memory modules (MM). Labels: Page Table Entry with fields MM, Frame, Offset]
4 Performance and Transparency
- Temporal and spatial locality of reference may be exploited using caches.
- Scalability may be improved by using a better network and/or caches.
- DSM helps the application programmer by providing communication transparency.
- Programmers are more familiar with shared memory. Furthermore, much software is already available for single-processor and shared-memory multiprocessor systems.
5 Data Placement, Migration and Replication
The performance of a DSM may be improved by:
1. placing a data block in an appropriate memory module,
2. migrating a data block to an appropriate memory module, and
3. replicating a data block.
The performance of migration and replication strategies is measured by hit ratio. Block-migration strategies suffer from the block-bouncing problem. Both block-migration and block-replication strategies suffer from the problem of false sharing.
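Block bouncing and false sharing can be illustrated with a toy migration counter. This is only a minimal sketch under stated assumptions: the function name `count_migrations`, the trace format, and the block sizes are hypothetical and chosen for the illustration, not taken from any real DSM.

```python
def count_migrations(accesses, block_size):
    """Count block migrations under a migrate-on-access policy.

    accesses: list of (processor, address) pairs in global order.
    A block lives on exactly one processor; touching a block that is
    held elsewhere moves it, which counts as one migration.
    """
    owner = {}          # block number -> processor currently holding it
    migrations = 0
    for proc, addr in accesses:
        block = addr // block_size
        if block in owner and owner[block] != proc:
            migrations += 1          # the block bounces to the new accessor
        owner[block] = proc
    return migrations


# P1 only touches address 0 and P2 only touches address 1,
# so the two processors never truly share data.
trace = [("P1", 0), ("P2", 1)] * 4

false_sharing = count_migrations(trace, block_size=2)   # both addresses in one block
no_sharing = count_migrations(trace, block_size=1)      # one address per block
```

With both addresses in the same block, every access after the first forces a migration even though the data is unrelated; with one address per block the bouncing disappears.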
6 Memory Consistency Models
- In a distributed system, it is not possible to enforce uniprocessor-like coherency for shared data items.
- There is no global clock. Therefore, it is not possible to determine the latest write.
- There will always be (non-deterministic) delays.
Hence we use some weaker consistency model for shared data in a distributed environment. The programmer is expected to know what kind of consistency is guaranteed and act accordingly.
7 General Access Consistency Models
- If the user cannot give any synchronization information, then general access consistency models are used. The general access consistency models are:
  - Atomic consistency (the usual uniprocessor consistency; a read never gets stale data)
  - Sequential consistency
  - Causal consistency
  - Processor consistency
  - Slow memory consistency
8 Sequential Consistency
The operations of all processors are executed in some sequential order, and the operations of each individual processor appear in that sequence in the order specified by its program.
P1: W(X)1
P2: W(Y)2
P3: R(Y)2 R(X)0 R(X)1
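For small histories the definition can be checked by brute force. The sketch below searches for one interleaving that preserves every program order and makes every read return the most recent write. The function name, the tuple encoding of operations, and the initial value of 0 for every variable are assumptions of this illustration.

```python
def is_sequentially_consistent(programs, initial=0):
    """Brute-force search for a legal interleaving.

    programs: one list per processor of (op, var, value) tuples,
    with op "W" or "R", given in program order.
    Returns True if some interleaving that preserves each program
    order lets every read return the latest write (or `initial`).
    """
    def search(positions, memory):
        if all(p == len(prog) for p, prog in zip(positions, programs)):
            return True                      # every operation was placed
        for i, prog in enumerate(programs):
            if positions[i] == len(prog):
                continue                     # processor i is finished
            op, var, value = prog[positions[i]]
            nxt = positions[:i] + (positions[i] + 1,) + positions[i + 1:]
            if op == "W":
                if search(nxt, {**memory, var: value}):
                    return True
            elif memory.get(var, initial) == value:
                if search(nxt, memory):
                    return True
        return False

    return search(tuple(0 for _ in programs), {})


slide_history = [
    [("W", "X", 1)],                                   # P1
    [("W", "Y", 2)],                                   # P2
    [("R", "Y", 2), ("R", "X", 0), ("R", "X", 1)],     # P3
]
slide_is_sc = is_sequentially_consistent(slide_history)
```

One witness order for the history above is W(Y)2, R(Y)2, R(X)0, W(X)1, R(X)1. The search is exponential in the number of operations, so it is only suitable for examples of this size.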
9 Causal Consistency
Only causally related writes must appear in the same order to all processors.
P1: W(X)1 W(X)3
P2: R(X)1 W(X)2
P3: R(X)1 R(X)3 R(X)2
P4: R(X)1 R(X)2 R(X)3
Causally consistent but not sequentially consistent.
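Why this history is not sequentially consistent can be seen by comparing what the two readers observe. The sketch below assumes each value is written to X exactly once, so a value identifies its write; the helper names `seen_before` and `conflicting_pairs` are hypothetical.

```python
from itertools import combinations


def seen_before(values):
    """All pairs (a, b) such that value a was read before value b."""
    return {(a, b) for a, b in combinations(values, 2) if a != b}


def conflicting_pairs(views):
    """Pairs of written values that two readers observed in opposite orders."""
    conflicts = set()
    for (_, v1), (_, v2) in combinations(views.items(), 2):
        for a, b in seen_before(v1):
            if (b, a) in seen_before(v2):
                conflicts.add(frozenset((a, b)))
    return conflicts


# Values of X read by P3 and P4 on the slide, in the order they were read.
views = {"P3": [1, 3, 2], "P4": [1, 2, 3]}
conflicts = conflicting_pairs(views)
```

P3 and P4 disagree on the order of the writes of 2 and 3, so no single sequential order can explain both. Causal consistency tolerates this because W(X)2 and W(X)3 are concurrent: P2 had read only the value 1 before writing 2, and P1 wrote 3 without reading 2.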
10 Processor Consistency
Writes from the same processor must appear in the order they were issued.
P1: W(X)1
P2: R(X)1 W(X)2
P3: R(X)1 R(X)2
P4: R(X)2 R(X)1
Processor consistent, but neither causally consistent nor sequentially consistent.
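The per-processor ordering rule can be written as a small check on what a reader observes. This is a simplified sketch for a single location, assuming every written value is unique; `respects_issue_order` is a hypothetical helper name.

```python
def respects_issue_order(writes_by_proc, observed):
    """True if `observed` never shows two writes of one processor out of order.

    writes_by_proc: processor -> values in the order it wrote them
    observed: the values one reader saw, in the order it read them
    """
    for issued in writes_by_proc.values():
        rank = {value: i for i, value in enumerate(issued)}
        seen = [rank[v] for v in observed if v in rank]
        if seen != sorted(seen):
            return False
    return True


writes = {"P1": [1], "P2": [2]}                    # the history on this slide
p3_ok = respects_issue_order(writes, [1, 2])       # P3 reads 1 then 2
p4_ok = respects_issue_order(writes, [2, 1])       # P4 reads 2 then 1

# A made-up reader that sees two writes of the same processor backwards.
reordered = respects_issue_order({"P1": [1, 3]}, [3, 1])
```

Both P3 and P4 pass because 1 and 2 come from different processors. P4's view still breaks causal consistency, since P2 read 1 before writing 2, which makes W(X)2 causally dependent on W(X)1.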
11 Slow Memory Consistency
Only writes from the same processor to the same location (memory address) must appear in the order they were issued.
P1: W(X)1 W(Y)0 W(Y)2 W(X)3
P2: R(X)1 R(X)3 R(Y)0
Slow memory consistent, but not processor consistent.
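The difference between the two rules can be made concrete with two small checks on this history. This is a sketch assuming a single writer and unique (variable, value) pairs; the function names are hypothetical.

```python
def slow_memory_ok(writes, observed):
    """Per-location check: reads of each variable follow the issue order."""
    rank = {write: i for i, write in enumerate(writes)}
    last_seen = {}                       # variable -> rank of latest write seen
    for var, value in observed:
        r = rank[(var, value)]
        if r < last_seen.get(var, -1):
            return False                 # an older write to var seen after a newer one
        last_seen[var] = r
    return True


def processor_order_ok(writes, observed):
    """Whole-processor check: reads follow the issue order across all variables."""
    ranks = [writes.index(read) for read in observed]
    return ranks == sorted(ranks)


p1_writes = [("X", 1), ("Y", 0), ("Y", 2), ("X", 3)]
p2_reads = [("X", 1), ("X", 3), ("Y", 0)]

slow_ok = slow_memory_ok(p1_writes, p2_reads)
proc_ok = processor_order_ok(p1_writes, p2_reads)
```

P2 has already seen W(X)3, which P1 issued after W(Y)2, yet it still reads the older value 0 from Y. That is acceptable per location but not per processor.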
12 Synchronization Access Consistency Models
- The user controls access to ordinary shared variables by using semaphores or other explicit synchronizing variables. The system provides some consistency only for the synchronizing variables.
- Synchronization access consistency models are:
  - Weak consistency
  - Release consistency
  - Entry consistency
13 Weak Consistency
Access to a synchronization variable is synch(S).
- Accesses to all synchronizing variables are sequentially consistent.
- A processor must not access a synchronizing variable while it has a pending access to any data.
- A processor must not access any variable while it has a pending access to a synchronizing variable.
14 Release Consistency
Accesses to a synchronization variable are acquire(S) and release(S).
- Accesses to all synchronizing variables are processor consistent.
- Accesses to all shared variables between acquire and release are exclusive.
- Changes made to shared variables are made visible to the outside world after release.
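The acquire/release discipline can be sketched as a store that buffers writes locally and publishes them only at release. This is a toy, single-process illustration: the class `ReleaseConsistentStore` and its method names are hypothetical, and a `threading.Lock` stands in for the synchronization variable S.

```python
import threading


class ReleaseConsistentStore:
    """Writes made between acquire() and release() become visible at release()."""

    def __init__(self):
        self._shared = {}                # state that other processors can see
        self._pending = {}               # writes buffered by the current holder
        self._lock = threading.Lock()    # plays the role of S

    def acquire(self):
        self._lock.acquire()             # exclusive access until release()
        self._pending = {}

    def write(self, var, value):
        self._pending[var] = value       # buffered, not yet visible outside

    def release(self):
        self._shared.update(self._pending)   # publish all buffered changes
        self._pending = {}
        self._lock.release()

    def visible(self, var, default=None):
        return self._shared.get(var, default)


store = ReleaseConsistentStore()
store.acquire()
store.write("x", 1)
before_release = store.visible("x")
store.release()
after_release = store.visible("x")
```

Before the release an outside observer still sees no value for x; after the release it sees 1.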
15 Entry Consistency
Accesses to a synchronization variable are acquire(X) and release(X).
- Accesses to all synchronizing variables are processor consistent.
- Accesses to the shared variable X between acquire and release are exclusive.
- Changes made to X are made visible to the outside world after release.
16 Multiprocessor Cache System
[Figure: a master copy carrying an E bit and a <processor bits> field, and three replicated blocks, each carrying V and E bits. V: valid/invalid; E: exclusive]
17 Four Events in Cache Management
- Any cache manager faces the following events and must take appropriate actions to maintain functionality:
  - Read-hit
  - Read-miss
  - Write-hit
  - Write-miss
The performance of a cache management algorithm is judged by both its hit ratios and the overhead incurred at each of the four events.
18 Write-Invalidate Consistency Protocol
Event | Action by processor | Action by directory
19 Write-Update Consistency Protocol
Event | Action by processor | Action by directory
20 Snooping Cache
If the communication medium is a broadcast medium such as a bus, there is no need for a directory, as all processors can monitor all cache-related messages and take appropriate actions. Clearly, such systems can implement sequential consistency. However, the bus is a potential bottleneck and a single point of failure. Several approaches are taken to alleviate the problem:
1. Cache traffic is carried on a separate bus.
2. Multiple buses in a hierarchical configuration.
3. Multistage interconnection networks (MIN) made of buses.
21 Distributed Shared Memory
In a DSM system, memory modules are strictly local and communication is only via high-latency links. Therefore, such systems simulate/emulate shared memory by message passing.
- A process in such a system finds its address space distributed across the memories of different processors. Therefore, it has three options when accessing a non-local memory page:
  - Remote access
  - Migration
  - Replication
22 DSM Management Algorithms
- With three access mechanisms and two access types (read and write), there are nine possible DSM management algorithms. Of the nine, the following four are the more interesting:
  - Read-remote-write-remote
  - Read-migrate-write-migrate
  - Read-replicate-write-migrate
  - Read-replicate-write-replicate
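The count of nine follows from choosing one of three mechanisms for reads and, independently, one of three for writes (3 x 3). A short enumeration makes this explicit; the naming scheme below simply mirrors the slide's labels in lower case.

```python
from itertools import product

MECHANISMS = ("remote", "migrate", "replicate")

# One mechanism for reads combined with one mechanism for writes.
algorithms = [f"read-{r}-write-{w}" for r, w in product(MECHANISMS, repeat=2)]

# The four combinations singled out on the slide.
highlighted = {
    "read-remote-write-remote",
    "read-migrate-write-migrate",
    "read-replicate-write-migrate",
    "read-replicate-write-replicate",
}
```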
23 Read-Remote-Write-Remote
- One or more servers serve the memory pages.
- Simple to implement.
- If implemented with a single server, the system is always sequentially consistent, provided that the server serializes request and response services.
- High latency.
- Servers are potential bottlenecks.
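A single serializing server can be sketched in a few lines. This is an in-process toy, assuming a lock in place of a real request queue and network; the class `CentralMemoryServer` and its log format are hypothetical.

```python
import threading


class CentralMemoryServer:
    """Single server that owns every page and serializes all requests."""

    def __init__(self):
        self._pages = {}                 # page number -> contents
        self._lock = threading.Lock()    # one request is served at a time
        self.log = []                    # the global order the server imposed

    def read(self, client, page):
        with self._lock:
            value = self._pages.get(page, 0)
            self.log.append((client, "R", page, value))
            return value

    def write(self, client, page, value):
        with self._lock:
            self._pages[page] = value
            self.log.append((client, "W", page, value))


server = CentralMemoryServer()
server.write("P1", 7, "a")
seen_by_p2 = server.read("P2", 7)
```

Because every operation passes through the one lock, `server.log` is itself a sequential order of all operations, which is the reason the single-server scheme is sequentially consistent. The same lock is also the bottleneck the slide warns about.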
24 Read-Migrate-Write-Migrate
- A memory page migrates to the accessing processor's memory upon access.
- The coherence problem need not be handled separately.
- Exploits program locality well.
- Upon migration, all processors sharing the page need to update their virtual-page-to-physical-block mapping.
- The ping-pong effect is a problem that is made more severe by false sharing. It may be handled by a smaller page size (associated with high overhead) or by a combination of the migrate, replicate, and remote-access strategies.
25 Read-Replicate-Write-Migrate
- Same as the write-invalidate strategy for cache management.
- Popular because much software uses multiple-reader/exclusive-writer semantics.
- Performs well when reads are the dominant operation.
- In the absence of hardware broadcast support, a write becomes a costly operation.
- Maintains strong consistency.
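The behaviour described above can be sketched for a single page. This is an illustrative model only: the class `WriteInvalidatePage` is hypothetical, message passing is replaced by direct dictionary updates, and the invalidation counter stands in for the cost of a write when no broadcast is available.

```python
class WriteInvalidatePage:
    """One page under read-replicate / write-migrate (write-invalidate)."""

    def __init__(self, owner, value=0):
        self.owner = owner
        self.copies = {owner: value}     # processor -> its local copy
        self.invalidations = 0           # invalidation messages sent so far

    def read(self, proc):
        if proc not in self.copies:      # read miss: fetch a replica from the owner
            self.copies[proc] = self.copies[self.owner]
        return self.copies[proc]

    def write(self, proc, value):
        # Invalidate every other copy; the page then lives only at the writer.
        self.invalidations += sum(1 for p in self.copies if p != proc)
        self.copies = {proc: value}
        self.owner = proc


page = WriteInvalidatePage("P1", 0)
page.read("P2")
page.read("P3")
page.write("P2", 5)          # invalidates the copies at P1 and P3
value_at_p1 = page.read("P1")
```

Reads after the first are served locally, which is why the scheme does well on read-dominated workloads; each write costs one invalidation per outstanding replica.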
26 Read-Replicate-Write-Replicate
- Same as the write-update strategy for cache management.
- Performs well when reads are not the dominant operation.
- In the absence of hardware broadcast support, a write becomes a very costly operation.
- Maintains strong consistency only when used with a suitable protocol such as two-phase commit.
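A matching sketch for write-update keeps every replica and pushes each new value to all of them. The class `WriteUpdatePage` is hypothetical, and updates here are applied instantly in one process, so the ordering problem that calls for a protocol such as two-phase commit is not modelled.

```python
class WriteUpdatePage:
    """One page under read-replicate / write-replicate (write-update)."""

    def __init__(self, value=0):
        self.master = value
        self.copies = {}                 # processor -> its local copy
        self.update_messages = 0         # update messages sent so far

    def read(self, proc):
        if proc not in self.copies:      # read miss: create a replica
            self.copies[proc] = self.master
        return self.copies[proc]

    def write(self, proc, value):
        self.master = value
        self.copies[proc] = value
        for other in self.copies:        # one update message per other replica
            if other != proc:
                self.copies[other] = value
                self.update_messages += 1


page = WriteUpdatePage(0)
page.read("P1")
page.read("P2")
page.write("P1", 9)
value_at_p2 = page.read("P2")
```

Replicas are never discarded, so a later read at P2 is a local hit, at the price of one update message per replica on every write.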
27 Software Based Replication/Migration Management
- Two problems need to be solved:
  - Finding the current owner of a page (if there is migration).
  - Finding the set of processors with a copy of the page (if there is replication).
28 Finding Owner
- If there is a master owner, it may keep track of the current owner.
- Otherwise, the following data structure may be used.
[Figure: a per-processor table with columns Block | Probable owner]
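One way such a probable-owner table might be used is to follow the hints from node to node until a node names itself as owner. The sketch below is an assumption-laden illustration: `find_owner` and the node and block names are hypothetical, the hint chain is assumed to contain no cycles, and re-pointing visited nodes at the real owner is a design choice of this sketch rather than something stated on the slide.

```python
def find_owner(probable_owner, start, block):
    """Follow probable-owner hints from `start` until a node names itself.

    probable_owner: node -> {block: node it believes owns the block}
    Returns (owner, hops). Every node visited on the way is re-pointed
    at the real owner, which shortens later searches.
    """
    path = []
    node = start
    while probable_owner[node][block] != node:
        path.append(node)
        node = probable_owner[node][block]
    for visited in path:
        probable_owner[visited][block] = node
    return node, len(path)


hints = {
    "A": {"b0": "B"},    # A believes B owns block b0
    "B": {"b0": "C"},    # B believes C owns it
    "C": {"b0": "C"},    # C is the actual owner
}
first = find_owner(hints, "A", "b0")
second = find_owner(hints, "A", "b0")
```

The first lookup needs two forwarding hops; after the hints are refreshed, the second lookup reaches the owner in one.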
29 Finding Copy-set
[Figure: two copy-set organizations. Spanning tree: nodes with From and Tos fields (From may be NIL). Linked list: nodes with Head, master, and next fields. Message labels: Update/Invalidate, Request, Forwarded, Acknowledgement, Requestor]
30 Classification of DSMs
- Distributed shared memory systems are classified on the basis of the following criteria:
  - Implementation: Hardware, Software, or Hybrid
  - Architecture configuration or interconnection topology: Bus, Ring, Cube, MIN, etc.
  - Shared data organization: Structured (objects, higher-level types, etc.) or Non-structured
  - Granularity/coherence unit: Word, Cache, Block, Page, Object, etc.
  - DSM algorithms: SRSW, MRSW, MRMW
  - Management responsibility: Distributed or Centralized
  - Consistency model
  - Coherence protocol
31 LAM, A DSM
[Figure: LAM architecture. Labels: four Processors, four Caches, Network Interconnect, Local Bus (Read/Write), LB to VB Interface, Node Local Memory (Lock Space, Control Space, Data Space), RM Bus (Write), VME Bus, I/O]
32 LAM Properties
- Implementation: Hybrid, reflective memory library routines
- Architecture configuration or interconnection topology: Hierarchical bus
- Shared data organization: Structured (data structure)
- Granularity/coherence unit: Data structure
- DSM algorithms: MRMW
- Management responsibility: Distributed
- Consistency model: Entry consistency
- Coherence protocol: Write-update