Title: Distributed Operating Systems CS551
1. Distributed Operating Systems CS551
- Colorado State University
- at Lockheed-Martin
- Lecture 4 -- Spring 2001
2. CS551 Lecture 4
- Topics
- Memory Management
- Simple
- Shared
- Distributed
- Migration
- Concurrency Control
- Mutex and Critical Regions
- Semaphores
- Monitors
3. Centralized Memory Management
- Review
- Memory cache, RAM, auxiliary
- Virtual Memory
- Pages and Segments
- Internal/External Fragmentation
- Page Replacement Algorithm
- Page Faults => Thrashing
- FIFO, NRU, LRU
- Second Chance Lazy (Dirty Pages)
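As a refresher on the replacement policies listed above, the following is a minimal sketch that counts page faults under FIFO and LRU. The function names and the reference string are illustrative assumptions, not taken from the lecture or from Galli.

```python
from collections import OrderedDict, deque

def fifo_faults(refs, frames):
    """Count page faults when the oldest resident page is always evicted."""
    resident = set()
    order = deque()               # oldest page sits at the left
    faults = 0
    for page in refs:
        if page in resident:
            continue              # hit: FIFO ignores recency
        faults += 1
        if len(resident) == frames:
            resident.remove(order.popleft())
        resident.add(page)
        order.append(page)
    return faults

def lru_faults(refs, frames):
    """Count page faults when the least recently used page is evicted."""
    resident = OrderedDict()      # least recently used page comes first
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)    # hit: refresh recency
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)  # evict least recently used
        resident[page] = True
    return faults

# Hypothetical reference string (the classic one used to show Belady's anomaly).
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
print(lru_faults(refs, 3), lru_faults(refs, 4))    # 10 8
```

With this reference string FIFO takes more faults with four frames than with three, while LRU improves as frames are added.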
4. Figure 4.1 Fragmentation in Page-Based Memory versus a Segment-Based Memory. (Galli, p. 83)
5. Figure 4.2 Algorithms for Choosing Segment Location. (Galli, p. 84)
6. Simple Memory Model
- Used in parallel UMA systems
- Access times equal for all processors
- Too many processors
- => thrashing
- => need for lots of memory
- High performance parallel computers
- May not use cache -- to avoid overhead
- May not use virtual memory
7. Shared Memory Model
- Shared memory can be a means of interprocess communication
- Virtual memory with multiple physical memories, caches, and secondary storage
- Easy to partition data for parallel processing
- Easy migration for load balancing
- Example systems
- Amoeba: shared segments on same system
- Unix System V: sys/shm.h
8. Shared Memory via Bus
[Diagram: processors P1 through P10 connected by a bus to a single Shared Memory]
9. Shared Memory Disadvantages
- All processors read/write common memory
- Requires concurrency control
- Processors may be linked by a bus
- Too much memory activity may cause bus contention
- Bus can be a bottleneck
- Each processor may have own cache
- => cache coherency (consistency) problems
- Snoopy (snooping) cache is a solution
10. Bused Shared Memory w/Caches
[Diagram: processors P1 through P10, each with its own cache, connected by a bus to a single Shared Memory]
11. Shared Memory Performance
- Try to overlap communication and computation
- Try to prefetch data from memory
- Try to migrate processes to processors that hold needed data in local memory
- Page scanner
- Bused shared memory does not scale well
- More processors => bus contention
- Faster processors => bus contention
12. Figure 4.3 Snoopy Cache. (Galli, p. 89)
13. Cache Coherency (Consistency)
- Want local caches to have consistent data
- If two processor caches contain same data, the data should have the same value
- If not, caches are not coherent
- But what if one/both processors change the data value?
- Mark modified cache value as dirty
- Snoopy cache picks up new value as it is written to memory
14. Cache Consistency Protocols
- Write-through protocol
- Write-back protocol
- Write-once protocol
- Cache block: invalid, dirty, or clean
- Cache ownership
- All caches snoop
- Protocol part of MMU
- Performs within a memory cycle
15. Write-through protocol
- Read miss
- Fetch data from memory to cache
- Read hit
- Fetch data from local cache
- Write miss
- Update data in memory and store in cache
- Write hit
- Update memory and cache
- Other local processors invalidate cache entry
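The four write-through cases above can be sketched as a small simulation. This is a toy model under stated assumptions: the `Bus` and `WriteThroughCache` classes are hypothetical names, snooping is modelled as a direct loop over the other caches, and a missing memory word is assumed to read as 0.

```python
class Bus:
    """Shared bus: owns main memory and knows every cache that snoops it."""
    def __init__(self):
        self.memory = {}     # address -> value
        self.caches = []

class WriteThroughCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}      # address -> value, valid entries only
        bus.caches.append(self)

    def read(self, addr):
        if addr in self.lines:                  # read hit: serve from local cache
            return self.lines[addr]
        value = self.bus.memory.get(addr, 0)    # read miss: fetch from memory
        self.lines[addr] = value
        return value

    def write(self, addr, value):
        # Write hit or write miss: update memory and the local cache.
        self.bus.memory[addr] = value
        self.lines[addr] = value
        # Other caches snoop the write and invalidate their entry.
        for other in self.bus.caches:
            if other is not self:
                other.lines.pop(addr, None)

bus = Bus()
c1, c2 = WriteThroughCache(bus), WriteThroughCache(bus)
bus.memory[0x10] = 1
c1.read(0x10)
c2.read(0x10)        # both caches now hold address 0x10
c1.write(0x10, 7)    # c2's copy is invalidated
print(c2.read(0x10)) # read miss in c2, refetched from memory
```

Because every write goes to memory, a cache that was invalidated always finds the current value on its next read miss.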
16. Distributed Shared Memory
- NUMA
- Global address space
- All memories together form one global memory
- True multiprocessors
- Maintains directory service
- NORMA
- Specialized message-passing network
- Example: workstations on a LAN
17. Distributed Shared Memory
[Diagram: processors P1 through P11, each with its own local memory]
18. How to distribute shared data?
- How many readers and writers are allowed for a given set of data?
- Two approaches
- Replication
- Data copied to different processors that need it
- Migration
- Data moved to different processors that need it
19. Single Reader / Single Writer
- No concurrent use of shared data
- Data use may be a bottleneck
- Static
20. Multiple Reader / Single Writer
- Readers may have an invalid copy after the writer writes a new value
- Protocol must have an invalidation method
- Copy set: list of processors that have a copy of a memory location
- Implementation
- centralized, distributed, or combination
21. Centralized MR/SW
- One server
- Processes all requests
- Maintains all data and data locations
- Increases traffic near server
- Potential bottleneck
- Server must perform more work than others
- Potential bottleneck
22. Figure 4.4 Centralized Server for Multiple Reader/Single Writer DSM. (Galli, p. 92)
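A centralized MR/SW server with a copy set and invalidation might look roughly like the sketch below. The `CentralServer` class, its method names, and the `invalidations` log (standing in for messages sent to readers) are assumptions made for illustration; they are not Galli's algorithm as given in Figure 4.4.

```python
class CentralServer:
    """Single server that holds all data and tracks who has copies."""
    def __init__(self):
        self.store = {}          # data name -> current value
        self.copy_set = {}       # data name -> set of nodes holding a copy
        self.invalidations = []  # log of (node, name) invalidation messages

    def read(self, node, name):
        # Record the reader in the copy set, then hand out the value.
        self.copy_set.setdefault(name, set()).add(node)
        return self.store.get(name)

    def write(self, node, name, value):
        # Invalidate every other copy before the new value takes effect.
        for reader in self.copy_set.get(name, set()) - {node}:
            self.invalidations.append((reader, name))
        self.copy_set[name] = {node}
        self.store[name] = value

server = CentralServer()
server.write("A", "x", 1)
server.read("B", "x")
server.read("C", "x")
server.write("A", "x", 2)           # B and C must be told their copy is stale
print(sorted(server.invalidations))
```

Every read and write passes through the one server, which is why the slide flags it as a potential bottleneck.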
23. Partially distributed centralization of MR/SW
- Distribution of data static
- One server receives all requests
- Requests sent to processor with desired data
- Handles requests
- Notifies readers of invalid data
24. Figure 4.5 Partially Distributed Invalidation for Multiple Reader/Single Writer DSM. (Galli, p. 92)
Read X as C below
25. Dynamic distributed MR/SW
- Data may move to different processor
- Send broadcast message for all requests in order to reach current owner of data
- Increases number of messages in system
- More overhead
- More work for entire system
26. Figure 4.6 Dynamic Distributed Multiple Reader/Single Writer DSM. (Galli, p. 93)
27. A Static Distributed Method
- Data is distributed statically
- Data owner
- Handles all requests
- Notifies readers when their data copy is invalid
- All processors know where all data is located, since it is statically located
28. Figure 4.7 Dynamic Data Allocation for Multiple Reader/Single Writer DSM. (Galli, p. 96)
29. Multiple Readers/Multiple Writers
- Complex algorithms
- Use sequencers
- Time read
- Time written
- May be centralized or distributed
30. DSM Performance Issues
- Thrashing (in a DSM): when multiple locations desire to modify a common data set (Galli)
- False sharing: two or more processors fighting to write to the same page, but not the same data
- One solution: temporarily freeze a page so one processor can get some work done on it
- Another: proper block size (= page size?)
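To make the block-size point concrete, here is a small sketch that checks whether two distinct addresses land on the same page. The 4096-byte default and the helper names are assumptions for illustration only.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def page_of(addr, page_size=PAGE_SIZE):
    """Page number that contains the given byte address."""
    return addr // page_size

def falsely_shared(addr_a, addr_b, page_size=PAGE_SIZE):
    """True if two different data items live on the same page."""
    return addr_a != addr_b and page_of(addr_a, page_size) == page_of(addr_b, page_size)

# Two unrelated variables written by two different processors.
x_addr, y_addr = 0x1000, 0x1800
print(falsely_shared(x_addr, y_addr))                  # same 4096-byte page
print(falsely_shared(x_addr, y_addr, page_size=1024))  # separate 1024-byte blocks
```

A smaller block size separates the two variables here, at the cost of more blocks to track and transfer.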
31. More DSM Performance Issues
- Data location (compiler?)
- Data access patterns
- Synchronization
- Real-time systems issue?
- Implementation
- Hardware?
- Software?
32. Mach Operating System
- Uses virtual memory, distributed shared memory
- Mach kernel supports memory objects
- a contiguous repository of data, indexed by byte, upon which various operations, such as read and write, can be performed. Memory objects act as a secondary storage. Mach allows several primitives to map a virtual memory object into an address space of a task. In Mach, every task has a separate address space. (Singhal & Shivaratri, 1994)
33. Memory Migration
- Time-consuming
- Moving virtual memory from one processor to another
- When?
- How much?
34. MM: Stop and Copy
- Least efficient method
- Simple
- Halt process execution (freeze time) while moving entire process address space and data to new location
- Unacceptable to real-time and interactive systems
35. Figure 4.8 Stop-and-Copy Memory Migration. (Galli, p. 99)
36. Concurrent Copy
- Process continues execution while being copied to new location
- Some migrated pages may become dirty
- Send over more recent versions of pages
- At some point, stop execution and migrate remaining data
- Algorithms include dirty page ratio and/or time criteria to decide when to stop
- Wastes time and space
37. Figure 4.9 Concurrent-Copy Memory Migration. (Galli, p. 99)
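The concurrent-copy loop can be sketched as below. This is a simplified model, not Galli's algorithm: the workload is a hypothetical list saying which pages the process dirties during each copy round, `max_dirty_ratio` plays the role of the dirty page ratio criterion, and `max_rounds` stands in for the time criterion.

```python
def concurrent_copy(num_pages, dirtied_per_round, max_dirty_ratio=0.1, max_rounds=10):
    """Return (pages sent while the process runs, pages sent after it stops)."""
    to_send = set(range(num_pages))    # first round sends every page
    sent_running = 0
    for round_no in range(max_rounds):
        sent_running += len(to_send)
        # Pages written by the still-running process during this round.
        if round_no < len(dirtied_per_round):
            to_send = set(dirtied_per_round[round_no])
        else:
            to_send = set()
        if len(to_send) / num_pages <= max_dirty_ratio:
            break                      # few enough dirty pages: stop the process
    # The process is halted here and the remaining dirty pages are migrated.
    return sent_running, len(to_send)

workload = [set(range(30)), set(range(12)), set(range(5))]
print(concurrent_copy(100, workload))  # (142, 5)
```

In this run 147 page transfers are needed to move a 100-page process, which illustrates the "wastes time and space" bullet, but the process is stopped only while the last 5 pages move.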
38. Copy on Reference
- Process stops
- All process state information is moved
- Process resumes at new location
- Other process pages are moved only when accessed by process
- Alternative: virtual memory pages may be transferred to a file server, then moved as needed to new process location
39. Figure 4.10 Copy-on-Reference Memory Migration. (Galli, p. 100)
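Copy on reference amounts to lazy page fetching after the process has resumed. The sketch below assumes a hypothetical `MigratedProcess` class in which a dictionary stands in for the pages left at the old location (or on a file server).

```python
class MigratedProcess:
    """Process that has resumed at its new location with no pages yet."""
    def __init__(self, source_pages):
        self.source = source_pages   # page number -> contents, still remote
        self.local = {}              # pages already moved to the new location
        self.remote_fetches = 0

    def access(self, page_no):
        if page_no not in self.local:
            # First reference to this page: move it now.
            self.local[page_no] = self.source[page_no]
            self.remote_fetches += 1
        return self.local[page_no]

proc = MigratedProcess({n: f"page{n}" for n in range(8)})
proc.access(2)
proc.access(2)   # already local, no second transfer
proc.access(5)
print(proc.remote_fetches, sorted(proc.local))  # 2 [2, 5]
```

Pages that are never referenced are never transferred, so the up-front cost is only the process state.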
40. Table 4.1 Memory Management Choices Available for Advanced Systems. (Galli, p. 101)
41. Table 4.2 Performance Choices for Memory Management. (Galli, p. 101)
42. Concurrency Control (Chapter 5)
- Topics
- Mutual Exclusion and Critical Regions
- Semaphores
- Monitors
- Locks
- Software Lock Control
- Token-Passing Mutual Exclusion
- Deadlocks
43. Critical Region
- the portion of code or program accessing a shared resource
- Must prevent concurrent execution by more than one process at a time
- Mutex: mutual exclusion
44. Figure 5.1 Critical Regions Protecting a Shared Variable. (Galli, p. 106)
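A critical region protecting a shared variable can be sketched with Python threads and a mutex. This is an illustrative example, not the code of Figure 5.1; the `counter`, `worker`, and `run` names are assumptions.

```python
import threading

counter = 0                  # the shared variable
lock = threading.Lock()      # mutex guarding the critical region

def worker(n):
    global counter
    for _ in range(n):
        with lock:           # enter critical region
            counter += 1     # only one thread at a time executes this
        # leaving the with-block exits the critical region

def run(num_threads=4, n=10000):
    """Start the workers, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n,)) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run())  # 40000
```

Without the lock, the read-modify-write in `counter += 1` could interleave between threads and lose updates.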
45. Mutual Exclusion
- Three-point test (Galli)
- Solution must ensure that two processes do not enter critical regions at same time
- Solution must prevent interference from processes not attempting to enter their critical regions
- Solution must prevent starvation
46. Critical Section Solutions
- Recall Silberschatz & Galvin
- A solution to the critical section problem must show that
- mutual exclusion is preserved
- progress requirement is satisfied
- bounded-waiting requirement is met
47. Figure 5.2 Example Utilizing Semaphores. (Galli, p. 109)
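One common use of semaphores is the bounded-buffer producer/consumer problem, sketched below. This is a generic textbook pattern chosen for illustration and may differ from the example in Figure 5.2; the function name and the `capacity` parameter are assumptions.

```python
import threading
from collections import deque

def run_bounded_buffer(items, capacity=2):
    """Pass items (a list) from one producer to one consumer through a bounded buffer."""
    buffer = deque()
    empty = threading.Semaphore(capacity)  # counts free slots
    full = threading.Semaphore(0)          # counts filled slots
    mutex = threading.Lock()               # protects the buffer itself
    consumed = []

    def producer():
        for item in items:
            empty.acquire()                # wait (P) for a free slot
            with mutex:
                buffer.append(item)
            full.release()                 # signal (V) that a slot is filled

    def consumer():
        for _ in items:
            full.acquire()                 # wait (P) for a filled slot
            with mutex:
                item = buffer.popleft()
            empty.release()                # signal (V) that a slot is free
            consumed.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed

print(run_bounded_buffer(list(range(10))))
```

The two counting semaphores block the producer when the buffer is full and the consumer when it is empty, while the mutex keeps buffer updates atomic.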
48. Figure 5.3 Atomic Swap. (Galli, p. 114)
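A spin lock built on atomic swap can be sketched as follows. Python exposes no hardware swap instruction, so the hypothetical `AtomicCell` class below uses an internal lock purely to model the atomicity the hardware would provide; the `SpinLock` class and the `time.sleep(0)` yield are likewise assumptions for illustration.

```python
import threading
import time

class AtomicCell:
    """Stand-in for a memory word that supports a hardware atomic swap."""
    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()   # models hardware atomicity only

    def swap(self, new):
        """Store new and return the previous value as one indivisible step."""
        with self._guard:
            old, self._value = self._value, new
            return old

class SpinLock:
    def __init__(self):
        self._flag = AtomicCell(False)   # False means the lock is free

    def acquire(self):
        # Swap in True; if the old value was True, someone else holds the lock.
        while self._flag.swap(True):
            time.sleep(0)                # busy-wait, yielding to other threads

    def release(self):
        self._flag.swap(False)

total = 0
spin = SpinLock()

def add(n):
    global total
    for _ in range(n):
        spin.acquire()
        total += 1
        spin.release()

workers = [threading.Thread(target=add, args=(500,)) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(total)  # 1500
```

Only the thread whose swap returns False has acquired the lock; every other thread sees True and keeps spinning.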
49. Figure 5.4 Centralized Lock Manager. (Galli, p. 116)
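A centralized lock manager can be sketched as a single object that grants each lock to one client and queues the rest. The `LockManager` class and its method names are hypothetical, and first-come-first-served queuing is an assumed policy rather than something stated in Figure 5.4.

```python
from collections import deque

class LockManager:
    """One manager that grants, queues, and hands over locks on resources."""
    def __init__(self):
        self.holder = {}     # resource -> client currently holding the lock
        self.waiting = {}    # resource -> deque of clients waiting in order

    def request(self, client, resource):
        """Return True if the lock is granted, False if the client is queued."""
        if resource not in self.holder:
            self.holder[resource] = client
            return True
        self.waiting.setdefault(resource, deque()).append(client)
        return False

    def release(self, client, resource):
        """Release the lock and return the next holder, or None if nobody waits."""
        if self.holder.get(resource) != client:
            raise ValueError("client does not hold this lock")
        queue = self.waiting.get(resource)
        if queue:
            next_client = queue.popleft()
            self.holder[resource] = next_client
            return next_client
        del self.holder[resource]
        return None

manager = LockManager()
print(manager.request("A", "file1"))   # True: granted
print(manager.request("B", "file1"))   # False: queued behind A
print(manager.release("A", "file1"))   # B: lock passes to the waiting client
```

As with the centralized DSM server, the manager is simple to reason about but is a single point of contention and failure.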
50. Figure 5.5 Resource Allocation Graph. (Galli, p. 120)
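Deadlock detection on a resource allocation graph reduces to finding a cycle. The sketch below assumes a dictionary encoding in which an edge from a process to a resource is a request and an edge from a resource to a process is an assignment; the names are illustrative. A cycle implies deadlock only when each resource has a single instance.

```python
def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph given as node -> successors."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:        # back edge to the current path: cycle
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

# P1 holds R1 and requests R2; P2 holds R2 and requests R1.
deadlocked = {"P1": ["R2"], "R2": ["P2"], "P2": ["R1"], "R1": ["P1"]}
# P2 holds R2 but requests nothing, so P1 will eventually proceed.
safe = {"P1": ["R2"], "R2": ["P2"], "R1": ["P1"]}
print(has_cycle(deadlocked), has_cycle(safe))  # True False
```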
51. Table 5.1 Summary of Support for Concurrency by Hardware, System, and Languages. (Galli, p. 124)