Title: TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
1. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
- P. Keleher, S. Dwarkadas, A. Cox, and W. Zwaenepoel
- The 1994 Winter USENIX Conference
- Presented by Hyuck Han
2. TOC
- Preliminary
- Motivation
- Solution
- Lazy Release Consistency
- Multiple Writer Protocol
- Evaluation
- Conclusion
3. Programming Models (1)
4. Programming Models (2)
- Message-passing model
  - MPI (Message Passing Interface): de facto standard
- Shared-address-space model
  - Shared-memory multiprocessors (SMPs)
    - Processes, threads, etc.
  - Distributed-memory machines
    - Hardware: CC-NUMA (Cache-Coherent Non-Uniform Memory Access)
    - Software: DSM (Distributed Shared Memory)
5. DSM
- Gives the illusion that memory is shared among physically distributed nodes
- Provides a uniform, single address-space view
6. Page-based DSM
- Uses virtual memory protection mechanisms and signals (see the sketch below)
  - mmap(), mprotect()
  - SIGSEGV/SIGIO signals
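A minimal sketch of this mechanism, assuming a single anonymous shared region and a placeholder fetch_page() where a real DSM would request the up-to-date copy from a remote node; the actual TreadMarks fault handler is considerably more involved:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096
#define NPAGES    16

/* Placeholder: in a real DSM this is where the page would be requested
 * from the node holding the current copy. */
static void fetch_page(void *page_addr) { (void)page_addr; }

/* SIGSEGV handler: an access to a protected page drives the DSM protocol. */
static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(PAGE_SIZE - 1));
    fetch_page(page);                                   /* bring page up to date */
    mprotect(page, PAGE_SIZE, PROT_READ | PROT_WRITE);  /* then allow the access */
}

int main(void)
{
    /* Map the shared region with no permissions so the first access faults. */
    char *shared = mmap(NULL, NPAGES * PAGE_SIZE, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa = {0};
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    shared[0] = 1;   /* faults, handler "fetches" and unprotects, write succeeds */
    printf("shared[0] = %d\n", shared[0]);
    return 0;
}
```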
7. Page-based DSM
8. TreadMarks Motivation (1)
- Sequential Consistency
  - Every write must become visible immediately
- Problems
  - Number of messages
  - Latency
9. TreadMarks Motivation (2)
- False sharing
  - Pieces of the same page are updated by different processors
  - Leads to a ping-pong effect, greatly increasing network traffic
10. TreadMarks Solutions
- Lazy release consistency
- Multiple-writer protocol
11. Relaxed Consistency Models
- Delay making writes visible
- Goals
  - Reduce the number of messages
  - Hide latency
- Delay until when?
12. Release Consistency (1)
- Eager Release Consistency (ERC), as in Munin
  - Write-access information is delivered to all shared copies at the release point.
  - The release blocks until acknowledgments have been received from all other nodes.
13. Release Consistency (2)
- Lazy Release Consistency (LRC)
  - Write-access information is delivered only to the next acquirer, at the next acquire point (see the sketch below).
  - Fewer messages
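A minimal sketch of where the messages go in each scheme, using print statements as stand-ins for the network traffic; the function names and structure are illustrative, not the protocols' actual code:

```c
#include <stdio.h>

#define NPROCS 4

/* Messaging placeholders; a real implementation would send UDP/ATM messages. */
static void send_write_notices(int dest) { printf("  write notices -> proc %d\n", dest); }
static void wait_for_ack(int src)        { (void)src; }
static void record_interval_locally(void){ /* bump a local interval counter */ }

/* Eager release consistency (Munin style): push write-access information to
 * every other copy at the release, and block until all acknowledgments arrive. */
static void erc_release(int self)
{
    for (int p = 0; p < NPROCS; p++) {
        if (p == self) continue;
        send_write_notices(p);
        wait_for_ack(p);
    }
}

/* Lazy release consistency (TreadMarks): nothing remote happens at release;
 * the releaser only records the interval locally. */
static void lrc_release(int self)
{
    (void)self;
    record_interval_locally();            /* no messages sent here */
}

/* The next acquirer pulls the write notices along with the lock grant. */
static void lrc_acquire(int self, int last_releaser)
{
    (void)self;
    printf("  lock request -> proc %d, grant carries write notices\n", last_releaser);
}

int main(void)
{
    printf("ERC release on proc 0:\n");  erc_release(0);    /* NPROCS-1 round trips */
    printf("LRC release on proc 0:\n");  lrc_release(0);    /* zero messages        */
    printf("LRC acquire on proc 1:\n");  lrc_acquire(1, 0); /* one round trip       */
    return 0;
}
```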
14. Release Consistency (3)
- ERC vs. LRC message traffic
15. Multiple Writer Protocol (1)
- Basic idea (see the twin/diff sketch below)
  - Buffer writes until synchronization events
  - Create diffs
  - Pull in modifications at synchronization events
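A minimal twin/diff sketch of this idea, assuming word-granularity comparison and a flat (offset, value) encoding rather than the run-length format TreadMarks actually uses:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* One (offset, value) pair per modified word. */
struct diff_entry { unsigned offset; unsigned value; };

/* First write fault on a write-protected page: save a copy (the "twin"). */
static unsigned char *make_twin(const unsigned char *page)
{
    unsigned char *twin = malloc(PAGE_SIZE);
    memcpy(twin, page, PAGE_SIZE);
    return twin;
}

/* Later, and only when another node asks (lazy diff creation), compare the
 * page with its twin word by word to build the diff. */
static size_t make_diff(const unsigned char *page, const unsigned char *twin,
                        struct diff_entry *out)
{
    size_t n = 0;
    for (unsigned off = 0; off < PAGE_SIZE; off += sizeof(unsigned)) {
        unsigned cur, old;
        memcpy(&cur, page + off, sizeof cur);
        memcpy(&old, twin + off, sizeof old);
        if (cur != old)
            out[n++] = (struct diff_entry){ off, cur };
    }
    return n;
}

/* The receiver merges the diff into its own copy, so writes by different
 * processors to different parts of the same page do not conflict. */
static void apply_diff(unsigned char *page, const struct diff_entry *d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        memcpy(page + d[i].offset, &d[i].value, sizeof d[i].value);
}

int main(void)
{
    unsigned char *a = calloc(1, PAGE_SIZE);   /* writer's copy   */
    unsigned char *b = calloc(1, PAGE_SIZE);   /* receiver's copy */
    struct diff_entry diff[PAGE_SIZE / sizeof(unsigned)];

    unsigned char *twin = make_twin(a);        /* on first write fault   */
    a[0] = 42; a[100] = 7;                     /* buffered local writes  */

    size_t n = make_diff(a, twin, diff);       /* created lazily         */
    apply_diff(b, diff, n);                    /* pulled in at sync time */

    printf("diff entries: %zu, b[0]=%d, b[100]=%d\n", n, b[0], b[100]);
    return 0;
}
```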
16. Multiple Writer Protocol (2)
17. TreadMarks
1. (A) Acquire, two writes, Release
2. (B) Acquire: send CVT (Current Vector Timestamp) with the lock request
3. (A) Reply with write notices (meaning B's copy of the page is now invalid) and the lock grant
4. (B) Access the page; request diffs (Lazy Diff Creation)
5. (A) Send the diffs
18. TreadMarks
- Step 1: Generate write notices
- Steps 2-3: What does A send to B on the red line? (see the sketch below)
  - A list of intervals known to A but not to B
  - For each interval in the list:
    - The origin node i and the interval number n
    - Node i's vector clock CVTi during that interval n
    - A list of pages dirtied by i during that interval n
    - These dirty-page notifications are called write notices
- Steps 4-5: Lazy Diff Creation
  - Diffs are created only when the modifications are requested
  - Decreases the number of diffs created
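A minimal sketch of the interval bookkeeping behind steps 2-3, assuming a fixed interval table per node and ignoring the transitive dependencies that the real vector-timestamp protocol also covers:

```c
#include <stdio.h>

#define NPROCS   4
#define MAXINTS  64    /* intervals tracked per node in this sketch */
#define MAXPAGES 8

/* One interval: the node that created it, its sequence number on that node,
 * and the pages dirtied during it (the contents of its write notices). */
struct interval {
    int origin;
    int number;
    int dirty_pages[MAXPAGES];
    int ndirty;
};

/* Intervals that the releaser (A) knows about, indexed by origin node. */
static struct interval known[NPROCS][MAXINTS];
static int nknown[NPROCS];

/* Steps 2-3: B's lock request carries B's current vector timestamp (CVT),
 * i.e. the highest interval number B has seen from each node.  A replies
 * with write notices for every interval B is missing. */
static void reply_with_write_notices(const int acquirer_cvt[NPROCS])
{
    for (int node = 0; node < NPROCS; node++) {
        for (int k = 0; k < nknown[node]; k++) {
            const struct interval *iv = &known[node][k];
            if (iv->number <= acquirer_cvt[node])
                continue;                     /* the acquirer already has it */
            for (int p = 0; p < iv->ndirty; p++)
                printf("write notice: page %d dirtied in interval (%d, %d)\n",
                       iv->dirty_pages[p], iv->origin, iv->number);
        }
    }
}

int main(void)
{
    /* A (proc 0) has one interval, number 1, in which it dirtied page 3. */
    known[0][0] = (struct interval){ .origin = 0, .number = 1,
                                     .dirty_pages = {3}, .ndirty = 1 };
    nknown[0] = 1;

    int cvt_of_b[NPROCS] = {0, 0, 0, 0};      /* B has seen nothing from A yet */
    reply_with_write_notices(cvt_of_b);       /* B learns page 3 is invalid    */
    return 0;
}
```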
19. TreadMarks API
- Global variables
  - extern unsigned Tmk_nprocs
  - extern unsigned Tmk_proc_id
- Functions (see the usage sketch below)
  - void Tmk_startup(int argc, char **argv)
  - void Tmk_exit(int status)
  - void Tmk_barrier(unsigned id)
  - void Tmk_lock_acquire(unsigned id)
  - void Tmk_lock_release(unsigned id)
  - char *Tmk_malloc(unsigned size)
  - void Tmk_free(char *ptr)
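A minimal usage sketch of this API, in the style of the examples in the TreadMarks paper: process 0 allocates a shared counter and every process adds its id to it under a lock. Tmk_distribute, used here to publish the shared pointer to all processes, is part of the full API described in the paper and does not appear on the slide.

```c
#include <stdio.h>
#include "Tmk.h"   /* TreadMarks header, as in the paper's examples */

static int *sum;   /* pointer into shared memory, published by proc 0 */

int main(int argc, char **argv)
{
    Tmk_startup(argc, argv);                  /* start/attach all processes */

    if (Tmk_proc_id == 0) {
        sum = (int *)Tmk_malloc(sizeof(int)); /* allocate in shared memory */
        *sum = 0;
        /* Tmk_distribute (full API, not listed on the slide) makes the
         * pointer value visible to every process. */
        Tmk_distribute((char *)&sum, sizeof(sum));
    }
    Tmk_barrier(0);                           /* all processes see *sum == 0 */

    Tmk_lock_acquire(0);                      /* acquire pulls in prior updates (LRC) */
    *sum += (int)Tmk_proc_id;
    Tmk_lock_release(0);

    Tmk_barrier(1);
    if (Tmk_proc_id == 0) {
        printf("sum of process ids = %d\n", *sum);
        Tmk_free((char *)sum);
    }

    Tmk_exit(0);
    return 0;
}
```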
20. Performance
- Experimental environment
  - 8 DECstation-5000/240 workstations
  - Connected by a 100-Mbps ATM LAN and a 10-Mbps Ethernet
- Applications
  - Water: molecular dynamics simulation
  - Jacobi: Successive Over-Relaxation
  - TSP: branch-and-bound algorithm to solve the traveling salesman problem
  - Quicksort: uses bubblesort to sort subarrays of fewer than 1K elements
  - ILINK: genetic linkage analysis
21. Results
22. Execution Time Breakdown (1/2)
23. Execution Time Breakdown (2/2)
24. Lazy vs. Eager Release Consistency (1/2)
25. Lazy vs. Eager Release Consistency (2/2)
26. Conclusion
- Efficient user-level implementation
  - Lazy release consistency, multiple-writer protocols, and lazy diff creation reduce the cost of communication
- Good speedups for Jacobi, TSP, Quicksort, and ILINK
- Moderate speedups for Water
- DSM is a viable technique for parallel computation on clusters of workstations connected by suitable networking technology