Title: TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
1. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
- P. Keleher, S. Dwarkadas, A. Cox, and W. Zwaenepoel
- The 1994 Winter USENIX Conference
- Presented by Hyuck Han
2. TOC
- Preliminary
- Motivation
- Solution
- Lazy Release Consistency
- Multiple Writer Protocol
- Evaluation
- Conclusion
3. Programming Models (1)
4. Programming Models (2)
- Message-passing model
  - MPI (Message Passing Interface): de facto standard
- Shared-address-space model
  - Shared-memory multiprocessors (SMPs)
    - Processes, threads, etc.
  - Distributed-memory machines
    - Hardware: CC-NUMA (Cache-Coherent Non-Uniform Memory Access)
    - Software: DSM (Distributed Shared Memory)
5. DSM
- Gives the illusion that memory is shared among physically distributed nodes
- Provides a uniform, single address-space view
6. Page-based DSM
- Uses virtual memory protection mechanisms and signals (see the sketch below)
  - mmap(), mprotect()
  - SIGSEGV/SIGIO signals
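A minimal sketch of this mechanism, assuming a single anonymous shared region and a placeholder fetch_page() where a real DSM would request the up-to-date copy from a remote node; the actual TreadMarks fault handler is considerably more involved:

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096
#define NPAGES    16

/* Placeholder: in a real DSM this is where the page would be requested
 * from the node holding the current copy. */
static void fetch_page(void *page_addr) { (void)page_addr; }

/* SIGSEGV handler: an access to a protected page drives the DSM protocol. */
static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(PAGE_SIZE - 1));
    fetch_page(page);                                   /* bring page up to date */
    mprotect(page, PAGE_SIZE, PROT_READ | PROT_WRITE);  /* then allow the access */
}

int main(void)
{
    /* Map the shared region with no permissions so the first access faults. */
    char *shared = mmap(NULL, NPAGES * PAGE_SIZE, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa = {0};
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    shared[0] = 1;   /* faults, handler "fetches" and unprotects, write succeeds */
    printf("shared[0] = %d\n", shared[0]);
    return 0;
}
```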
7. Page-based DSM
8. TreadMarks Motivation (1)
- Sequential Consistency
  - Every write must become visible immediately
- Problems
  - Number of messages
  - Latency
9. TreadMarks Motivation (2)
- False sharing
  - Pieces of the same page are updated by different processors
  - Leads to a ping-pong effect, greatly increasing network traffic
10. TreadMarks Solutions
- Lazy release consistency
- Multiple-writer protocol
11. Relaxed Consistency Models
- Delay making writes visible
- Goals
  - Reduce the number of messages
  - Hide latency
- Delay until when?
12. Release Consistency (1)
- Eager Release Consistency (ERC), as in Munin
  - Write-access information is delivered to all shared copies at the release point.
  - The release blocks until acknowledgments have been received from all other nodes.
13. Release Consistency (2)
- Lazy Release Consistency (LRC)
  - Write-access information is delivered only to the next acquirer, at the next acquire point (see the sketch below).
  - Fewer messages
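A minimal sketch of where the messages go in each scheme, using print statements as stand-ins for the network traffic; the function names and structure are illustrative, not the protocols' actual code:

```c
#include <stdio.h>

#define NPROCS 4

/* Messaging placeholders; a real implementation would send UDP/ATM messages. */
static void send_write_notices(int dest) { printf("  write notices -> proc %d\n", dest); }
static void wait_for_ack(int src)        { (void)src; }
static void record_interval_locally(void){ /* bump a local interval counter */ }

/* Eager release consistency (Munin style): push write-access information to
 * every other copy at the release, and block until all acknowledgments arrive. */
static void erc_release(int self)
{
    for (int p = 0; p < NPROCS; p++) {
        if (p == self) continue;
        send_write_notices(p);
        wait_for_ack(p);
    }
}

/* Lazy release consistency (TreadMarks): nothing remote happens at release;
 * the releaser only records the interval locally. */
static void lrc_release(int self)
{
    (void)self;
    record_interval_locally();            /* no messages sent here */
}

/* The next acquirer pulls the write notices along with the lock grant. */
static void lrc_acquire(int self, int last_releaser)
{
    (void)self;
    printf("  lock request -> proc %d, grant carries write notices\n", last_releaser);
}

int main(void)
{
    printf("ERC release on proc 0:\n");  erc_release(0);    /* NPROCS-1 round trips */
    printf("LRC release on proc 0:\n");  lrc_release(0);    /* zero messages        */
    printf("LRC acquire on proc 1:\n");  lrc_acquire(1, 0); /* one round trip       */
    return 0;
}
```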
14. Release Consistency (3)
- ERC vs. LRC message traffic
15. Multiple Writer Protocol (1)
- Basic idea (see the twin/diff sketch below)
  - Buffer writes until synchronization events
  - Create diffs
  - Pull in modifications at synchronization events
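A minimal twin/diff sketch of this idea, assuming word-granularity comparison and a flat (offset, value) encoding rather than the run-length format TreadMarks actually uses:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* One (offset, value) pair per modified word. */
struct diff_entry { unsigned offset; unsigned value; };

/* First write fault on a write-protected page: save a copy (the "twin"). */
static unsigned char *make_twin(const unsigned char *page)
{
    unsigned char *twin = malloc(PAGE_SIZE);
    memcpy(twin, page, PAGE_SIZE);
    return twin;
}

/* Later, and only when another node asks (lazy diff creation), compare the
 * page with its twin word by word to build the diff. */
static size_t make_diff(const unsigned char *page, const unsigned char *twin,
                        struct diff_entry *out)
{
    size_t n = 0;
    for (unsigned off = 0; off < PAGE_SIZE; off += sizeof(unsigned)) {
        unsigned cur, old;
        memcpy(&cur, page + off, sizeof cur);
        memcpy(&old, twin + off, sizeof old);
        if (cur != old)
            out[n++] = (struct diff_entry){ off, cur };
    }
    return n;
}

/* The receiver merges the diff into its own copy, so writes by different
 * processors to different parts of the same page do not conflict. */
static void apply_diff(unsigned char *page, const struct diff_entry *d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        memcpy(page + d[i].offset, &d[i].value, sizeof d[i].value);
}

int main(void)
{
    unsigned char *a = calloc(1, PAGE_SIZE);   /* writer's copy   */
    unsigned char *b = calloc(1, PAGE_SIZE);   /* receiver's copy */
    struct diff_entry diff[PAGE_SIZE / sizeof(unsigned)];

    unsigned char *twin = make_twin(a);        /* on first write fault   */
    a[0] = 42; a[100] = 7;                     /* buffered local writes  */

    size_t n = make_diff(a, twin, diff);       /* created lazily         */
    apply_diff(b, diff, n);                    /* pulled in at sync time */

    printf("diff entries: %zu, b[0]=%d, b[100]=%d\n", n, b[0], b[100]);
    return 0;
}
```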
16. Multiple Writer Protocol (2)
17. TreadMarks
1. (A) Acquire, two writes, Release
2. (B) Acquire: send CVT (Current Vector Timestamp) with the lock request
3. (A) Reply with write notices (meaning B's copy of the page is now invalid) and the lock grant
4. (B) Access the page; request diffs (Lazy Diff Creation)
5. (A) Send the diffs
18. TreadMarks
- Step 1: Generate write notices
- Steps 2-3: What does A send to B on the red line? (see the sketch below)
  - A list of intervals known to A but not to B
  - For each interval in the list:
    - The origin node i and the interval number n
    - Node i's vector clock CVTi during that interval n
    - A list of pages dirtied by i during that interval n
    - These dirty-page notifications are called write notices
- Steps 4-5: Lazy Diff Creation
  - Diffs are created only when the modifications are requested
  - Decreases the number of diffs created
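A minimal sketch of the interval bookkeeping behind steps 2-3, assuming a fixed interval table per node and ignoring the transitive dependencies that the real vector-timestamp protocol also covers:

```c
#include <stdio.h>

#define NPROCS   4
#define MAXINTS  64    /* intervals tracked per node in this sketch */
#define MAXPAGES 8

/* One interval: the node that created it, its sequence number on that node,
 * and the pages dirtied during it (the contents of its write notices). */
struct interval {
    int origin;
    int number;
    int dirty_pages[MAXPAGES];
    int ndirty;
};

/* Intervals that the releaser (A) knows about, indexed by origin node. */
static struct interval known[NPROCS][MAXINTS];
static int nknown[NPROCS];

/* Steps 2-3: B's lock request carries B's current vector timestamp (CVT),
 * i.e. the highest interval number B has seen from each node.  A replies
 * with write notices for every interval B is missing. */
static void reply_with_write_notices(const int acquirer_cvt[NPROCS])
{
    for (int node = 0; node < NPROCS; node++) {
        for (int k = 0; k < nknown[node]; k++) {
            const struct interval *iv = &known[node][k];
            if (iv->number <= acquirer_cvt[node])
                continue;                     /* the acquirer already has it */
            for (int p = 0; p < iv->ndirty; p++)
                printf("write notice: page %d dirtied in interval (%d, %d)\n",
                       iv->dirty_pages[p], iv->origin, iv->number);
        }
    }
}

int main(void)
{
    /* A (proc 0) has one interval, number 1, in which it dirtied page 3. */
    known[0][0] = (struct interval){ .origin = 0, .number = 1,
                                     .dirty_pages = {3}, .ndirty = 1 };
    nknown[0] = 1;

    int cvt_of_b[NPROCS] = {0, 0, 0, 0};      /* B has seen nothing from A yet */
    reply_with_write_notices(cvt_of_b);       /* B learns page 3 is invalid    */
    return 0;
}
```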
19. TreadMarks API
- Global variables
  - extern unsigned Tmk_nprocs
  - extern unsigned Tmk_proc_id
- Functions (see the usage sketch below)
  - void Tmk_startup(int argc, char **argv)
  - void Tmk_exit(int status)
  - void Tmk_barrier(unsigned id)
  - void Tmk_lock_acquire(unsigned id)
  - void Tmk_lock_release(unsigned id)
  - char *Tmk_malloc(unsigned size)
  - void Tmk_free(char *ptr)
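A minimal usage sketch of this API, in the style of the examples in the TreadMarks paper: process 0 allocates a shared counter and every process adds its id to it under a lock. Tmk_distribute, used here to publish the shared pointer to all processes, is part of the full API described in the paper and does not appear on the slide.

```c
#include <stdio.h>
#include "Tmk.h"   /* TreadMarks header, as in the paper's examples */

static int *sum;   /* pointer into shared memory, published by proc 0 */

int main(int argc, char **argv)
{
    Tmk_startup(argc, argv);                  /* start/attach all processes */

    if (Tmk_proc_id == 0) {
        sum = (int *)Tmk_malloc(sizeof(int)); /* allocate in shared memory */
        *sum = 0;
        /* Tmk_distribute (full API, not listed on the slide) makes the
         * pointer value visible to every process. */
        Tmk_distribute((char *)&sum, sizeof(sum));
    }
    Tmk_barrier(0);                           /* all processes see *sum == 0 */

    Tmk_lock_acquire(0);                      /* acquire pulls in prior updates (LRC) */
    *sum += (int)Tmk_proc_id;
    Tmk_lock_release(0);

    Tmk_barrier(1);
    if (Tmk_proc_id == 0) {
        printf("sum of process ids = %d\n", *sum);
        Tmk_free((char *)sum);
    }

    Tmk_exit(0);
    return 0;
}
```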
20. Performance
- Experimental environment
  - 8 DECstation-5000/240 workstations
  - Connected by a 100-Mbps ATM LAN and a 10-Mbps Ethernet
- Applications
  - Water: molecular dynamics simulation
  - Jacobi: Successive Over-Relaxation
  - TSP: branch-and-bound algorithm to solve the traveling salesman problem
  - Quicksort: uses bubblesort to sort subarrays of fewer than 1K elements
  - ILINK: genetic linkage analysis
21. Results
22. Execution Time Breakdown (1/2)
23. Execution Time Breakdown (2/2)
24. Lazy vs. Eager Release Consistency (1/2)
25. Lazy vs. Eager Release Consistency (2/2)
26. Conclusion
- Efficient user-level implementation
  - Lazy release consistency, multiple-writer protocols, and lazy diff creation reduce the cost of communication
- Good speedups for Jacobi, TSP, Quicksort, and ILINK
- Moderate speedups for Water
- DSM is a viable technique for parallel computation on clusters of workstations connected by suitable networking technology