Transcript and Presenter's Notes

Title: TreadMarks


1
TreadMarks
  • Distributed Shared Memory on Standard
    Workstations and Operating Systems

Pete Keleher, Alan Cox, Sandhya Dwarkadas, Willy
Zwaenepoel
2
Agenda
  • DSM Overview
  • TreadMarks Overview
  • Vector Clocks
  • Multi-writer Protocol (diffs)
  • TreadMarks Algorithm
  • Implementation
  • Limitations

3
DSM Overview
[Diagram: four processors (Proc), each with its own physical memory (Mem), presented as a single global address space]
  • Global address space: a virtualization of disparate
    physical memories
  • Program using normal thread/locking techniques
    (no MPI); see the sketch below
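A minimal sketch of what this programming model looks like. The dsm_* names are hypothetical placeholders (stubbed here so the example compiles and runs as a single process), not the actual TreadMarks API; the point is that shared data is read and written with plain loads and stores and guarded with ordinary locks and barriers, with no explicit message passing.

```c
/* Hypothetical DSM-style program. The dsm_* calls are placeholders,
 * stubbed out so this sketch runs as a single process; they are NOT
 * the TreadMarks API. */
#include <stdio.h>
#include <stdlib.h>

static int dsm_proc_id = 0, dsm_nprocs = 1;     /* who am I / how many */

static void *dsm_malloc(size_t n)     { return calloc(1, n); } /* shared alloc */
static void  dsm_lock_acquire(int id) { (void)id; }            /* no-op stubs  */
static void  dsm_lock_release(int id) { (void)id; }
static void  dsm_barrier(int id)      { (void)id; }

int main(void)
{
    /* Shared data lives in the global address space; every process
     * would see the same contents at this address. */
    long *counter = dsm_malloc(sizeof *counter);

    dsm_barrier(0);

    dsm_lock_acquire(1);   /* ordinary lock-based synchronization */
    *counter += 1;         /* plain loads/stores, no send/receive */
    dsm_lock_release(1);

    dsm_barrier(1);
    if (dsm_proc_id == 0)
        printf("counter = %ld (nprocs = %d)\n", *counter, dsm_nprocs);
    return 0;
}
```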

4
DSM Overview
[Diagram: four processors (Proc) with local memories (Mem), as on the previous slide]
  • Communication overhead incurred to synchronize
    memory
  • Maximize parallel computation and limit
    communication to improve performance

5
TreadMarks Overview
  • Minimize communication to improve DSM
    performance
  • Lazy Release Consistency (Vector Clocks)
  • Multiple Writers (Lazy Diff Creation)
  • Delay communication as long as possible (possibly
    avoid it entirely)

6
TreadMarks Overview: Release Consistency
  • Release Consistency
  • Shared memory updates must be visible when the
    release is visible
  • No need to send updates immediately upon write

[Diagram: timelines for P1 and P2, each performing a write w(x)]
7
TreadMarks Overview: Lazy Release Consistency
  • Lazy Release Consistency
  • Shared memory updates are not made visible until
    the time of the acquire
  • No update is propagated if it is never acquired
    (see the sketch below)

[Diagram: timelines for P1 and P2, each performing a write w(x)]
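A toy model of the difference, assuming a two-process system in which communication is simply counted: under eager release consistency the update travels at release time whether or not anyone needs it, while under lazy release consistency nothing is sent until (and unless) the other process performs an acquire.

```c
/* Toy model: two "processes" as structs, with message traffic counted.
 * Eager RC pushes the update at release; lazy RC defers it until the
 * other process acquires (and sends nothing if it never does). */
#include <stdio.h>
#include <stdbool.h>

struct proc { int x; bool pending_update; };
static int messages = 0;

/* Eager RC: the new value travels when the lock is released. */
static void eager_release(struct proc *self, struct proc *peer)
{
    peer->x = self->x;
    messages++;
}

/* Lazy RC: just remember that an update exists; send nothing yet. */
static void lazy_release(struct proc *self) { self->pending_update = true; }

/* Lazy RC: the update travels only when the peer actually acquires. */
static void lazy_acquire(struct proc *self, struct proc *releaser)
{
    if (releaser->pending_update) { self->x = releaser->x; messages++; }
}

int main(void)
{
    struct proc p1 = {0, false}, p2 = {0, false};

    p1.x = 42;                 /* w(x) on P1 inside a critical section       */
    eager_release(&p1, &p2);   /* eager: 1 message, even if P2 never reads x */

    p1.x = 43;                 /* w(x) again                                 */
    lazy_release(&p1);         /* lazy: still no extra message               */
    lazy_acquire(&p2, &p1);    /* only this acquire pulls the update over    */

    printf("p2.x = %d, messages = %d\n", p2.x, messages);
    return 0;
}
```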
8
Vector Clocks
[Diagram: timelines for processes P1, P2, and P3]
  • Global clock mechanism for identifying causal
    ordering of events in distributed systems
  • Mattern (1989) and Fidge (1991)

9
Vector Clocks
[Diagram: P1, P2, and P3 each start with vector clock (0 0 0)]
  • Each process maintains a vector of counters
  • One for each process in the system

10
Vector Clocks
[Diagram: unchanged; P1, P2, and P3 still at (0 0 0)]
  • Each process maintains a vector of counters
  • One for each process in the system

11
Vector Clocks
[Diagram: P1 has a local event and advances to (1 0 0); P2 and P3 remain at (0 0 0)]
  • Increments own counter upon Local Event

12
Vector Clocks
[Diagram: P1 at (1 0 0); P3 has a local event and advances to (0 0 1); P2 remains at (0 0 0)]
  • Increments own counter upon Local Event

13
Vector Clocks
[Diagram: P3 advances to (0 0 2) and sends a message to P1; on receipt P1 advances from (1 0 0) to (2 0 2); P2 remains at (0 0 0)]
  • Increments own counter and updates all other
    counters upon Receiving Message

14
Vector Clocks
[Diagram: P1 advances to (3 0 2) and sends a message to P2; on receipt P2 advances to (3 1 2); P3 remains at (0 0 2)]
  • Increments own counter and updates all other
    counters upon Receiving Message (see the sketch
    below)
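The rules on these slides, as a small self-contained program. The receive rule is modeled as an element-wise max with the sender's vector followed by incrementing the receiver's own entry; the sequence of events reproduces the values shown above, ending at (3 0 2) for P1 and (3 1 2) for P2.

```c
/* Vector clocks as on the slides: each process keeps one counter per
 * process, increments its own entry on a local event, and on receiving
 * a message takes the element-wise max with the sender's vector and
 * then increments its own entry. */
#include <stdio.h>

#define NPROCS 3

typedef struct { int c[NPROCS]; } vclock;

static void local_event(vclock *v, int self) { v->c[self]++; }

static void on_receive(vclock *v, int self, const vclock *sender)
{
    for (int i = 0; i < NPROCS; i++)        /* merge: element-wise max */
        if (sender->c[i] > v->c[i]) v->c[i] = sender->c[i];
    v->c[self]++;                           /* then count the receive  */
}

static void show(const char *who, const vclock *v)
{
    printf("%s = (%d %d %d)\n", who, v->c[0], v->c[1], v->c[2]);
}

int main(void)
{
    vclock p1 = {{0,0,0}}, p2 = {{0,0,0}}, p3 = {{0,0,0}};

    local_event(&p1, 0);       /* P1: (1 0 0)                   */
    local_event(&p3, 2);       /* P3: (0 0 1)                   */
    local_event(&p3, 2);       /* P3: (0 0 2), then sends to P1 */
    on_receive(&p1, 0, &p3);   /* P1: (2 0 2)                   */
    local_event(&p1, 0);       /* P1: (3 0 2), then sends to P2 */
    on_receive(&p2, 1, &p1);   /* P2: (3 1 2)                   */

    show("P1", &p1); show("P2", &p2); show("P3", &p3);
    return 0;
}
```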

15
Diff Creation
  • Retains copy of page upon first writing

16
Diff Creation
  • Retains copy of page upon first writing

17
Diff Creation
  • Create diff by comparing modified page against
    original (RLC)

18
Diff Creation
  • Send diff to other processes

19
Lazy Diff Creation
  • Diffs are created only when a page is invalidated
  • Or when the modifications are requested explicitly
    (e.g., an access miss on an invalidated page); see
    the sketch below

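A sketch of twin/diff creation, under the assumptions of a 4 KB page of 32-bit words and a simple (offset, value) diff encoding; the real encoding and page size may differ. The twin is copied at the first write, and the diff records only the words that changed, so only those words need to travel.

```c
/* Twin/diff sketch: copy the page ("twin") at the first write, then
 * record only the words that differ. Page size and the (offset, value)
 * encoding are simplifying assumptions. */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define PAGE_WORDS 1024                      /* 4 KB page of 32-bit words */

typedef struct { int offset; uint32_t value; } diff_entry;

/* Compare the modified page against its twin; emit one entry per changed
 * word. Returns the number of entries produced. */
static int make_diff(const uint32_t *twin, const uint32_t *page, diff_entry *out)
{
    int n = 0;
    for (int i = 0; i < PAGE_WORDS; i++)
        if (page[i] != twin[i]) { out[n].offset = i; out[n].value = page[i]; n++; }
    return n;
}

/* Apply a received diff to another copy of the page. */
static void apply_diff(uint32_t *page, const diff_entry *d, int n)
{
    for (int i = 0; i < n; i++) page[d[i].offset] = d[i].value;
}

int main(void)
{
    uint32_t page[PAGE_WORDS] = {0}, twin[PAGE_WORDS], remote[PAGE_WORDS] = {0};
    diff_entry diff[PAGE_WORDS];

    memcpy(twin, page, sizeof page);     /* first write detected: keep a twin */
    page[3]   = 7;                       /* this process's writes             */
    page[100] = 9;

    int n = make_diff(twin, page, diff); /* diff = {(3, 7), (100, 9)}         */
    apply_diff(remote, diff, n);         /* only 2 words travel, not 4 KB     */

    printf("entries = %d, remote[3] = %u, remote[100] = %u\n",
           n, (unsigned)remote[3], (unsigned)remote[100]);
    return 0;
}
```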
20
TreadMarks Algorithm
[Diagram: P1 at (1 0 0) and P3 at (0 0 1), each having started from (0 0 0)]
  • P1 cannot proceed past the acquire until
  • All modifications have been received from
    processes whose vector timestamps are smaller
    than P1's

21
TreadMarks Algorithm
[Diagram: P1, at (1 0 0), sends its vector timestamp (1 0 0) toward the releaser; P3 is at (0 0 1)]
  • On acquire
  • P1 Sends Vector Timestamp to releaser

22
TreadMarks Algorithm
[Diagram: P1 sends timestamp (1 0 0); the releaser's reply carries timestamp (1 0 1) plus an invalidation; P3 is at (0 0 1)]
  • On acquire
  • P1 Sends Vector Timestamp to releaser
  • P2 Attaches invalidations for all updated counters

23
TreadMarks Algorithm
[Diagram: the reply, timestamp (1 0 1) with invalidations, reaches P1, whose clock becomes (1 0 1); P3 is at (0 0 1)]
  • On acquire
  • P1 Sends Vector Timestamp to releaser
  • P2 Attaches invalidations for all updated
    counters
  • P2 Sends updated Vector Timestamp with
    invalidations (see the sketch below)
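A condensed sketch of the acquire step from the last few slides, with simplified data structures that are assumptions rather than the actual TreadMarks ones: the releaser finds the intervals the acquirer's vector timestamp does not cover, sends an invalidation (write notice) for each page written in them, and the acquirer merges the timestamps. The values mirror the slides: P1 starts at (1 0 0), the releaser is at (1 0 1), and P1 ends at (1 0 1) with one page invalidated.

```c
/* Condensed acquire sketch; the data structures are assumptions, not the
 * real TreadMarks ones. One "interval" records which process wrote which
 * page at which local time. */
#include <stdio.h>
#include <stdbool.h>

#define NPROCS 3
#define NPAGES 4

typedef struct { int c[NPROCS]; } vclock;
typedef struct { int proc, time, page; } interval;

/* Toy history known to the releaser: P3's interval 1 wrote page 1. */
static interval history[] = { {2, 1, 1} };
static int nhistory = 1;

/* Acquirer-side page state. */
static bool valid[NPAGES] = { true, true, true, true };

static void acquire(vclock *acquirer, const vclock *releaser)
{
    /* Releaser: for every interval the acquirer's timestamp does not
     * cover, send an invalidation (a write notice, carrying no data). */
    for (int i = 0; i < nhistory; i++)
        if (history[i].time > acquirer->c[history[i].proc]) {
            valid[history[i].page] = false;
            printf("invalidate page %d (written in P%d's interval %d)\n",
                   history[i].page, history[i].proc + 1, history[i].time);
        }

    /* Acquirer: merge the releaser's vector timestamp (element-wise max).
     * It may not proceed past the acquire until this has happened. */
    for (int p = 0; p < NPROCS; p++)
        if (releaser->c[p] > acquirer->c[p]) acquirer->c[p] = releaser->c[p];
}

int main(void)
{
    vclock p1 = {{1, 0, 0}};   /* acquirer, as on the slides            */
    vclock p2 = {{1, 0, 1}};   /* releaser: it has seen P3's interval 1 */

    acquire(&p1, &p2);

    printf("P1 clock = (%d %d %d), page 1 valid = %d\n",
           p1.c[0], p1.c[1], p1.c[2], (int)valid[1]);
    return 0;
}
```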

24
TreadMarks Algorithm
[Diagram: P1 performs w(x); after the invalidation (carried with the (1 0 1) reply) arrives, diffs for the page are created and exchanged between P1 and P3]
  • Diffs are generated when
  • An invalidation is received (i.e., P1 had also made
    prior updates to this page)
  • The page is accessed (access miss)

25
TreadMarks Implementation: Data Structures
[Diagram: the page array data structure]
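The slide only names the page array, so the following is a guess at the kind of per-page bookkeeping such a structure needs (access state, a twin made at the first write, and per-process write notices whose diffs can be fetched lazily); the field names are illustrative, not the actual TreadMarks definitions.

```c
/* Illustrative guess at per-page bookkeeping for a page array; the field
 * names are not the actual TreadMarks definitions. */
#include <stdio.h>

#define NPROCS 8
#define NPAGES 1024

enum page_state { NO_ACCESS, READ_ONLY, READ_WRITE };

struct write_notice {          /* one recorded write interval for this page  */
    int proc, interval;        /* who wrote it, and in which interval        */
    void *diff;                /* the diff, fetched lazily (NULL until then) */
    struct write_notice *next;
};

struct page_entry {
    enum page_state state;                 /* current access rights          */
    void *twin;                            /* copy made at first write       */
    struct write_notice *notices[NPROCS];  /* per-process write-notice lists */
};

static struct page_entry page_array[NPAGES];  /* one entry per shared page */

int main(void)
{
    page_array[7].state = READ_WRITE;      /* e.g. this process writes page 7 */
    printf("page 7 state = %d\n", (int)page_array[7].state);
    return 0;
}
```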
26
TreadMarks Implementation: Locks
  • Each lock is statically assigned a manager
    (round-robin)
  • The manager keeps track of which processor last
    held the lock
  • Lock acquires are sent to the manager and
    forwarded to the last processor to hold the lock
    (see the sketch below)
  • Upon releasing the lock to the requester, the
    releaser sends (for each interval the requester
    has not seen)
  • Processor ID and Vector Timestamp
  • Any invalidations that are necessary
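A sketch of the request routing described above, with the static round-robin assignment modeled as lock_id modulo the processor count (an assumed scheme; the slide only says the assignment is static). The manager forwards each acquire to the last processor granted the lock, and that processor's reply would carry the per-interval information listed in the bullets.

```c
/* Lock request routing sketch. Static round-robin assignment is modeled
 * as lock_id % NPROCS (an assumed scheme); message contents are omitted. */
#include <stdio.h>

#define NPROCS 4
#define NLOCKS 16

/* Statically assigned manager for each lock. */
static int lock_manager(int lock_id) { return lock_id % NPROCS; }

/* Manager-side state: the last processor granted each lock. */
static int last_holder[NLOCKS];

/* Acquire goes to the manager, which forwards it to the last holder;
 * that processor's reply would carry, for each interval the requester
 * has not seen, its processor ID, vector timestamp, and invalidations. */
static void route_acquire(int lock_id, int acquirer)
{
    int mgr  = lock_manager(lock_id);
    int dest = last_holder[lock_id];
    printf("P%d -> manager P%d -> last holder P%d (lock %d)\n",
           acquirer, mgr, dest, lock_id);
    last_holder[lock_id] = acquirer;       /* requester becomes the holder */
}

int main(void)
{
    route_acquire(5, 2);   /* lock 5: manager P1, previously held by P0 */
    route_acquire(5, 3);   /* forwarded to P2, the previous holder      */
    return 0;
}
```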

27
TreadMarks Implementation: Barriers
  • Centralized barrier manager
  • Upon arrival at the barrier, each client
  • Notifies the manager of intervals that the manager
    does not already have
  • These are incorporated when the manager itself
    arrives at the barrier
  • When all clients have arrived
  • The manager notifies each client of the intervals it
    does not already have
  • Expensive: the manager becomes a communication
    hotspot (see the sketch below)
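A sketch of the centralized barrier described above. Here a process's "knowledge" of intervals is reduced to a vector timestamp and the exchange is modeled as an element-wise max; a real implementation would ship the actual interval and write-notice records. The two merge loops correspond to the arrival messages (clients to manager) and departure messages (manager to clients), which is why the manager is a hotspot.

```c
/* Centralized barrier sketch. "Knowledge" of intervals is reduced to a
 * vector timestamp; merging is an element-wise max. A real implementation
 * would ship the actual interval/write-notice records. */
#include <stdio.h>

#define NPROCS 3

typedef struct { int c[NPROCS]; } vclock;

static void merge(vclock *dst, const vclock *src)
{
    for (int i = 0; i < NPROCS; i++)
        if (src->c[i] > dst->c[i]) dst->c[i] = src->c[i];
}

int main(void)
{
    vclock clients[NPROCS] = { {{3, 0, 1}}, {{1, 2, 0}}, {{0, 0, 4}} };
    vclock manager = {{0, 0, 0}};

    /* Arrival: each client tells the manager what the manager lacks. */
    for (int p = 0; p < NPROCS; p++) merge(&manager, &clients[p]);

    /* Departure: once everyone has arrived, the manager tells each client
     * what that client lacks. Two messages per client, all through one
     * node, is what makes the centralized barrier expensive. */
    for (int p = 0; p < NPROCS; p++) merge(&clients[p], &manager);

    printf("after the barrier, client 0 knows (%d %d %d)\n",
           clients[0].c[0], clients[0].c[1], clients[0].c[2]);
    return 0;
}
```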

28
Limitations
  • Achieved nearly linear speedup for the TSP, Jacobi,
    Quicksort, and ILINK applications
  • Water
  • Each molecule in the simulation is protected by a
    lock and frequently accessed
  • Barriers are also used for synchronization
  • Speedup is limited by the algorithm's low
    computation-to-communication ratio (many
    fine-grained messages)

29
Limitations
  • TSP
  • Eager Release Consistency performs better than
    Lazy Release Consistency (Fig. 9)
  • Updates occur only on invalidations and access
    misses (i.e., at writes/synchronization points)
  • The TSP algorithm reads the stale current-minimum
    value without synchronizing

30
Limitations
  • Depends on events (write/synchronization) to
    trigger consistency operations
  • More opportunities to read stale data (TSP)
  • Reduced redundancy increases risk of data loss

31
Summary
  • Improves performance by improving the
    computation-to-communication ratio
  • Delays consistency updates until the page is
    actually acquired and accessed
  • Weaker consistency implies a greater likelihood of
    reading stale data and of data loss
  • Procrastination → Performance