Memory Coherence in Shared Virtual Memory Systems - PowerPoint PPT Presentation


1
Memory Coherence in Shared Virtual Memory Systems
  • by
  • Kai Li, Princeton University
  • Paul Hudak, Yale University
  • Presented by Shu Du
  • Assisted by Charles Reis

2
Motivation
  • Parallel computing platform
    • Supercomputer
      • Powerful, but expensive
    • Cluster of workstations, PCs
      • Cheap and scalable
  • Parallel programming model
    • Message passing
    • Shared memory

3
Virtual Memory
[Figure: a mapping manager presents applications with one virtual memory space (pages g1 to g7 and beyond). Main memory is shown holding g1-g6 and secondary storage g4-g9.]

4
Shared Virtual Memory's Architecture
[Figure: Node N (CPU N) holds pages g1, g7, and g5 locally. A mapping manager presents applications with one shared virtual memory (pages g1 to g7 and beyond).]

5
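The Main Problem: Memory Coherence
A memory is coherent if a read always returns the value of the most recent write to the same address.
[Figure: Node1 (CPU1) holds g1, g2, g3 and Node2 (CPU2) holds g1, g5, g6, out of the shared pages g1-g7. Step 1: write g1. Step 2: read g1.]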

6
Overview
  • Introduction to the solutions
  • Centralized Manager Algorithm
  • Dynamic Manager Algorithm
  • Experiment and results
  • Conclusion

7
Introduction to the possible solutions
8
Page Synchronization Solutions (1)
  • Page invalidation
  • When a processor Q has a write fault to page P,
    Q invalidates all copies of P held by other processors

[Figure: Node1 (CPU1) holds g1, g2, g3 and Node2 (CPU2) holds g1, g5, g6, out of the shared pages g1-g7. Step 1: write fault to g6.]
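
A minimal sketch of page invalidation, assuming each node's cache is a plain dictionary from page name to contents. The function name, the node names, and the page values are hypothetical and only illustrate the rule stated on the slide.

```python
def write_with_invalidation(nodes, writer, page, value):
    """On a write fault, drop every other copy of `page`, then write locally."""
    for name, cache in nodes.items():
        if name != writer:
            cache.pop(page, None)  # invalidate the remote copy, if one exists
    nodes[writer][page] = value


# Hypothetical starting state: both nodes cache the same old copy of g6.
nodes = {
    "Node1": {"g1": "a", "g2": "b", "g6": "old"},
    "Node2": {"g1": "a", "g5": "e", "g6": "old"},
}
write_with_invalidation(nodes, "Node1", "g6", "new")
```

After the call only the writer holds g6, so a later read on Node2 has to fault and fetch the current copy instead of seeing stale data.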

9
Page Synchronization Solutions (2)
  • Write broadcast
  • When a processor Q has a write fault to page P,
    Q writes to all copies of P

[Figure: Node1 (CPU1) holds g1, g2, g6 and Node2 (CPU2) holds g1, g5, g6, out of the shared pages g1-g7. Step 1: write fault to g6.]
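
For comparison, write broadcast can be sketched under the same assumptions (dictionary caches, hypothetical names and values). Instead of discarding remote copies, the writer updates each one.

```python
def write_with_broadcast(nodes, writer, page, value):
    """On a write fault, push the new value to every node holding a copy."""
    nodes[writer][page] = value
    for name, cache in nodes.items():
        if name != writer and page in cache:
            cache[page] = value  # update the remote copy in place


# Hypothetical starting state: both nodes cache the same old copy of g6.
nodes = {
    "Node1": {"g1": "a", "g2": "b", "g6": "old"},
    "Node2": {"g1": "a", "g5": "e", "g6": "old"},
}
write_with_broadcast(nodes, "Node1", "g6", "new")
```

Every copy stays readable, but each write costs one update per copy holder.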

10
Page Ownership Solutions
  • Owner: has write access to the page
  • Fixed ownership
  • Dynamic ownership
    • Centralized manager
    • Distributed manager
      • Fixed managing server
      • Dynamic managing server

11
Solutions to the Memory Coherence Problem
12
Centralized Manager Algorithm (CMA)
13
CMA's Data Structures
  • PTable
    • Kept by each processor
    • Fields: access, lock
  • Info table
    • Kept by the centralized manager
    • Fields: owner, copy set, lock
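
The two tables could be modeled roughly as follows. This is a sketch only: the class names, the three access levels, and the use of `threading.Lock` are assumptions made for illustration, while the field names come from the slide.

```python
import threading
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    """Assumed access levels a processor can hold on a page."""
    NIL = "nil"
    READ = "read"
    WRITE = "write"


@dataclass
class PTableEntry:
    """One page's entry in a processor's local PTable."""
    access: Access = Access.NIL
    lock: object = field(default_factory=threading.Lock)


@dataclass
class InfoEntry:
    """One page's entry in the centralized manager's Info table."""
    owner: str
    copy_set: set = field(default_factory=set)
    lock: object = field(default_factory=threading.Lock)


# Hypothetical state: N2 owns p6 and no other node holds a copy yet.
ptable = {"p6": PTableEntry()}
info = {"p6": InfoEntry(owner="N2")}
```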

14
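CMA's Read Fault Handler
[Figure: Manager, Node1 (CPU1), and Node2 (CPU2), with pages p1-p6 spread across the two nodes. Step 1: the faulting node asks the manager for p6. Manager tables shown: Owner p6 -> N2; Copy Set p6 -> (empty).]
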
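The read-fault path through a centralized manager might look like the sketch below. It assumes that the manager records the requester in the copy set and that the owner drops to read access when it sends a copy. Network messages are replaced by direct dictionary updates, locks are omitted, and all names are hypothetical.

```python
def cma_read_fault(info, ptables, memory, requester, page):
    """Manager-mediated read fault, with messages modeled as direct updates."""
    entry = info[page]                   # the manager looks up the page
    entry["copy_set"].add(requester)     # the requester will hold a read copy
    owner = entry["owner"]
    ptables[owner][page] = "read"        # assumed: the owner downgrades to read
    memory[requester][page] = memory[owner][page]  # the owner sends a copy
    ptables[requester][page] = "read"


# Hypothetical state: N2 owns p6 and N1 takes the read fault.
info = {"p6": {"owner": "N2", "copy_set": set()}}
ptables = {"N1": {"p6": "nil"}, "N2": {"p6": "write"}}
memory = {"N1": {}, "N2": {"p6": "data"}}
cma_read_fault(info, ptables, memory, "N1", "p6")
```
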
15
CMA's Write Fault Handler
[Figure: Manager, Node1, Node2, and Node3, with pages p1-p6 spread across the nodes. Step 1: the faulting node asks the manager for p6. Manager tables shown: Owner p6 -> N3; Copy Set p6 -> N2.]

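A write fault under the centralized manager can be sketched in the same simplified style. The assumptions here are that the manager invalidates every copy in the copy set, the old owner hands the page over, and the requester becomes the new owner. Messages are again direct updates, locks are omitted, and the names are hypothetical.

```python
def cma_write_fault(info, ptables, memory, requester, page):
    """Manager-mediated write fault, with messages modeled as direct updates."""
    entry = info[page]
    old_owner = entry["owner"]
    for node in entry["copy_set"]:
        if node not in (requester, old_owner):
            memory[node].pop(page, None)   # invalidate the read copy
            ptables[node][page] = "nil"
    entry["copy_set"] = set()
    if old_owner != requester:
        # The old owner sends the page and gives up its own access.
        memory[requester][page] = memory[old_owner].pop(page)
        ptables[old_owner][page] = "nil"
    entry["owner"] = requester
    ptables[requester][page] = "write"


# Hypothetical state: N3 owns p6, N2 holds a read copy, N1 takes the fault.
info = {"p6": {"owner": "N3", "copy_set": {"N2"}}}
ptables = {"N1": {"p6": "nil"}, "N2": {"p6": "read"}, "N3": {"p6": "read"}}
memory = {"N1": {}, "N2": {"p6": "data"}, "N3": {"p6": "data"}}
cma_write_fault(info, ptables, memory, "N1", "p6")
```
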
16
Summary of the CMA
  • Straightforward and easy to implement
  • But the single manager becomes a traffic bottleneck

17
Distributed Manager Algorithm (DMA)
18
Fixed Manager
  • Each processor manages a predetermined subset of
    the pages
  • Difficulty: finding an appropriate mapping from
    pages to processors, since different applications
    may have different page access patterns
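
One simple fixed mapping, assumed here purely for illustration, spreads pages evenly by taking the page number modulo the number of processors. The function name and the processor count are hypothetical.

```python
def manager_of(page_number, num_processors):
    """Return the index of the processor that manages `page_number`."""
    return page_number % num_processors


# Pages 1..7 distributed over 3 hypothetical processors (indices 0..2).
assignment = {page: manager_of(page, 3) for page in range(1, 8)}
```

Such a mapping balances the number of pages per manager but ignores which processors actually use each page, which is the difficulty the slide points out.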

19
Dynamic Manager
  • A simple way to locate a page's manager is to
    broadcast the request, but that adds a lot of
    overhead.
  • Use a probOwner chain instead
    • Kept in each processor's local PTable
    • Initially, every probOwner entry is set to the
      same processor
    • Changes on a write-page fault as well as on a
      read-page fault
    • Points to the true owner or to a probable one
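
Following a probOwner chain can be sketched as a loop over hints. The `owners` dictionary below is a stand-in for each node knowing whether it owns the page, and it assumes every chain of hints eventually reaches the true owner. All names are hypothetical.

```python
def find_owner(prob_owner, owners, start, page):
    """Follow probOwner hints from `start` until the true owner is reached."""
    node, path = start, [start]
    while owners[page] != node:
        node = prob_owner[node][page]  # forward to the next probable owner
        path.append(node)
    return node, path


# Hypothetical hints: N1 believes N2 owns p6, N2 believes N3 does, N3 is right.
prob_owner = {"N1": {"p6": "N2"}, "N2": {"p6": "N3"}, "N3": {"p6": "N3"}}
owners = {"p6": "N3"}
owner, path = find_owner(prob_owner, owners, "N1", "p6")
# owner is "N3" and path is ["N1", "N2", "N3"]
```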

20
Dynamic DMA's Read Fault Handling
[Figure: Node 1, Node 2, and Node 3, each with ProbOwner, Copy Set, and Access entries for p6, and pages p1-p8 spread across the nodes. Step 1: ask for p6. Entries in slide order: ProbOwner p6 -> N2, p6 -> N3, p6 -> N3; Copy Set p6 -> N2, p6 -> N2, p6 -> (empty); Access p6 -> N/A, p6 -> READ, p6 -> READ.]

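A rough single-page sketch of a read fault under the dynamic scheme follows. The transcript does not spell out which hint updates happen on a read fault, so this sketch assumes only that the requester's probOwner is set to the node that served the page. The helper `make_node` and all field names are hypothetical.

```python
def make_node(prob_owner, owner=False, access="nil"):
    """Build the per-node state kept for one page (hypothetical layout)."""
    return {"prob_owner": prob_owner, "owner": owner,
            "access": access, "copy_set": set()}


def read_fault(nodes, requester):
    """Forward a read request along probOwner hints until the owner answers."""
    hop = nodes[requester]["prob_owner"]
    while not nodes[hop]["owner"]:
        hop = nodes[hop]["prob_owner"]    # a non-owner forwards the request
    nodes[hop]["copy_set"].add(requester)  # the owner tracks the new reader
    nodes[hop]["access"] = "read"
    nodes[requester]["access"] = "read"
    nodes[requester]["prob_owner"] = hop   # assumed hint update
    return hop


# Hypothetical state for one page: N3 owns it and N1 takes the read fault.
nodes = {
    "N1": make_node("N2"),
    "N2": make_node("N3"),
    "N3": make_node("N3", owner=True, access="write"),
}
served_by = read_fault(nodes, "N1")
```
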
21
Dynamic DMA's Write Fault Handling
[Figure: Node 1, Node 2, and Node 3, each with ProbOwner, Copy Set, and Access entries for p6, and pages p1-p8 spread across the nodes. Step 1: ask for p6. Entries in slide order: ProbOwner p6 -> N2, p6 -> N3, p6 -> N3; Copy Set p6 -> N2, p6 -> N2, p6 -> (empty); Access p6 -> N/A, p6 -> READ, p6 -> READ.]

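A write fault under the dynamic scheme transfers ownership, which the single-page sketch below models. It assumes that every node that forwarded the request or had its copy invalidated, along with the old owner, resets its probOwner hint to the requester. The helper `make_node` and the field names are hypothetical.

```python
def make_node(prob_owner, owner=False, access="nil", copy_set=()):
    """Build the per-node state kept for one page (hypothetical layout)."""
    return {"prob_owner": prob_owner, "owner": owner,
            "access": access, "copy_set": set(copy_set)}


def write_fault(nodes, requester):
    """Find the owner through probOwner hints, then take over ownership."""
    hop, forwarders = nodes[requester]["prob_owner"], []
    while not nodes[hop]["owner"]:
        forwarders.append(hop)
        hop = nodes[hop]["prob_owner"]
    old_owner = nodes[hop]
    readers = old_owner["copy_set"] - {requester}
    for name in readers:
        nodes[name]["access"] = "nil"      # invalidate the read copy
    old_owner.update(owner=False, access="nil", copy_set=set())
    for name in set(forwarders) | readers | {hop}:
        nodes[name]["prob_owner"] = requester  # assumed hint update
    nodes[requester].update(owner=True, access="write", prob_owner=requester)
    return hop


# Hypothetical state: N3 owns the page, N2 holds a read copy, N1 faults.
nodes = {
    "N1": make_node("N2"),
    "N2": make_node("N3", access="read"),
    "N3": make_node("N3", owner=True, access="read", copy_set={"N2"}),
}
previous_owner = write_fault(nodes, "N1")
```
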
22
Summary of the Distributed Manager Algorithm
  • Fixed DMA alleviates the centralized manager's
    bottleneck, but a good allocation scheme may not
    be easy to find.
  • Dynamic DMA is more flexible and may adapt more
    easily to the locality of memory accesses.

23
Experiment and results
24
Experimental System: IVY
  • Integrated Shared Virtual Memory at Yale
  • Distributed environment
    • Apollo DOMAIN computers
    • Modified Aegis operating system
    • Apollo ring network
  • Memory coherence algorithms
    • Centralized manager
    • Fixed distributed manager
    • Dynamic distributed manager

25
Benchmarks and Metric
  • Practical parallel programs
    • Parallel Jacobi program for 3D partial
      differential equations (PDEs)
    • Parallel matrix multiply C = AB
    • Parallel dot-product
  • Speedup = T(single-processor) / T(multi-processor)
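
The speedup metric is a plain ratio of run times. The timings below are made-up numbers for illustration only, not measurements from the experiments.

```python
def speedup(t_single, t_multi):
    """Speedup = single-processor time / multi-processor time."""
    if t_multi <= 0:
        raise ValueError("multi-processor time must be positive")
    return t_single / t_multi


# Hypothetical timings in seconds, not results from the presentation.
example = speedup(120.0, 30.0)  # 4.0
```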

26
Speedup of 3D PDE
27
Speedup of dot-product
28
Speedup of Matrix multiplication
29
Overhead of the algorithms
30
Conclusion
  • Implementing shared virtual memory on a cluster
    system is indeed practical
  • The dynamic distributed manager algorithm has the
    most desirable overall features