User-Level Interprocess Communication for Shared Memory Multiprocessors - PowerPoint PPT Presentation

1
User-Level Interprocess Communication for Shared
Memory Multiprocessors
  • Brian N. Bershad, Thomas E. Anderson, Edward D.
    Lazowska, and Henry M. Levy
  • Presented By Yahia Mahmoud

2
Introduction
  • IPC is central to OS design
  • Encourages decomposition across address space
    boundaries
  • Failure isolation failures do not leak across
    address spaces
  • Extensibility modules can be added dynamically
  • Modularity
  • Slow IPC forces a trade-off between performance
    and decomposition

3
Problems
  • IPC has traditionally been the kernel's
    responsibility, which creates two problems
  • Architectural performance barriers
  • Overhead of a kernel-mediated cross-address-space
    call accounts for 70% of LRPC overhead
  • Interaction with user-level threads
  • Partitioning communication and thread management
    across protection boundaries has a high
    performance cost

4
Solution
  • Eliminate kernel from cross-address space
    communication
  • Use shared memory as data transfer channel
  • Avoid processor reallocation use an already
    active processor in the target address space
  • This approach improves performance for the
    reasons on the next slide

5
Advantages
  • Messages are sent without kernel involvement
  • Avoiding processor reallocation reduces call
    overhead and preserves cache state
  • Processor reallocation overhead can be amortized
    over several calls
  • Sending and receiving of messages can proceed in
    parallel

6
URPC
  • Messages are passed through logical memory
    channels, created and mapped for every
    client/server pair
  • Thread management is at user level no kernel is
    involved in sending messages

7
Software Components
8
URPC
  • Separates IPC into three components
  • (1) Data transfer
  • (2) Thread management
  • (3) Processor reallocation
  • Goals
  • Move (1) and (2) to user level
  • (3) needs the kernel, but try to avoid it. Why?
  • Scheduling cost of deciding which address space
    runs on the processor
  • Virtual memory mapping costs
  • Long-term costs from invalidated cache and TLB
    state

9
URPC
  • Calls appear synchronous to the programmer but
    are asynchronous underneath
  • The client thread blocks and another ready
    thread runs
  • LRPC reused the blocked client thread to run the
    call in the server
  • URPC always tries to schedule another ready
    thread instead
  • Avoids context-switching overhead
  • If load balancing is needed, the client lends its
    processor to the server via the kernel, and the
    server returns it after processing messages
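The control flow on this slide can be sketched as follows. This is a minimal illustration, not the paper's actual interface: `urpc_channel`, `thread_yield`, and `ready_threads` are hypothetical stand-ins for the shared-memory channel and the user-level thread scheduler.

```c
#include <stdbool.h>

/* Hypothetical sketch of a synchronous-looking URPC call built
 * on an asynchronous shared-memory channel. */

typedef struct {
    int  msg;    /* marshaled argument or result */
    bool full;   /* a message is pending in the channel */
} urpc_channel;

static int ready_threads = 1;        /* other runnable client threads */
static void thread_yield(void) { }   /* user-level scheduler would run one */

/* Client stub: posts the request, then runs other client threads
 * (never trapping to the kernel) until the reply appears. */
int urpc_call(urpc_channel *send, urpc_channel *recv, int arg) {
    send->msg  = arg;        /* marshal into shared memory       */
    send->full = true;       /* post the message; no kernel trap */
    while (!recv->full)      /* reply still pending...           */
        if (ready_threads > 0)
            thread_yield();  /* ...so run other client work      */
    recv->full = false;
    return recv->msg;        /* unmarshal the reply */
}
```

The caller sees an ordinary blocking procedure call; the asynchrony is hidden inside the stub.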

10
Processor Reallocation
  • The kernel uses pessimistic reallocation
    (handoff scheduling)
  • This policy does not always improve performance
  • Centralized kernel data structures create a
    performance bottleneck (lock contention on
    thread run queues and message channels)

11
Processor Reallocation
  • Use an optimistic reallocation policy, which
    assumes
  • The client has other work to do, so delaying
    message processing on the server side has no
    performance side effect -> inexpensive context
    switch
  • The server is not underpowered it has, or soon
    will have, a processor to process the message ->
    the client executes in parallel with the server
  • These assumptions do not hold when a
    time-sensitive service is needed (real-time
    systems, high-latency I/O operations)
  • When reallocation is needed, it is done via the
    kernel
  • An idle processor can donate itself to an
    underpowered address space
  • The identity of the donating processor is made
    known to the receiver
  • There is no guarantee that the processor will be
    returned to the donor
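The two assumptions above can be captured in a small decision helper. This is a hypothetical sketch: `load_state` and `should_donate` are illustrative names, and the paper's actual policy involves more state than these three counters.

```c
#include <stdbool.h>

/* Illustrative inputs to the optimistic reallocation decision. */
typedef struct {
    int client_ready_threads;  /* other runnable threads in the client  */
    int server_processors;     /* processors active in the server space */
    int server_pending_msgs;   /* messages waiting at the server        */
} load_state;

/* Stay at user level while the client has other work and the server
 * has (or will have) a processor; fall back to a kernel processor
 * donation only when both optimistic assumptions fail. */
bool should_donate(const load_state *s) {
    bool client_idle = (s->client_ready_threads == 0);
    bool server_underpowered =
        (s->server_processors == 0 && s->server_pending_msgs > 0);
    return client_idle && server_underpowered;
}
```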

12
Example
13
Data Transfer
  • Arguments can be passed through shared memory
    while still guaranteeing safety
  • Stubs are responsible for communication safety
  • On receipt of a message, stubs unmarshal the data
    into parameters and do the copying and checking
    needed to ensure application safety
  • There is no need to involve the kernel the stubs
    can do the copying directly
  • Data is passed on the stack or heap, and stubs
    copy it directly
  • Even for a type-safe language, copying through
    the kernel does not guarantee that the data is
    type checked
  • Shared memory queues are controlled with
    test-and-set locks, with no spinning
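A queue in this style might look like the following C11 sketch. It is illustrative only: the paper's channels carry marshaled call records rather than plain integers, and the names are invented. The key point is that a caller whose test-and-set fails returns immediately and retries later instead of spinning on the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QCAP 8

/* Shared-memory message queue guarded by a test-and-set lock. */
typedef struct {
    atomic_flag lock;
    int buf[QCAP];
    int head, tail, count;
} msg_queue;

/* Returns false if the lock was busy or the queue full; the
 * caller does other work and retries rather than spinning. */
bool mq_try_send(msg_queue *q, int msg) {
    if (atomic_flag_test_and_set(&q->lock))  /* test-and-set */
        return false;                        /* busy: no spinning */
    bool ok = (q->count < QCAP);
    if (ok) {
        q->buf[q->tail] = msg;
        q->tail = (q->tail + 1) % QCAP;
        q->count++;
    }
    atomic_flag_clear(&q->lock);
    return ok;
}

bool mq_try_recv(msg_queue *q, int *msg) {
    if (atomic_flag_test_and_set(&q->lock))
        return false;
    bool ok = (q->count > 0);
    if (ok) {
        *msg = q->buf[q->head];
        q->head = (q->head + 1) % QCAP;
        q->count--;
    }
    atomic_flag_clear(&q->lock);
    return ok;
}
```

Because the critical sections are a few instructions long, failing and retrying later is cheaper than spinning or blocking in the kernel.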

14
Thread Management
  • Fine-grained parallel applications need
    high-performance thread management, which can
    only be achieved by implementing it at user
    level
  • Communication and thread management achieve
    very good performance when both are implemented
    at user level
  • Threads block in order to
  • Synchronize their activities within the same
    address space
  • Wait for external events from a different
    address space
  • Communication implemented at kernel level
    results in synchronization at both user level
    and kernel level

15
Performance
16
Call Latency and Throughput
  • Call latency is the time from when a thread
    calls into the stub until control returns from
    the stub
  • Latency and throughput are load dependent, and
    depend on
  • Number of client processors (C)
  • Number of server processors (S)
  • Number of runnable threads in the client's
    address space (T)
  • The graphs measure how long it takes to make
    100,000 Null procedure calls into the server in
    a tight loop
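The measurement loop can be sketched as follows. This is a hypothetical harness: `null_call` stands in for the real cross-address-space Null stub, and the real experiment varies C, S, and T.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

static void null_call(void) { }  /* stand-in for the URPC Null stub */

/* Time n back-to-back null calls in a tight loop and return the
 * mean latency in microseconds per call. */
double measure_latency_us(long n) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < n; i++)
        null_call();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return elapsed * 1e6 / (double)n;   /* us per call */
}
```

Dividing total elapsed time by the call count amortizes timer overhead across all 100,000 calls, which matters when a single call takes only a few microseconds.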

17
Call Latency and Throughput
18
Conclusions
  • Performance gains come from moving features out
    of the kernel, not into it
  • URPC represents an appropriate division of
    responsibility for operating system kernels on
    shared-memory multiprocessors
  • URPC showcases a design specific to
    multiprocessors, not just a uniprocessor design
    that runs on multiprocessor hardware