1
Fast Communication
  • Firefly RPC
  • Lightweight RPC
  • CS 614
  • Tuesday, March 13, 2001
  • Jeff Hoy

2
Why Remote Procedure Call?
  • Simplify building distributed systems and
    applications
  • Looks like local procedure call
  • Transparent to user
  • Balance between semantics and efficiency
  • Universal programming tool
  • Secure inter-process communication

3
RPC Model
[Diagram: the Call path runs from the Client Application through the Client Stub and Client Runtime, across the Network, to the Server Runtime, Server Stub, and Server Application; the Return path is the reverse]
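To make the transparency concrete, here is a minimal sketch (all names hypothetical, not from any RPC runtime) of how a client stub hides a remote call behind an ordinary procedure signature:

    #include <stdio.h>

    /* Stand-in for the client runtime and network: a real runtime would
     * marshal the arguments, transmit the call packet, and block for the
     * reply. Computing locally keeps the sketch runnable. */
    static int send_and_wait(const char *proc, int a, int b) {
        (void)proc;
        return a + b;
    }

    /* Client stub: identical signature to a local procedure, so the
     * application cannot tell the call is remote. */
    int add(int a, int b) {
        return send_and_wait("add", a, b);
    }

    int main(void) {
        printf("%d\n", add(2, 3));  /* looks exactly like a local call */
        return 0;
    }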
4
RPC In Modern Computing
  • CORBA and Internet Inter-ORB Protocol (IIOP)
    • Each CORBA server object exposes a set of methods
  • DCOM and Object RPC
    • Built on top of RPC
  • Java and Java Remote Method Protocol (JRMP)
    • Interface exposes a set of methods
  • XML-RPC, SOAP
    • RPC over HTTP and XML

5
Goals
  • Firefly RPC
    • Inter-machine communication
    • Maintain security and functionality
    • Speed
  • Lightweight RPC
    • Intra-machine communication
    • Maintain security and functionality
    • Speed

6
Firefly RPC
  • Hardware
    • DEC Firefly multiprocessor
    • 1 to 5 MicroVAX CPUs per node (concurrency considerations)
    • 10 megabit Ethernet
  • Takes advantage of 5 CPUs

7
Fast Path in an RPC
  • Transport mechanisms
    • IP / UDP
    • DECNet byte stream
    • Shared memory (intra-machine only)
  • Transport is determined at bind time
  • Implemented inside the transport procedures: Starter, Transporter, and Ender on the caller side, and Receiver on the server side (sketched below)
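One way to picture how those four procedures are carried in a binding as procedure variables; the struct layout and signatures are assumptions, not the Firefly declarations:

    #include <stddef.h>

    typedef struct Packet Packet;  /* opaque packet buffer */

    /* Transport procedures reached through procedure variables that are
     * filled in at bind time, so no per-call transport lookup is needed. */
    typedef struct Transport {
        Packet *(*starter)(size_t arg_bytes);  /* get a call packet buffer */
        Packet *(*transporter)(Packet *call);  /* send, wait for the reply */
        void    (*ender)(Packet *reply);       /* free the reply packet    */
        void    (*receiver)(Packet *call);     /* server-side entry point  */
    } Transport;

    /* At bind time the runtime would point these at the UDP/IP, DECNet,
     * or shared-memory implementations. */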

8
Caller Stub
  • Gets control from calling program
  • Calls Starter for packet buffer
  • Copies arguments into the buffer
  • Calls Transporter and waits for reply
  • Copies result data into the caller's result variables
  • Calls Ender and frees result packet
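The sequence above, written out as a C sketch; the procedure names follow the slide, but the signatures and the marshal helpers are assumed for illustration:

    #include <stddef.h>

    typedef struct Packet Packet;

    /* Assumed signatures for the transport procedures named above. */
    Packet *Starter(size_t arg_bytes);      /* get a packet buffer       */
    Packet *Transporter(Packet *call);      /* transmit, block for reply */
    void    Ender(Packet *reply);           /* free the result packet    */
    void    marshal_int(Packet *p, int v);  /* hypothetical helpers      */
    int     unmarshal_int(Packet *p);

    /* Caller stub for a hypothetical remote add(a, b). */
    int add_stub(int a, int b) {
        Packet *call = Starter(2 * sizeof(int)); /* 1. buffer from Starter */
        marshal_int(call, a);                    /* 2. copy arguments in   */
        marshal_int(call, b);
        Packet *reply = Transporter(call);       /* 3. send and wait       */
        int sum = unmarshal_int(reply);          /* 4. copy result out     */
        Ender(reply);                            /* 5. free result packet  */
        return sum;
    }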

9
Server Stub
  • Receives incoming packet
  • Copies data onto the stack or into a new data block, or leaves it in the packet
  • Calls the server procedure
  • Copies the result into the call packet and transmits it
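The matching server-side sketch (again with assumed names); note that the result reuses the call packet, which slide 14 lists among the performance enhancements:

    #include <stddef.h>

    typedef struct Packet Packet;

    int  unmarshal_int(Packet *p);          /* hypothetical helpers again */
    void marshal_int(Packet *p, int v);
    void transmit_reply(Packet *p);

    int add(int a, int b) { return a + b; } /* the actual server procedure */

    void add_server_stub(Packet *call) {
        int a = unmarshal_int(call);        /* copy arguments out of packet */
        int b = unmarshal_int(call);
        int sum = add(a, b);                /* call the server procedure    */
        marshal_int(call, sum);             /* reuse the call packet for    */
        transmit_reply(call);               /* the result, then transmit    */
    }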

10
Transport Mechanism
  • Transporter procedure
  • Completes RPC header
  • Calls Sender to complete UDP, IP, and Ethernet
    headers (Ethernet is the chosen means of
    communication)
  • Invokes the Ethernet driver via a kernel trap and queues the packet

11
Transport Mechanism
  • Receiver procedure
  • Server thread awakens in Receiver
  • Receiver calls the stub interface included in
    the received packet, and the interface stub calls
    the procedure stub
  • The reply path is similar

12
Threading
  • Client Application creates RPC thread
  • Server Application creates call thread
  • Threads operate in the server application's address space
  • No need to spawn an entire process
  • Threads must consider locking shared resources (see the sketch below)
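Since call threads run concurrently inside one address space, shared state needs mutual exclusion. A minimal illustration using modern POSIX threads (not the Firefly threads package):

    #include <pthread.h>

    static long calls_handled = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* Body run by each server call thread: the shared counter must be
     * guarded because many call threads share the address space. */
    void handle_call(void) {
        pthread_mutex_lock(&lock);
        calls_handled++;
        pthread_mutex_unlock(&lock);
    }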

13
Threading
14
Performance Enhancements
  • Over traditional RPC:
    • Stubs marshal arguments directly, rather than handing them to library functions
    • RPC procedures are called through procedure variables rather than through a lookup table (contrasted below)
    • Server retains the call packet for results
    • Buffers reside in shared memory
  • Sacrifices abstract structure
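A sketch of the dispatch difference; the names are illustrative, not Firefly's:

    typedef struct Packet Packet;
    typedef void (*ProcStub)(Packet *call);

    /* Slower path: search a table by procedure name on every call. */
    ProcStub table_lookup(const char *name);  /* hypothetical table search */
    void dispatch_by_name(const char *name, Packet *p) {
        table_lookup(name)(p);                /* string lookup per call    */
    }

    /* Firefly-style fast path: the binding already holds the procedure
     * variable, so dispatch is a single indirect call. */
    struct Binding { ProcStub stub; };
    void dispatch_bound(struct Binding *b, Packet *p) {
        b->stub(p);
    }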

15
Performance Analysis
  • Null() Procedure
  • No arguments or return value
  • Measures base latency of RPC mechanism
  • Multi-threaded caller and server

16
Time for 10,000 RPCs
  • Base latency: 2.66 ms
  • MaxResult latency (1500 bytes): 6.35 ms

17
Send and Receive Latency
18
Send and Receive Latency
  • With larger packets, transmission time dominates
  • Overhead becomes less of an issue
  • Good for Firefly RPC, assuming large transmissions over the network
  • Is overhead acceptable for intra-machine
    communication?

19
Stub Latency
  • Significant overhead for small packets

20
Fewer Processors
  • Seconds for 1,000 Null() calls

21
Fewer Processors
  • Why the slowdown with one processor?
  • The fast path can be followed only in a multiprocessor environment
  • Lock conflicts, scheduling problems
  • Why little speedup past two processors?

22
Future Improvements
  • Hardware
    • A faster network will help larger packets
    • Tripling CPU speed would reduce Null() time by 52% and MaxResult time by 36%
  • Software
    • Omit IP and UDP headers for Ethernet datagrams: 24% gain
    • Redesign the RPC protocol: 5% gain
    • Busy thread wait: 10-15% gain
    • Write more in assembler: 5-10% gain
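    (For scale, applying the hardware figure to slide 16's measurement: a 52% reduction would cut the 2.66 ms Null() base latency to roughly 1.3 ms.)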

23
Other Improvements
  • Firefly RPC handles intra-machine communication
    through the same mechanisms as inter-machine
    communication
  • Firefly RPC also has very high overhead for small
    packets
  • Does this matter?

24
RPC Size Distribution
  • Majority of RPC transfers under 200 bytes

25
Frequency of Remote Activity
  • Most calls are to the same machine

26
Traditional RPC
  • Most calls are small messages that take place
    between domains of the same machine
  • For these calls, traditional RPC contains unnecessary overhead:
    • Scheduling
    • Copying
    • Access validation

27
Lightweight RPC (LRPC)
  • Also written for the DEC Firefly system
  • Mechanism for communication between different
    protection domains on the same system
  • Significant performance improvements over
    traditional RPC

28
Overhead Analysis
  • Theoretical minimum to invoke Null() across domains: a kernel trap and a context change to call, and a trap and a context change to return
  • Theoretical minimum on the Firefly: 109 us
  • Actual cost: 464 us

29
Sources of Overhead
  • 355 us of added overhead, from:
    • Stub overhead
    • Message buffer overhead (not a large factor in Firefly RPC)
    • Message transfer and flow control
    • Scheduling and abstract threads
    • Context switch

30
Implementation of LRPC
  • Similar to RPC
  • Calls to the server are made through a kernel trap
  • The kernel validates the caller
  • Servers export interfaces
  • Clients bind to server interfaces before making a
    call

31
Binding
  • Servers export interfaces through a clerk
  • The clerk registers the interface
  • Clients bind to the interface through a call to
    the kernel
  • Server replies with an entry address and the size of its argument stack (A-stack)
  • Client gets a Binding Object from the kernel
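One way to picture the binding state; these struct layouts are assumptions based on the description above, not the paper's declarations:

    #include <stddef.h>

    /* What the server's clerk registers for each exported procedure. */
    typedef struct {
        void  *entry_addr;    /* server entry address                 */
        size_t astack_size;   /* size of its argument stack (A-stack) */
    } ProcedureDescriptor;

    /* What the kernel hands back to the client at bind time. */
    typedef struct {
        ProcedureDescriptor *procs;    /* one per procedure in interface */
        int                  nprocs;
        void                *astacks;  /* A-stacks shared with server    */
    } BindingObject;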

32
Calling
  • Each procedure is represented by a stub
  • Client makes a call through the stub, which
    • Manages A-stacks
    • Traps to the kernel
  • Kernel switches context to the server
  • Server returns by its own stub
    • No verification needed
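A sketch of that call path; astack_for and kernel_lrpc_trap are invented names standing in for the stub's A-stack management and the kernel trap:

    typedef struct BindingObject BindingObject;

    int *astack_for(BindingObject *b, int proc);  /* pick a free A-stack */
    long kernel_lrpc_trap(BindingObject *b, int proc, void *astack);

    /* Client stub: arguments go straight onto the shared A-stack, then a
     * single trap lets the kernel validate the binding and run this same
     * thread in the server's domain. */
    long add_lrpc_stub(BindingObject *b, int x, int y) {
        int *args = astack_for(b, /*proc=*/0);
        args[0] = x;
        args[1] = y;
        return kernel_lrpc_trap(b, 0, args);
    }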

33
Stub Generation
  • Procedure representation
  • Call stub for client
  • Entry stub for server
  • LRPC merges protocol layers
  • Stub generator creates run-time stubs in assembly
    language
  • Portability sacrificed for performance
  • Falls back on Modula2+ for complex calls

34
Multiple Processors
  • LRPC caches domains on idle processors
  • Kernel checks for an idling processor in the
    server domain
  • If one is found, the caller's thread can execute on the idle processor without switching context

35
Argument Copying
  • Traditional RPC copies arguments four times for
    intra-machine calls
  • Client stub, to RPC message, to kernel's message, to server's message, to server's stack
  • In many cases, LRPC needs to copy the arguments
    only once
  • Client stub to A-stack, as contrasted below
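A compilable contrast of the two copy paths (buffer names are illustrative):

    #include <string.h>

    /* Traditional cross-domain RPC: four copies of the argument bytes. */
    void traditional_copy_path(const void *args, size_t n, void *rpc_msg,
                               void *kernel_msg, void *server_msg,
                               void *server_stack) {
        memcpy(rpc_msg, args, n);            /* 1. client stub -> RPC message */
        memcpy(kernel_msg, rpc_msg, n);      /* 2. -> kernel's message        */
        memcpy(server_msg, kernel_msg, n);   /* 3. -> server's message        */
        memcpy(server_stack, server_msg, n); /* 4. -> server's stack          */
    }

    /* LRPC: one copy, because the A-stack is shared between the domains. */
    void lrpc_copy_path(const void *args, size_t n, void *shared_astack) {
        memcpy(shared_astack, args, n);      /* client stub -> A-stack, done  */
    }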

36
Performance Analysis
  • LRPC is roughly three times faster than
    traditional RPC
  • Null() LRPC costs 157 us, close to the 109 us theoretical minimum
  • Additional overhead from stub generation and
    kernel execution

37
Single-Processor Null() LRPC
38
Performance Comparison
  • LRPC versus traditional RPC (in us)

39
Multiprocessor Speedup
40
Inter-machine Communication
  • LRPC is best for messages between domains on the same machine
  • The first instruction of the LRPC stub checks whether the call is cross-machine
  • If so, the stub branches to conventional RPC (sketched below)
  • Larger messages are handled well; LRPC scales linearly with packet size, like traditional RPC
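A sketch of that first-instruction check; the field and function names are assumed:

    typedef struct Binding { int is_remote; } Binding;

    long conventional_rpc(Binding *b, int proc, void *args); /* network path */
    long lrpc_kernel_trap(Binding *b, int proc, void *args); /* local path   */

    /* First thing the stub does: test whether the binding is remote. */
    long stub_call(Binding *b, int proc, void *args) {
        if (b->is_remote)
            return conventional_rpc(b, proc, args);
        return lrpc_kernel_trap(b, proc, args);
    }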

41
Cost
  • LRPC avoids needless scheduling, copying, and
    locking by integrating the client, kernel,
    server, and message protocols
  • Abstraction is sacrificed for functionality
  • RPC is built into operating systems (Linux DCE
    RPC, MS RPC)

42
Conclusion
  • Firefly RPC is fast compared to most RPC
    implementations. LRPC is even faster. Are they
    fast enough?
  • "The performance of Firefly RPC is now good enough that programmers accept it as the standard way to communicate" (1990)
  • Is speed still an issue?