Multiple Bypass: Interposition agents for distributed computing

Title: Bypass: A tool for building split execution systems. Last modified by: Douglas Thain. Created: 7/27/2000 9:03:37 PM.



1
Multiple Bypass: Interposition agents
for distributed computing
2
Overview
  • Good news and bad news.
  • Our solution: Bypass
  • Three simple (but useful) examples
  • Problems
  • Impedance Matching
  • Composition
  • Related and Future Work

3
Good News!
  • New distributed systems give you access to untold
    computing resources around the world.

4
Bad News
  • Your programs won't run on them.

5
[Figure: home machine and remote machine; on the remote machine: "core dumped"]
6
Why not?
  • Interface mismatch:
  • open() != OpenFile()
  • open() != super_duper_open()
  • Resource mismatch:
  • open(datafile) -> doesn't exist!
  • open(output) -> no space for you!
  • getpwnam(thain) -> who is that?

7
Just rewrite your programs!
  • Not possible:
  • Commercial application
  • Don't know how!
  • Unwilling to spend time/money to achieve
    uncertain benefits.
  • N programs x M systems: not scalable

8
Solution: Interposition Agent
  • An agent can solve an interface mismatch by
    converting the application's operations into
    those provided by the available system.
  • An agent can solve a resource mismatch by sending
    the application's operations to be executed
    elsewhere: split execution.

9
Solution to Interface Mismatch
[Diagram: the Application calls open(); the Agent translates it into super_duper_name_lookup() and super_duper_open() in the Super-Duper Library.]
10
Solution to Resource Mismatch
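[Diagram: on the Remote Machine, the Application sits above the Agent, the Standard Lib, and the Kernel; on the Home Machine, the Shadow sits above the Standard Lib and the Kernel. The Agent reaches the Shadow via RPC.]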
11
[Figure: remote machine and home machine]
12
Interposition Agents are an Open Research Topic
  • Several systems have been built, each with
    various strengths and weaknesses.
  • What is the appropriate mechanism?
  • What are the semantics of stacking?
  • Interesting problems result when we do impedance
    matching.

13
Split Execution is an Open Research Topic
  • We want to explore many possibilities:
  • Remote machine has some needed resources, but not
    all.
  • Data may be buffered and cached at both the agent
    and the shadow.
  • What procedure calls to trap depends on the
    application and the services needed.
  • Some procedure calls could be routed to third
    parties such as file servers.

14
Split Execution is Hard
  • One example of many: trapping stat()
  • Different data types:
  • struct stat, struct stat64
  • Depending on the system, integer elements are 2 to 8
    bytes
  • Multiple entry points:
  • stat, _stat, __libc_stat
  • Surprises:
  • #define stat(a,b) _fxstat(VERSION,a,b)
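
The data-type problem above can be made concrete with a small sketch. It assumes two hypothetical layouts (`wire_stat` and `small_stat` are invented for illustration and are not Bypass types): a wide record exchanged between agent and shadow, and a narrow record on a platform with small integer fields. The conversion has to be allowed to fail rather than truncate silently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only. */
struct wire_stat  { int64_t size; int64_t mode; };   /* widest form, sent between machines */
struct small_stat { int32_t size; uint16_t mode; };  /* a platform with narrow fields */

/* Narrow a wire record into the local layout.
 * Returns 0 on success, or -1 with errno = EOVERFLOW if a value does not fit. */
int wire_to_small(const struct wire_stat *in, struct small_stat *out)
{
    assert(in != NULL && out != NULL);
    if (in->size < 0 || in->size > INT32_MAX ||
        in->mode < 0 || in->mode > UINT16_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    out->size = (int32_t)in->size;
    out->mode = (uint16_t)in->mode;
    return 0;
}
```

A caller that receives -1 can then decide whether to fail the trapped call or substitute a safe value.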

15
Is this new?
  • Several previous systems have built gigantic and
    ambitious agents to virtualize the entire UNIX
    interface:
  • Condor
  • MOSIX
  • GLUnix
  • Legion

16
These systems work, but...
  • They never cover all of the features.
  • e.g. memory-mapped files
  • They combine unrelated features.
  • e.g. checkpointing and remote file access
  • They are difficult to customize to new classes of
    applications
  • e.g. ORCA needs network access, remote stdio, but
    not full remote file access.

17
Our Vision
  • We want...
  • to create agents in a language independent of
    their (ugly) implementation.
  • to create simple agents that are small enough to
    be understood and debugged.
  • to compose simple agents together into larger
    agents that do no more (and no less) than what is
    needed.

18
Overview
  • Good news and bad news.
  • Our solution: Bypass
  • Three simple (but useful) examples
  • Problems
  • Impedance Matching
  • Composition
  • Related and Future Work

19
Our Solution: Bypass
  • Bypass takes a specification of a split execution
    system and produces a matched shadow and agent.
  • Building only an agent is a subset of this
    ability.
  • Bypass hides all of the ugly details of trapping,
    type conversion, and RPCs.

20
Bypass allows you to...
  • ...split any dynamically-linked application.
  • ...transparently use heterogeneous systems.
  • ...trap calls with minimal overhead.
  • ...control execution paths with plain C.
  • ...combine small agents in interesting ways.

21
Bypass Language
  • Declare what procedures to trap in C.
  • Annotate pointer types with data flow.
  • Direction: in, out, or in out
  • Binary data: give an expression yielding the number
    of bytes to send/receive.
  • Give two function bodies:
  • agent_action
  • shadow_action

22
ssize_t write( int fd, in "length" const void *data, size_t length )
agent_action
{
    if( fd<3 ) {
        return bypass_shadow_write(fd,data,length);
    } else {
        return write(fd,data,length);
    }
}
shadow_action
{
    return write(fd,data,length);
}

23
Agent Action
  • Any arbitrary C code.
  • When the program invokes write(), the
    agent_action is executed at the foreign machine.
  • Within the agent_action:
  • write() - Invoke the original write() at the
    foreign machine.
  • bypass_shadow_write() - Invoke the shadow_action
    via RPC.
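
The agent's routing decision can be sketched in plain C. This is not the code Bypass generates: the `struct agent`, `agent_write`, and the two stub functions are hypothetical names, and the stubs stand in for the original write() and for the RPC to the shadow so the sketch can run without a network.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

typedef ssize_t (*write_fn)(int fd, const void *data, size_t length);

/* Hypothetical agent state: where a local write and a shadow write go. */
struct agent {
    write_fn local_write;    /* the original write() on the foreign machine */
    write_fn shadow_write;   /* the RPC that runs shadow_action at home */
};

/* Same decision as the agent_action on the previous slide. */
ssize_t agent_write(const struct agent *a, int fd, const void *data, size_t length)
{
    assert(a != NULL && a->local_write != NULL && a->shadow_write != NULL);
    if (fd < 3)
        return a->shadow_write(fd, data, length);  /* stdin, stdout, stderr go home */
    return a->local_write(fd, data, length);       /* everything else stays local */
}

/* Stand-ins that only record that they were called. */
static int local_calls, shadow_calls;

static ssize_t stub_local(int fd, const void *data, size_t length)
{
    (void)fd; (void)data;
    local_calls++;
    return (ssize_t)length;
}

static ssize_t stub_shadow(int fd, const void *data, size_t length)
{
    (void)fd; (void)data;
    shadow_calls++;
    return (ssize_t)length;
}
```

With `struct agent a = { stub_local, stub_shadow };`, a call on descriptor 1 increments `shadow_calls`, and a call on descriptor 7 increments `local_calls`.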

24
Shadow Action
  • Any arbitrary C code.
  • If the agent decides to invoke the RPC to the
    shadow, the shadow_action is executed at the home
    machine.
  • Within the shadow_action:
  • write() - Invoke write() at the home machine.

25
Using Bypass
  • Run "bypass" to read the specification and
    produce C source code:
  • bypass -agent -shadow simple.bypass
  • The shadow is compiled into a plain executable.
  • The agent is compiled into a shared library.

26
Using Bypass
  • The dynamic linker is used to force the agent
    into an executable at run-time:
  • setenv LD_PRELOAD simple_agent.so
  • Procedure calls are trapped merely by putting
    the agent first in the link list.
  • This method can be used on any dynamically-linked
    program: tcsh, netscape, emacs
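
A minimal sketch of a preloadable wrapper, assuming a glibc-style dynamic linker. It is hand-written for illustration and is not what Bypass emits: it defines write() with the libc signature so callers bind to it first, then forwards to the next definition in link order using dlsym(RTLD_NEXT, ...). The counter `trapped_writes` is a hypothetical name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes RTLD_NEXT on glibc */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef RTLD_NEXT
/* Assumption: glibc's value, used only if an earlier include already
 * fixed the feature macros and hid the definition. */
#define RTLD_NEXT ((void *) -1L)
#endif

static unsigned long trapped_writes;   /* calls that passed through the wrapper */

/* Same name and signature as the libc function, so the dynamic linker
 * resolves callers to this definition when the agent is first in link order. */
ssize_t write(int fd, const void *data, size_t length)
{
    static ssize_t (*next_write)(int, const void *, size_t);

    if (next_write == NULL) {
        /* Look up the next definition of "write" after this object. */
        next_write = (ssize_t (*)(int, const void *, size_t))dlsym(RTLD_NEXT, "write");
        if (next_write == NULL) {
            errno = ENOSYS;
            return -1;
        }
    }
    trapped_writes++;
    return next_write(fd, data, length);
}
```

Built as a shared library (for example `cc -shared -fPIC agent.c -o simple_agent.so -ldl`, where `agent.c` is a placeholder file name), it can be loaded with the setenv line shown on the slide.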

27
Shadow Features
  • Multiple configurations:
  • One shadow, one agent
  • New process per incoming agent
  • New thread per incoming agent
  • Tracing of calls actually executed
  • Authentication:
  • Trivial: hostname
  • Secure: Globus GAA, X.509 identities

28
Bypass can be used by Real Users!
  • Bypass works on unmodified executables.
  • (Real Users are not willing/able to
    rewrite/recompile their programs.)
  • Bypass requires no special privileges.
  • (Real Users do not have the root password)
  • Thus, Bypass allows a Real User to make good use
    of a remote machine without begging the
    administrator to configure it to his/her needs.

29
Performance
  • Overhead of trapping a system call is very small:
    1-9 us
  • The "trapping mechanism" simply interposes a few
    extra function calls.
  • Small compared to the expense of a real system
    call (about 10-70 us)
  • Remote procedure calls are, as expected, much
    slower: about 1 ms under the best conditions.

30
Overview
  • Good news and bad news.
  • Our solution: Bypass
  • Three simple (but useful) examples
  • Problems
  • Impedance Matching
  • Composition
  • Related and Future Work

31
Example One: Remote Console
  • Trap only read and write, and send operations on
    standard files back to a single shadow process.

int read( int fd, out opaque length void *data, int length )
agent_action
{
    if( fd<3 )
        return bypass_shadow_read( fd, data, length );
    else
        return read( fd, data, length );
}
shadow_action
{
    return read( fd, data, length );
}

32
Remote Console
[Diagram: standard I/O reads and writes are sent to the Shadow, which sits above the Standard Lib and the Kernel on the Home Machine.]
33
Example Two: Attach New Filesystem
  • Trap standard I/O calls and replace them with
    calls to a user-level filesystem library, such as
    Globus GASS.

int open( in string const char *path, int flags, int mode )
agent_action
{
    return globus_gass_open( path, flags, mode );
}

int close( int fd )
agent_action
{
    return globus_gass_close( fd );
}

34
[Diagram: the Application attempts a plain POSIX open(). The POSIX to GASS Agent traps open and close. Globus GASS does a variety of system calls (open, read, write, close) at the Standard Library Layer for strong authentication, remote file access, caching, etc.]
35
Example Three: Instrumentation

agent_prologue
{
    static int bytes_read = 0;
    static int bytes_written = 0;
}

int read( int fd, out opaque length void *data, int length )
agent_action
{
    int result;
    result = read( fd, data, length );
    if( result>0 ) bytes_read += result;
    return result;
}

/* Definition for write is very similar */

36
Example Three: Instrumentation (Cont.)

int exit( int status )
agent_action
{
    printf( "NOTICE: %d bytes read, %d bytes written\n",
            bytes_read, bytes_written );
    exit(status);
}

37
[Diagram: the Application calls write, read, and exit; the Measurement Agent traps them and passes exit, write, and read down to the Standard Library Layer.]
38
Overview
  • Good news and bad news.
  • Our solution: Bypass
  • Three simple (but useful) examples
  • Problems
  • Impedance Matching
  • Composition
  • Related and Future Work

39
Problem One: Impedance Matching
  • An agent may not be able to transform operations
    from a layer above to a layer below.
  • Example: Globus GASS provides an equivalent for
    open() and close(), but not for stat().

40
Possible Solutions
  • Be honest.
  • Make stat() fail: "not supported"
  • Be evasive.
  • Find some way to serve the request indirectly.
  • Be dishonest.
  • Conjure up a complete lie about the file.

41
What to do?
  • We need not come up with a universally applicable
    solution: we are building small, interchangeable
    software.
  • Consider why the application uses stat():
  • to see if the file exists.
  • to test permission to access it.
  • to find out the best block size.
  • to get its size before creating a buffer.
  • to report meta-data to the user.

42
Should I be honest?
  • Cause stat() to fail: "not supported"
  • Occasionally works!
  • If the application only needs a hint such as
    block size, it might fall back on a default.
  • Example: sometimes a big malloc() calls mmap() to
    get a new segment. If that fails, it falls back on
    brk().
  • Fails in many contexts: "not supported" is often
    interpreted as "permission denied".

43
Should I be evasive?
  • Open the file, fstat() it, then close it.
  • Almost always preserves the correct semantics.
  • May break the application's assumptions.
  • stat() is assumed to be quite cheap.
  • open() through GASS or other storage system may
    incur huge delays as the entire file is pulled
    in.
  • In this example, GASS caches recently used files.
    This solution is good if the application only
    stat()s files it intends to read anyway.
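
The evasive approach can be sketched with plain POSIX calls. In a real agent the open() here would bind to the layer below (for example a GASS agent); the standard library stands in for it, and `evasive_stat` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Answer a stat() request by opening the file, calling fstat(), and closing it.
 * Returns 0 on success, or -1 with errno set by open() or fstat(). */
int evasive_stat(const char *path, struct stat *buf)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* errno from open() describes the failure */

    int result = fstat(fd, buf);
    int saved_errno = errno;    /* close() must not disturb the reported error */
    close(fd);
    errno = saved_errno;
    return result;
}
```

The cost concern on the slide is visible in the sketch: every request pays for a full open() and close(), which is cheap locally but may pull in an entire file through a remote storage layer.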

44
Should I be dishonest?
  • Return very permissive information:
  • read/write/execute by anyone
  • block size is 4K
  • owned by you
  • file is 4GB big
  • Almost always works!
  • (Not sufficient to implement ls -l)

45
Why is dishonesty the best policy?
  • The results from stat are (almost) universally
    used as hints.
  • First check permissions, then open.
  • First check size, then read data.
  • In both cases, the situation may change, so the
    application must check for error conditions
    anyway.

46
Problem Two: Composition
  • Bypass allows agents to be composed together:
    simply preload them all together.
  • How do procedure calls bind to procedure
    definitions?
  • Previous agent systems have proposed such rules,
    but do not explore their ramifications.

47
Rules of Composition
  • 1. The process maintains a pointer to an active
    layer. The topmost layer is the initial active
    layer.
  • 2. A call to a trapped procedure resolves to the
    highest definition found below the active layer.
  • 3. After resolving, but before invoking, the
    active layer is lowered to that of the callee.
    Before returning, the active layer is restored to
    that of the caller.
  • 4. Calls to untrapped procedures do not consult
    or change the active layer.
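
The four rules can be simulated in a few lines of C. This is a toy model, not the Bypass implementation: layers are entries in an array, trapped procedures are slots in a per-layer table, and all names (`call_trapped`, `build_demo_stack`, the PROC_ constants) are hypothetical. Rule 4 needs no code, since an untrapped procedure is an ordinary C call that never touches `active_layer`.

```c
#include <assert.h>
#include <stddef.h>

enum { PROC_READ, PROC_OPEN, PROC_COUNT };   /* trapped procedures in this sketch */
enum { MAX_LAYERS = 4 };

typedef int (*proc_fn)(int arg);

/* One layer: a name plus an optional definition for each trapped procedure. */
struct layer { const char *name; proc_fn defs[PROC_COUNT]; };

static struct layer layers[MAX_LAYERS];  /* index 0 is the topmost (application) layer */
static int layer_count;
static int active_layer;                 /* rule 1: starts at the topmost layer */

/* Rules 2 and 3: bind to the highest definition below the active layer,
 * lower the active layer for the duration of the call, then restore it. */
int call_trapped(int proc, int arg)
{
    assert(proc >= 0 && proc < PROC_COUNT);
    for (int i = active_layer + 1; i < layer_count; i++) {
        if (layers[i].defs[proc] != NULL) {
            int caller = active_layer;
            active_layer = i;
            int result = layers[i].defs[proc](arg);
            active_layer = caller;
            return result;
        }
    }
    return -1;   /* nothing below the caller defines this procedure */
}

static int measured_reads;

/* Measurement layer: counts the call, then passes it down. */
static int measure_read(int n) { measured_reads++; return call_trapped(PROC_READ, n); }

/* Stand-ins for the standard library layer. */
static int lib_read(int n) { return n; }
static int lib_open(int n) { return n + 100; }

/* Build the stack: application, measurement, standard library. */
void build_demo_stack(void)
{
    layer_count = 0;
    layers[layer_count++] = (struct layer){ "application", { NULL, NULL } };
    layers[layer_count++] = (struct layer){ "measurement", { measure_read, NULL } };
    layers[layer_count++] = (struct layer){ "stdlib",      { lib_read, lib_open } };
    active_layer = 0;
    measured_reads = 0;
}
```

After `build_demo_stack()`, a read issued by the application passes through the measurement layer exactly once, while an open goes straight to the library layer because the measurement layer does not trap it.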

48
Practical Interpretation
  • A layer is only capable of invoking those below
    it.
  • A layer can only be invoked by those above.
  • Why?
  • Strict layering creates order from chaos.
  • Without it, measurement is not possible.

49
Example: Measure above GASS
  • Notice: calls only propagate down.
  • Measurement layer only traps those operations
    actually attempted by the application.

[Diagram: Application Layer -> read, write, exit -> Measurement Layer -> open, read, write, close, exit -> Standard Library Layer]
50
Example: GASS above Measure
  • Again, calls only propagate down.
  • Measurement layer catches the resources consumed
    by both layers together.

[Diagram: Application Layer -> open, read, write, close, exit -> Standard Library Layer]
51
Example: Third Party Function
  • printf is a third party function: it is not
    trapped by a layer.
  • It contains a write, so where does it bind?
  • It binds to the layer below that of the caller.

[Diagram: Application Layer -> printf, write -> Agent Layer -> write -> Standard Library Layer]
52
Others Have Chosen Different Rules
  • Mediating Connectors:
  • Layer may invoke either the layer below, or start
    again at the topmost.
  • Disjoint layers may commute.
  • We disagree:
  • If you can re-invoke at top, it is not possible
    to build a sensible measuring agent.
  • Careful with "disjoint": GASS and measurement
    layers appear to be disjoint, but they do not
    commute.

53
A Layered Remote Execution System
[Diagram labels: Application; Measurement; POSIX to GASS; Remote I/O; Via RPC; Shadow; Measurement; Standard Lib and Kernel on the Remote Machine; Standard Lib and Kernel on the Home Machine]
54
Overview
  • Good news and bad news.
  • Our solution: Bypass
  • Three simple (but useful) examples
  • Problems
  • Impedance Matching
  • Composition
  • Related and Future Work

55
Related Work
  • Classic RPC and XDR:
  • Define standard integer sizes, endianness, etc.
  • Start by defining an external protocol, then produce
    a programming interface, which is not always
    convenient:
  • struct read_results * read_1( int fd, int length
    )

56
Related Work
  • Bypass:
  • We are stuck with existing interfaces, so
    annotate them to produce a protocol:
  • int read( int fd, out opaque length void *data,
    int length )
  • Do best-effort conversion to/from external data
    format:
  • off_t is 4 bytes on some platforms, 8 bytes on
    others.
  • A conversion might fail!
  • Define canonical values for source-level symbols:
  • O_CREAT has different values on Linux and Solaris!
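
One way to picture canonical values for source-level symbols: translate the local O_* bits into fixed wire values before sending, and back into the receiver's own O_* bits on arrival. The WIRE_ constants below are hypothetical and are not Bypass's actual encoding; only a handful of flags are covered.

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical canonical values, identical on every platform. */
enum {
    WIRE_WRONLY = 1 << 0,
    WIRE_RDWR   = 1 << 1,
    WIRE_CREAT  = 1 << 2,
    WIRE_TRUNC  = 1 << 3,
    WIRE_APPEND = 1 << 4
};

/* Sender side: local open() flags to canonical wire flags. */
int flags_to_wire(int flags)
{
    int wire = 0;
    int access = flags & O_ACCMODE;
    if (access == O_WRONLY) wire |= WIRE_WRONLY;
    if (access == O_RDWR)   wire |= WIRE_RDWR;
    if (flags & O_CREAT)    wire |= WIRE_CREAT;
    if (flags & O_TRUNC)    wire |= WIRE_TRUNC;
    if (flags & O_APPEND)   wire |= WIRE_APPEND;
    return wire;
}

/* Receiver side: canonical wire flags to this platform's open() flags. */
int flags_from_wire(int wire)
{
    /* A well-formed value names at most one access mode. */
    assert((wire & WIRE_WRONLY) == 0 || (wire & WIRE_RDWR) == 0);
    int flags = O_RDONLY;
    if (wire & WIRE_WRONLY) flags = O_WRONLY;
    if (wire & WIRE_RDWR)   flags = O_RDWR;
    if (wire & WIRE_CREAT)  flags |= O_CREAT;
    if (wire & WIRE_TRUNC)  flags |= O_TRUNC;
    if (wire & WIRE_APPEND) flags |= O_APPEND;
    return flags;
}
```

Each side is compiled against its own headers, so the numeric value of O_CREAT never crosses the wire.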

57
Related Work
  • Hunt and Brubacher, Detours
  • Trap library calls on NT using binary rewriting,
    which can be applied to any executable.
  • Make original procedure available through special
    trampoline call.
  • Bypass leaves the original entry point intact, so
    subroutines need not be re-written to use the
    trampoline.

58
Related Work
  • Alexandrov, et al., UFO
  • Use a kernel-level facility to trap all of a
    process's system calls and translate some of them
    into WWW operations.
  • The kernel mechanism is secure and can be applied
    to any process.
  • But it has a high (7x) trapping overhead and
    cannot be applied to procedures that are not true
    system calls.

59
Related Work
  • Bypass
  • Trapping overhead is very small and can be
    performed on procedures that are not necessarily
    system calls.
  • But can only be applied to dynamically-linked
    executables, and is not suitable as a security
    mechanism.

60
Related/Future Work
  • A complete remote execution system needs both
    methods:
  • The program owner provides a lightweight
    mechanism for creating a correct split execution
    environment.
  • The machine owner provides a heavyweight
    mechanism to defend itself from a (possibly)
    malicious program.

61
Complete System
Application
Via RPC
Agent
Shadow
Standard Lib
Standard Lib
Sandbox
Kernel
Kernel
Remote Machine
Home Machine
62
Our Contributions
  • A language for writing agents
  • Independent of implementation mechanism.
  • Correct mechanism depends on purpose.
  • Implicit binding
  • Agents name procedures, not other agents.
  • Original procedure entry point preserved.
  • Composition rules
  • Strict layering makes order from chaos.

63
Future Work
  • Interaction of sandbox and utility agents
  • A utility agent modifies the application's
    operations to make them acceptable to the
    sandbox.
  • Should they negotiate on permitted operations?
  • Signal handling
  • How to specify? (Many relevant functions)
  • Flow of control is backwards
  • Other implementations
  • Binary rewriting.
  • Build specialized linker that understands
    multiple definitions of symbols.

64
Further Questions?
  • Douglas Thain
  • thain@cs.wisc.edu
  • Miron Livny
  • miron@cs.wisc.edu
  • Bypass Web Page
  • http://www.cs.wisc.edu/condor/bypass
  • Questions now?