The Performance of Microkernel-Based Systems

Provided by: jegr
Learn more at: http://web.cecs.pdx.edu
Transcript and Presenter's Notes

1
The Performance of Microkernel-Based Systems
  • L4Linux

2
What is a microkernel?
  • a kernel that provides only address spaces,
    threads, and IPC
  • the kernel does not handle e.g. the file system or
    interrupts

3
Microkernel abstraction level
  • some researchers feel abstraction level is too
    high
  • kernel should map more directly to hardware
  • some researchers feel abstraction is too low
  • focus on extensibility (Mach)

4
L4
  • so-called 2nd-generation microkernel
  • built from scratch, as opposed to being developed
    from earlier monolithic kernel approaches (e.g. Mach)

5
L4 essentials
  • threads, address spaces and cross-address-space
    communication (IPC)
  • other operations, e.g. RPC and
    address-space-crossing thread migration, are built
    up from IPC primitives
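
As a rough illustration of how RPC can be layered on plain send and receive operations, the sketch below pairs a request message with a blocking wait for the reply. Everything here (Endpoint, send, receive, call, serve_once, rpc_demo) is a hypothetical name invented for the example, not the L4 interface, and the buffered queue is a simplifying assumption.

```python
import queue
import threading


class Endpoint:
    """Hypothetical stand-in for a thread's IPC mailbox."""

    def __init__(self):
        self.inbox = queue.Queue()


def send(dest, msg):
    # Assumption: buffered delivery; L4 IPC itself is synchronous.
    dest.inbox.put(msg)


def receive(me):
    return me.inbox.get()  # blocks until a message arrives


def call(client, server, request):
    """RPC = send the request, then wait for the reply."""
    send(server, (client, request))
    return receive(client)


def serve_once(server, handler):
    """Server side: receive one request and send the result back."""
    sender, request = receive(server)
    send(sender, handler(request))


def rpc_demo():
    client, server = Endpoint(), Endpoint()
    worker = threading.Thread(target=serve_once, args=(server, lambda x: x * 2))
    worker.start()
    reply = call(client, server, 21)
    worker.join()
    return reply
```

Calling rpc_demo() runs one request through the server thread and returns the handler's result to the caller.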

6
Address spaces
  • recursive construction
  • granting, mapping, unmapping
  • a page owner can grant or map any of its pages to
    another address space with the receiver's
    permission. A mapped page is then accessible in
    both address spaces; a granted page is removed
    from the granter's address space
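
The recursive construction can be sketched with a toy model in which every address space remembers where each page came from and who received it. The class and method names below are assumptions made for illustration and do not reflect the real L4 system calls; pages are identified by a plain number.

```python
class AddressSpace:
    """Toy model of an L4-style address space (not the real L4 API)."""

    def __init__(self, name, pages=()):
        self.name = name
        # page -> space we received it from (None = original owner)
        self.source = {p: None for p in pages}
        # page -> list of spaces that received this page from us
        self.derived = {}

    def has(self, page):
        return page in self.source

    def map(self, page, receiver, receiver_agrees=True):
        """Share a page; afterwards both spaces can access it."""
        if not self.has(page) or receiver.has(page) or not receiver_agrees:
            return False
        receiver.source[page] = self
        self.derived.setdefault(page, []).append(receiver)
        return True

    def grant(self, page, receiver, receiver_agrees=True):
        """Hand a page over; the granter loses access to it."""
        if not self.map(page, receiver, receiver_agrees):
            return False
        parent = self.source.pop(page)
        kids = self.derived.pop(page)
        kids.remove(receiver)
        # The receiver takes the granter's place in the mapping tree.
        receiver.source[page] = parent
        for kid in kids:
            kid.source[page] = receiver
        if kids:
            receiver.derived[page] = kids
        if parent is not None:
            siblings = parent.derived[page]
            siblings[siblings.index(self)] = receiver
        return True

    def unmap(self, page):
        """Recursively revoke the page from every space derived from ours."""
        for child in self.derived.pop(page, []):
            child.unmap(page)
            del child.source[page]
```

In this model, unmapping a page in one space also removes it from every space that obtained it, directly or indirectly, from that space, while the unmapping space keeps its own access.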

7
Address spaces (cont)
  • only the grant, map, and unmap operations are
    implemented in the kernel
  • user-level pagers handle page faults

8
Interrupts, exceptions, and traps
  • all handled at user level
  • interrupts are transformed by the kernel into IPC
    messages and sent to the appropriate user-level
    thread
  • exceptions and traps are synchronous to the
    thread that raises them, and the kernel mirrors
    them to that thread

9
Implementation of L4Linux
  • develop L4Linux, a Linux personality on top of
    the L4 microkernel
  • due to time restrictions the Linux kernel was not
    fine-tuned in L4Linux, so the results are only an
    upper bound on the performance penalty

10
L4Linux
  • Linux 2.0.21 on top of L4
  • the Linux kernel runs as a user-level server
  • 100% binary compatible
  • modified versions of the shared C library libc.so
    and libc.a
  • user-level 'trampoline' exception handler
  • 14 engineer months and 6500 rewritten lines out
    of a total of 340,000

11
Trampoline
  • 100% binary compatible means that a program
    statically linked against the native Linux
    library must run, unmodified, on L4Linux
  • the trampoline 'bounces' the system-call trap
    that on native Linux went into the kernel back
    into the modified shared library on L4Linux
  • the microkernel upcalls into a user-level handler;
    the handler then makes an RPC (which invokes the
    kernel again) to the OS personality to perform
    the system call
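
The two system-call paths described on this slide can be traced with a small simulation. This is only a sketch under stated assumptions: LinuxServer, Microkernel, Task and their methods are hypothetical names, and the real mechanism works on hardware traps rather than Python method calls.

```python
class LinuxServer:
    """Stand-in for the user-level Linux server (hypothetical interface)."""

    def __init__(self):
        self.pids = {}

    def rpc(self, task, syscall):
        if syscall == "getpid":
            return self.pids[task.name]
        raise NotImplementedError(syscall)


class Microkernel:
    """Does not implement system calls; it only reflects the trap."""

    def trap(self, task, syscall):
        task.path.append("microkernel: upcall to user-level handler")
        return task.trampoline(syscall)


class Task:
    def __init__(self, name, kernel, server, modified_libc):
        self.name, self.kernel, self.server = name, kernel, server
        self.modified_libc = modified_libc
        self.path = []  # records each hop of the system call

    def libc_stub(self, syscall):
        self.path.append("library: RPC to Linux server")
        return self.server.rpc(self, syscall)

    def trampoline(self, syscall):
        self.path.append("trampoline: bounce into library code")
        return self.libc_stub(syscall)

    def getpid(self):
        if self.modified_libc:
            # Binary linked against the modified shared library.
            return self.libc_stub("getpid")
        # Unmodified binary: the system-call trap reaches the microkernel.
        return self.kernel.trap(self, "getpid")
```

Both paths return the same result; the unmodified binary simply records two extra hops, which is one way to picture why the trampoline path is the slower of the two.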

12
L4Linux (cont)
  • L4 maps the entire initial address space to the
    Linux kernel server
  • a single L4 thread acts as a single virtual
    processor for the Linux server
  • the Linux server occupies a small memory region,
    which uses the Pentium's segment feature to
    protect its TLB entries, so the TLB always holds
    the Linux server's translations
    (small-address-space optimization)

13
L4Linux (cont)
  • L4 allows user-level processes to disable
    interrupts, so the uniprocessor version of Linux
    did not need its critical sections modified

14
L4Linux (cont)
  • interrupt threads have a priority above the
    server itself, so they don't execute concurrently
  • signals are forwarded to a co-located signal
    handler inside each user process, since only a
    thread in the same address space can manipulate
    another thread's state

15
L4Linux (cont)
  • scheduling is mostly done by the L4 scheduler
  • four priority levels: top-half interrupts,
    bottom-half interrupts, the Linux server, user
    processes. No priority decay.
  • so L4 interrupts the Linux server in the same way
    the hardware would interrupt a native Linux kernel
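
A minimal sketch of the static four-level scheme, assuming a simple "highest ready priority wins" rule. The numeric values and the names Prio, pick_next, and preempts are illustrative assumptions; only the ordering of the four levels comes from the slide.

```python
from enum import IntEnum


class Prio(IntEnum):
    # Only the ordering is taken from the slide; the numbers are arbitrary.
    USER = 0
    LINUX_SERVER = 1
    BOTTOM_HALF = 2
    TOP_HALF = 3


def pick_next(ready):
    """Return the (priority, name) pair with the highest static priority."""
    return max(ready, key=lambda thread: thread[0]) if ready else None


def preempts(incoming, running):
    """With no priority decay, a higher static level always preempts."""
    return incoming > running
```

For example, pick_next([(Prio.USER, "sh"), (Prio.LINUX_SERVER, "linux"), (Prio.TOP_HALF, "disk-irq")]) selects the interrupt thread, which mirrors how an interrupt thread takes over from the Linux server.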

16
L4Linux (cont)
  • user level schedulers can dynamically change
    priority and time slice of any thread?

17
Experiments
  • micro- and macrobenchmarks used to compare
    native Linux and MkLinux (Mach-derived variant)
    to L4Linux
  • Linux vs L4Linux demonstrates the performance
    penalty for using a microkernel
  • L4Linux vs MkLinux demonstrates the influence of
    the microkernel on the overall system, including
    the influence of colocation
  • extensibility experiments
  • functionality specialized for L4Linux

18
Performance: L4Linux, MkLinux and Linux
  • microbenchmarks
  • getpid: L4Linux is 2.4 or 3.4 times slower than
    Linux; MkLinux is 3.9 or 28 times slower than
    L4Linux
  • lmbench and hbench: L4Linux is 1 to 3 times slower
    than Linux; MkLinux is 1 to 32 times slower than
    L4Linux

19
Performance (cont): L4Linux, MkLinux and Linux
  • macrobenchmarks
  • recompiling the Linux server: L4Linux is 6-7% slower
    than Linux; MkLinux is 10-20% slower than L4Linux
  • AIM multiuser benchmark suite: job throughput in
    L4Linux is 7-8% lower than in Linux; in MkLinux it
    is 30-52% lower than in L4Linux

20
Conclusions
  • At application level there is a 5-10%
    performance penalty for using L4Linux vs bare
    Linux
  • The particular microkernel used matters
  • Colocation is secondary to microkernel
    implementation

21
Extensibility
  • Can we add services outside L4Linux to improve
    performance by specializing Unix functionality?
  • Can we improve certain applications by using
    native microkernel mechanisms in addition to the
    classical API?
  • Can we achieve high performance for
    non-classical, Unix-incompatible systems
    coexisting with L4Linux?

22
Pipes and RPC

23
Virtual Memory
  • measured a user-level page fault that maps a page
    from one address space to another (not available
    on Unix)
  • measured traps and two different trap, protect,
    unprotect patterns, which performed on average 4
    times faster than on native Linux

24
Cache Partitioning
  • A user-level main-memory manager can coordinate
    with L4 to allocate specific L2 cache pages to
    particular tasks
  • matrix multiplication example with a four times
    speed-up of worst-case performance
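
One common way to realize this kind of partitioning is page coloring, sketched below. The cache geometry (256 KB, direct-mapped, 4 KB pages, physically indexed) and the function names are assumptions chosen for the example, not figures taken from the slides.

```python
def num_colors(cache_size, ways, page_size):
    """Number of distinct page-sized regions (colors) in the cache."""
    return cache_size // (ways * page_size)


def color_of(frame_number, colors):
    """Frames with the same color compete for the same cache region."""
    return frame_number % colors


def frames_for(allowed_colors, frame_numbers, colors):
    """Physical frames a memory manager could hand to one task."""
    return [f for f in frame_numbers if color_of(f, colors) in allowed_colors]


# Assumed geometry: 256 KB direct-mapped cache with 4 KB pages.
COLORS = num_colors(256 * 1024, 1, 4096)
```

A task that only ever receives frames of colors {0, 1} can only occupy those two page-sized regions of the cache, so other tasks cannot evict its lines as long as they are given frames of different colors.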

25
Possible Alternatives
  • Protected Control Transfers
  • Grafting