The Performance of Microkernel-Based Systems

Slides: 26

Provided by: jegr

Learn more at: http://web.cecs.pdx.edu

Transcript and Presenter's Notes

1
The Performance of Microkernel-Based Systems

L4Linux

2
What is a microkernel

kernels that only provide address spaces,
threads, and IPC
kernel does not handle e.g. the file system or
interrupts

3
Microkernel abstraction level

some researchers feel abstraction level is too
high
kernel should map more directly to hardware
some researchers feel abstraction is too low
focus on extensibility (Mach)

4
L4

so-called 2nd generation microkernel
built from scratch, as opposed to being developed from
earlier monolithic kernel approaches (e.g. Mach)

5
L4 essentials

threads, address spaces and cross-address-space
communication (IPC)
other kernel operations e.g. RPC and
address-space crossing thread migration are built
up from IPC primitives

6
Address spaces

recursive construction
granting, mapping, unmapping
a page owner can grant or map any of its pages to
another address space with the receiver's
permission. A mapped page is then accessible to both
address spaces, while a granted page is removed from
the granter's address space
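
The three operations on this slide can be illustrated with a small toy model. This is only a sketch, not L4 code: the class `AddressSpace`, its method names, and the frame labels are hypothetical, and real L4 works on hardware page tables rather than dictionaries.

```python
class AddressSpace:
    """Toy address space: a dict from virtual page number to a frame label."""

    def __init__(self, name):
        self.name = name
        self.pages = {}    # vpn -> frame
        self.derived = {}  # vpn -> list of (receiver, receiver_vpn) handed out

    def map_to(self, vpn, receiver, receiver_vpn):
        """Map: the page stays here and also becomes visible in the receiver."""
        receiver.pages[receiver_vpn] = self.pages[vpn]
        self.derived.setdefault(vpn, []).append((receiver, receiver_vpn))

    def grant_to(self, vpn, receiver, receiver_vpn):
        """Grant: the page moves to the receiver and leaves this space."""
        receiver.pages[receiver_vpn] = self.pages.pop(vpn)
        # Mappings already handed out now hang off the receiver's copy.
        receiver.derived[receiver_vpn] = self.derived.pop(vpn, [])

    def unmap(self, vpn):
        """Unmap: recursively revoke every mapping derived from this page."""
        for receiver, receiver_vpn in self.derived.pop(vpn, []):
            receiver.unmap(receiver_vpn)
            receiver.pages.pop(receiver_vpn, None)


# Recursive construction: root -> pager -> app
root, pager, app = AddressSpace("root"), AddressSpace("pager"), AddressSpace("app")
root.pages = {0: "frame0", 1: "frame1"}
root.map_to(0, pager, 10)    # frame0 visible in root and pager
pager.map_to(10, app, 20)    # ... and, through pager, in app
root.grant_to(1, pager, 11)  # frame1 now belongs to pager only
root.unmap(0)                # revokes frame0 from pager and app
```

After the final `unmap`, `root` still holds `frame0`, while both spaces that received it directly or indirectly have lost it.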

7
Address spaces (cont)

only grant, map, and unmap are implemented in
the kernel
user-level pagers handle page faults
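
A minimal sketch of the idea that page faults are resolved outside the kernel. The `Pager` class, the `access` helper, and the frame names are assumptions made for illustration; the kernel's role is reduced here to forwarding the fault and installing whatever mapping the pager replies with.

```python
class Pager:
    """Hypothetical user-level pager that owns a pool of frames."""

    def __init__(self, frames):
        self.free = list(frames)
        self.table = {}  # (task, vpn) -> frame chosen by the pager's policy

    def handle_fault(self, task, vpn):
        """Answer a page-fault message by choosing the frame to map."""
        key = (task, vpn)
        if key not in self.table:
            if not self.free:
                raise MemoryError("pager has no free frames")
            self.table[key] = self.free.pop(0)
        return self.table[key]


def access(pager, mappings, task, vpn):
    """Model one memory access. On a miss, the fault is sent to the pager
    as a message and the mapping it replies with is installed."""
    if (task, vpn) not in mappings:
        mappings[(task, vpn)] = pager.handle_fault(task, vpn)
    return mappings[(task, vpn)]
```

The allocation policy lives entirely in `Pager.handle_fault`, so a different pager could implement a different policy without any change to `access`.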

8
Interrupts, exceptions, and traps

all handled at user level
interrupts are transformed, by the kernel, into IPC
messages and sent to the appropriate user level
thread
exceptions and traps are synchronous to the
associated thread, and the kernel mirrors them to that
thread
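
The interrupt path can be sketched as a message send. Everything below is a toy model with hypothetical names (`Kernel`, `Thread`, `attach`, `raise_irq`); it only shows that the kernel forwards the event and leaves the handling to a user level thread.

```python
from collections import deque


class Thread:
    """User level thread with a simple message inbox."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()

    def receive(self):
        """Return the next pending message, or None if there is none."""
        return self.inbox.popleft() if self.inbox else None


class Kernel:
    """Toy kernel: knows which thread is attached to which interrupt line."""

    def __init__(self):
        self.irq_owner = {}

    def attach(self, irq, thread):
        self.irq_owner[irq] = thread

    def raise_irq(self, irq):
        # The kernel does not interpret the interrupt; it only sends a message.
        self.irq_owner[irq].inbox.append(("irq", irq))
```

A driver thread would sit in a loop calling `receive()` and run its device-specific handler for each `("irq", n)` message.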

9
Implementation of L4Linux

develop L4Linux, a linux personality on top of
the L4 microkernel
due to time restrictions the linux kernel was not
fine tuned in L4Linux, so results are only an
upper bound on the performance penalty

10
L4Linux

linux 2.0.21 on top of L4
linux kernel is a user level server
100% binary compatible
modified versions of shared C library libc.so and
libc.a
user level trampoline exception
14 engineer months and 6500 rewritten lines out
of a total of 340,000

11
Trampoline

100% binary compatible means that a program
statically linked against the native linux
library must run, unmodified, on L4Linux
the trampoline bounces the system-call trap
that on native linux went into the kernel back
into the modified shared library on L4Linux
Microkernel upcalls into user level handler,
handler then makes an RPC (read, invokes kernel
again) to OS personality to invoke system call
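
The two system-call paths described on this slide can be traced with a short simulation. All function names and the returned pid are hypothetical; the sketch only records which component handles the call at each step.

```python
trace = []


def linux_server_rpc(name, args):
    """Stand-in for the Linux server (the OS personality)."""
    trace.append("linux server: " + name)
    handlers = {"getpid": lambda: 42}  # 42 is an arbitrary illustrative pid
    return handlers[name](*args)


def emulation_library(name, args):
    """Modified shared library: turns the system call into an RPC."""
    trace.append("emulation library: RPC to linux server")
    return linux_server_rpc(name, args)


def trampoline(name, args):
    """User level handler: bounces the trap into the emulation library."""
    trace.append("trampoline: bounce into library")
    return emulation_library(name, args)


def microkernel_trap(name, args):
    """The microkernel reflects the native trap back to user level."""
    trace.append("microkernel: reflect trap to user level")
    return trampoline(name, args)


def unmodified_binary():
    """Statically linked program: still executes the native trap."""
    return microkernel_trap("getpid", ())


def relinked_binary():
    """Program using the modified shared library: calls it directly."""
    return emulation_library("getpid", ())
```

Calling `unmodified_binary()` adds four entries to `trace`, while `relinked_binary()` adds two, which illustrates why the trampoline path involves extra kernel entries.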

12
L4Linux (cont)

L4 maps the entire initial address space to the
kernel server
a single thread in L4 acts as a single virtual
processor to the linux server
Linux server occupies a small memory region,
which utilizes the Pentium's segmentation feature to
protect its TLB entries, so the TLB always has
the linux server's translations
(small-address-space optimization)

13
L4Linux (cont)

L4 allows user level processes to disable
interrupts, so uniprocessor version of linux did
not need modification of critical sections

14
L4Linux (cont)

interrupt threads have a priority above the
server itself, so they don't execute concurrently
signals are forwarded to a co-located signal
handler inside each user process, since only a
thread in the same address space can manipulate
another thread's state

15
L4Linux (cont)

scheduling is mostly done by L4 scheduler
Four priority levels: top-half interrupts,
bottom-half interrupts, the linux server, user
processes. No priority decay.
so L4 interrupts the linux server in the same way
the hardware would interrupt a native linux kernel
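
A minimal sketch of static-priority selection with the four levels listed on this slide, assuming they are ordered from most to least urgent as listed. The constant names and `pick_next` are hypothetical; the real L4 scheduler also handles time slices, which are left out here.

```python
# Lower number = more urgent. Priorities are static (no decay).
TOP_HALF, BOTTOM_HALF, LINUX_SERVER, USER = 0, 1, 2, 3


def pick_next(ready):
    """ready: list of (priority, name) pairs; return the name to run next.

    Ties go to the thread that appears first in the list.
    """
    if not ready:
        return None
    return min(ready, key=lambda t: t[0])[1]


ready = [(USER, "shell"), (LINUX_SERVER, "linux"), (TOP_HALF, "irq14")]
first = pick_next(ready)     # the interrupt thread preempts the server
ready.remove((TOP_HALF, "irq14"))
second = pick_next(ready)    # then the Linux server runs before user code
```

Because an interrupt thread always outranks the server, the server is interrupted much as a native kernel would be interrupted by hardware.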

16
L4Linux (cont)

user level schedulers can dynamically change
priority and time slice of any thread?

17
Experiments

micro- and macro- benchmarks used to compare
native linux and MkLinux (Mach derived variant)
to L4Linux
linux vs L4Linux demonstrates performance penalty
for using microkernel
L4Linux vs MkLinux demonstrates influence of
microkernel on overall system including the
influence of colocation
extensibility experiments
functionality specialized for L4Linux

18
Performance: L4Linux, MkLinux and Linux

microbenchmarks
getpid: L4Linux is 2.4 or 3.4 times slower than
linux; MkLinux is 3.9 or 28 times slower than
L4Linux
lmbench and hbench: L4Linux is 1 to 3 times slower
than linux; MkLinux is 1 to 32 times slower than
L4Linux

19
Performance (cont): L4Linux, MkLinux and Linux

macrobenchmarks
recompiling linux server: L4Linux is 6-7% slower
than linux; MkLinux is 10-20% slower than L4Linux
AIM multiuser benchmark suite: job throughput in
L4Linux is 7-8% lower than linux; MkLinux is
30-52% lower than L4Linux

20
Conclusions

At application level there is a 5-10%
performance penalty for using L4Linux vs bare
linux
The particular microkernel used matters
Colocation is secondary to microkernel
implementation

21
Extensibility

Can we add services outside L4Linux to improve
performance by specializing Unix functionality?
Can we improve certain applications by using
native microkernel mechanisms in addition to the
classical API?
Can we achieve high performance for
non-classical, Unix-incompatible systems
coexisting with L4Linux?

22
Pipes and RPC

23
Virtual Memory

measured a user level page fault that maps a page
from one address space to another (not available
on Unix)
measured traps and two different trap, protect,
unprotect patterns, which performed on average 4
times faster than on native linux

24
Cache Partitioning

User level main-memory manager can coordinate
with L4 to allocate specific L2 cache pages to
certain processes
matrix multiplication example with a four times
speed-up of worst case performance
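
One common way to reserve parts of a cache from a memory manager is page coloring. The slides do not say this is the exact mechanism used, so the sketch below is an assumption, as are the function names and the cache parameters in the example (a physically indexed 256 KiB direct-mapped cache with 4 KiB pages).

```python
def num_colors(cache_bytes, associativity, page_bytes):
    """Number of distinct cache 'colors' a physical page can have."""
    return cache_bytes // (associativity * page_bytes)


def color_of(frame_number, colors):
    """Frames with the same color compete for the same cache lines."""
    return frame_number % colors


def frames_for_partition(frames, colors, allowed_colors):
    """Frames a memory manager could hand out for a given cache partition."""
    return [f for f in frames if color_of(f, colors) in allowed_colors]


# Hypothetical parameters: 256 KiB direct-mapped cache, 4 KiB pages.
colors = num_colors(256 * 1024, 1, 4096)
reserved = frames_for_partition(range(130), colors, {0})
```

If only `reserved` frames are handed to one consumer and never to others, that consumer's data cannot be evicted from its part of the cache by anyone else, which is what bounds the worst case.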