1
On µ-Kernel Construction

Jochen Liedtke
15th ACM Symposium on Operating Systems Principles
(1995)
Presented by ???

2
References

Micro-kernels, presented by Arun Krishnamurthy,
COP 5611, University of Central Florida
On µ-Kernel Construction, Jochen Liedtke, 15th
ACM Symposium on Operating Systems Principles
(1995), presented by ???
Microkernels (Liedtke Papers), Kenneth Chiu

3
Abstract

µ-kernels are superior to large integrated kernels
from a software-technology point of view
But µ-kernels are widely believed to be inefficient
and inflexible
This paper argues that µ-kernels can be efficient and
flexible
It also presents the concepts that a µ-kernel must
implement

4
Contents

Definition of Kernel/microkernel
Software technological advantages
µ-kernel concepts
Flexibility
Performance: switching overhead
Performance: memory effects
Non-portability
Conclusion

5
Definition of Kernel

The fundamental part of an operating system.
Responsible for providing secure access to the
machine's hardware for various programs.
Responsible for deciding when and for how long a
program can use a given piece of hardware
(multiplexing).
Source: Wikipedia.org

6
Definition of Kernel
Figure 1 Diagram of Linux Kernel
7
Definition of Microkernel

A kernel design that provides only the minimum
OS services:
Address spaces
Inter-process Communication (IPC)
Thread Management
Unique Identifiers
All other services are implemented outside the kernel.

8
Definition of Microkernel
Figure 2 Diagram of Microkernel
9
Software technological advantages

A clear µ-kernel interface enforces a more
modular system structure
Servers can use the mechanisms provided by the
µ-kernel like any other user program.
So a server malfunction is as isolated as any other
user program's malfunction
The system is more flexible and tailorable.
Different strategies and APIs, implemented by
different servers, can coexist in the system

10
Chorus Architecture
Figure 8 Chorus Architecture
11
Chorus Nucleus

Supervisor
Dispatches traps, interrupts, and exceptions
delivered by hardware.
Real Time Executive
Controls allocation of processes and provides
pre-emptive scheduling
Virtual Memory Manager
Manipulates VM hardware and memory resources.
IPC
Provides message exchange and Remote Procedure
Calls (RPC).

12
Chorus Nucleus
Figure 9 The Chorus Nucleus
13
Thesis?

Microkernels can have adequate performance if
they are architecture-dependent.
They increase portability, though they themselves
are non-portable. (Why is non-portable faster?)
In this sense, they can be seen as a hardware
abstraction layer.
Previous microkernels perform poorly because they
derive from monolithic kernels. (Why does this
make a difference?)

14
Previous Critiques

A number of previous critiques have shown
microkernels to be inefficient.
However, they measure performance without
analyzing the reason. (Why is this bad?)
Liedtke contends that this is a crucial error.
The question is: is the inefficiency inherent, or is
it the implementation?

15
What Goes Into the Kernel?

Only determined by function, not performance.
Key issue is protection.
Principle of independence
Servers can't stomp on each other.
Principle of integrity
Communication channels can't be interfered with.
Usually the following go into the kernel:
address spaces (Why?)
IPC (Why?)
basic scheduling (Why?)
Distinguish between kernel and trusted
Can you trust a user-level process?
Can you have untrusted kernel code?
Suppose you had an architecture that did not
allow I/O except in supervisor mode. Could you
put device drivers in the kernel?
Radical differences in hardware may have
pervasive impacts.

16
Concepts
17
Address spaces

The µ-kernel has to hide the hardware concept of
address spaces, since otherwise, implementing
protection would be impossible
The µ-kernel concept of address spaces must be
tamed, but must permit the implementation of
arbitrary protection schemes on top of the
µ-kernel

18
Address spaces (Cont)

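The µ-kernel just provides mechanisms to support the
recursive construction of address spaces outside
the kernel
The µ-kernel provides three operations for this:
Grant
The owner of an address space can grant any of
its pages to another space; the page is removed
from the granter's space
Map
The owner of an address space can map any of its
pages into another space; the page is then
accessible in both spaces
Flush
The owner of an address space can flush any of
its pages; the page stays with the flusher but is
removed from every space that received it,
directly or indirectly, from the flusher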
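A minimal sketch of how grant, map, and flush could behave, assuming the semantics described in Liedtke's paper (grant moves a page, map shares it, flush revokes everything derived from it). The class names `Mapping` and `AddressSpace`, the method `add_initial`, and the page/frame numbers are hypothetical and only serve the illustration; a real kernel would also check permissions and handle an occupied target page.

```python
class Mapping:
    """One page mapping; children are mappings derived from it via map()."""

    def __init__(self, space, vpage, frame):
        self.space, self.vpage, self.frame = space, vpage, frame
        self.children = []


class AddressSpace:
    def __init__(self, name):
        self.name = name
        self.pages = {}  # virtual page number -> Mapping

    def add_initial(self, vpage, frame):
        """Hypothetical helper: give the initial space a physical frame."""
        self.pages[vpage] = Mapping(self, vpage, frame)

    def map(self, vpage, target, target_vpage):
        """Share a page: it stays here and also becomes visible in target."""
        parent = self.pages[vpage]
        child = Mapping(target, target_vpage, parent.frame)
        parent.children.append(child)
        target.pages[target_vpage] = child

    def grant(self, vpage, target, target_vpage):
        """Hand a page over: it disappears here and appears in target."""
        node = self.pages.pop(vpage)
        node.space, node.vpage = target, target_vpage
        target.pages[target_vpage] = node

    def flush(self, vpage):
        """Revoke the page from every space that derived it from this one."""
        node = self.pages[vpage]
        for child in node.children:
            child.space._remove(child)
        node.children = []

    def _remove(self, node):
        # Remove a derived mapping and, recursively, everything below it.
        for child in node.children:
            child.space._remove(child)
        node.children = []
        del self.pages[node.vpage]
```

For example, if an initial space maps a page to a pager and the pager maps it on to an application, a single `flush` by the initial space removes the page from both, while the initial space keeps it.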

19
Address spaces (Cont)
20
I/O

Recall that I/O can be done either with special
ports, or memory-mapped.
Incorporated into the address space
Natural for memory-mapped I/O
PowerPC
Also works on I/O ports
x86 permits control per port, but no mapping
Don't confuse memory-mapped I/O with
memory-mapped files.

21
Threads

The µ-kernel must include the thread concept
Threads access memory, and the µ-kernel manages memory
An address space (AS) is associated with each thread.
How is this different from the previous definition of
a process?
Changes to a thread's AS are handled by the kernel. (Why?)

22
IPC

IPC always enforces a certain agreement between
both parties of communication
IPC is not only the basic concept for
communication between subsystems but also,
together with address spaces, the foundation of
independence
Note that grant and map operations need IPC, since
they require an agreement between the granter or
mapper and the recipient

23
Interrupts

The natural abstraction for hardware interrupt is
the IPC message
The hardware is regarded as a set of threads
which have special thread ids and send empty
messages (only consisting of the sender id)
Interrupts modeled as empty messages.
Driver thread:
  do
    wait for (msg, sender)
    if sender = my hardware interrupt
      then read/write I/O ports
           reset hardware interrupt
      else ...
    fi
  od
If interrupt cannot be reset from user-mode,
kernel will do it automagically as a special case.
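The driver loop can be sketched in runnable form. This is only an illustration under stated assumptions: a `queue.Queue` stands in for the kernel's IPC receive operation, `IRQ_THREAD_ID` is a made-up identifier for the "hardware thread", and `handle_device` is a placeholder for the port accesses and the interrupt reset. The loop is bounded by `max_messages` so that it terminates.

```python
import queue

# Hypothetical thread id that the kernel would assign to one interrupt line.
IRQ_THREAD_ID = "irq-14"


def driver_loop(inbox, handle_device, max_messages):
    """Receive up to max_messages and service those sent by the interrupt.

    inbox holds (sender, msg) pairs; an interrupt arrives as an empty
    message (msg is None) whose sender is IRQ_THREAD_ID.
    Returns the number of interrupts serviced.
    """
    serviced = 0
    for _ in range(max_messages):
        sender, msg = inbox.get()      # "wait for (msg, sender)"
        if sender == IRQ_THREAD_ID:
            handle_device()            # read/write I/O ports, reset interrupt
            serviced += 1
        else:
            pass                       # other requests: elided in the slide
    return serviced
```

The point of the design is that the driver needs no special interrupt API: it uses the same receive operation as for any other message and only inspects the sender id.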

24
Unique identifiers

A µ-kernel must supply unique identifiers for
something, either for threads or tasks or
communication channels

25
Flexibility
26
Flexibility

On top of the µ-kernel, various OS
functionalities can be implemented
Some examples
Memory manager
Pager
Real-time resource allocation
Device driver
Second level cache and TLB
Remote communication
Unix server

27
Memory Manager

The server managing the initial AS σ0 is a classical
main memory manager.
Memory managers can be stacked
M0 maps or grants one part to M1, other parts to
M2.
What might be the benefit of a user-level memory
manager?

28
Pager

Good for hard/soft real-time. (Why?)
Prefetching
Pinning
Application-level pagers. (Why useful?)
Profile directed optimization
How does it affect security?
Remember that user-level can be trusted, also.

29
Device Driver

Usually I/O should be in supervisor mode, why are
we changing that?
Windows XP stability

30
L2 and TLB

Handling misses
Improve hit ratios
Why first-level TLB in kernel?
Is this a violation of making decisions based on
function only, not performance?

31
Remote Communications

Can be done outside of kernel.
If zero-copy is desired, the communication
server can act as a special pager for the client.

32
UNIX Server

Kernel interface implemented by server.
Similar to Windows NT/2K/XP.
NT is actually supposed to be POSIX compliant.

33
Performance
34
Kernel-User Switches

Some papers reported high overheads
These results include excessive kernel-call
overhead
The additional overhead of a µ-kernel comes from
frequently entering and exiting the kernel
Calling the kernel from user mode is simply an
indirect call, complemented by a stack switch and
setting the internal kernel-bit to permit
privileged operations

35
Kernel-User Switches

Chen and Bershad measured 18 microseconds per
Mach call.
On a 50 MHz 486, that is about 900 cycles.
The bare machine instruction for entering the
kernel takes 71 cycles, followed by 36 cycles for
returning to user mode.
Switch stacks
Push/pop flag register and PC
Thus 107 cycles is a lower bound. What is going on
in the other ~800 cycles? Is it necessary?
L3 has 15-57 cycles of overhead. (Why the variation?)
What is the conclusion? Inherent or
implementation?
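The cycle counts on this slide follow from simple arithmetic, which can be checked directly. The constant names are made up for the illustration; the numbers are the ones quoted on the slide.

```python
CLOCK_HZ = 50_000_000        # 50 MHz 486
MACH_CALL_SECONDS = 18e-6    # measured time per Mach kernel call
ENTER_CYCLES = 71            # bare instruction for entering kernel mode
RETURN_CYCLES = 36           # returning to user mode

mach_cycles = round(MACH_CALL_SECONDS * CLOCK_HZ)
lower_bound = ENTER_CYCLES + RETURN_CYCLES
unexplained = mach_cycles - lower_bound

print(mach_cycles, lower_bound, unexplained)  # 900 107 793
```

The roughly 800 cycles that the hardware does not account for are what the slide attributes to the implementation rather than to the µ-kernel concept.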

36
Kernel-User Switches (Cont)

Returning from kernel mode is a normal return
operation complemented by switching back to user
stack and resetting the kernel-bit
If the processor has separate stack-pointer
registers for the user and kernel stacks, the
stack-switching cost can be hidden
So an efficient implementation reduces the
overhead sufficiently
Kernel-user mode switches are not a serious
conceptual problem but an implementational one

37
Paging

Virtual address space is larger than RAM.
Use tables to map from VA to physical address.
Tables are big, and so are also in RAM.
RAM is slow. Solution?
Yet another kind of caching: the TLB.
Untagged TLBs just map VA to physical pages.
Tagged TLBs have an additional ID.
TLBs may also be fully associative, etc.

38
Address Space Switches

Cache
Most are physically addressed, so they do not need
to be invalidated. (Why?)
Page table switching
Fast, usually just a register load.
TLB
Untagged: needs a flush. (Why?)
Tagged: no need to flush, so switching is fast.
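The difference between the two TLB kinds can be made concrete with a toy simulation. This is a sketch under simplifying assumptions: the TLB has unlimited capacity, an untagged TLB is flushed on every address-space switch, and the function name `count_tlb_misses` and the access trace are invented for the example.

```python
def count_tlb_misses(accesses, tagged):
    """Count TLB misses for a trace of (address_space_id, virtual_page).

    An untagged TLB is keyed by virtual page only, so it must be flushed
    whenever the address space changes. A tagged TLB is keyed by
    (address_space_id, virtual_page) and survives switches.
    """
    tlb = set()
    current_space = None
    misses = 0
    for space_id, vpage in accesses:
        if space_id != current_space:
            current_space = space_id
            if not tagged:
                tlb.clear()  # stale entries would translate wrongly
        key = (space_id, vpage) if tagged else vpage
        if key not in tlb:
            misses += 1
            tlb.add(key)
    return misses


# Two address spaces ping-ponging, as in client/server IPC.
trace = [("A", 1), ("A", 2), ("B", 1), ("A", 1), ("A", 2), ("B", 1)]
print(count_tlb_misses(trace, tagged=True))   # 3
print(count_tlb_misses(trace, tagged=False))  # 6
```

In this trace the untagged TLB reloads every entry after each switch, which is the extra cost that frequent IPC between address spaces would pay.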

39
AS Optimizations

Most processors use untagged TLBs. How can we
avoid flushing?
What's the problem here?
PowerPC: huge 2^52-byte logical address space, with
segments.
Just use the segment registers to distinguish ASs.
A switch is just reloading the segment registers.
We can control TLB flushes and make the TLB behave
like a tagged TLB
Pentium: 32-bit space.
Make ASs share this space.
Need some kind of management policy. (Put outside
of kernel?)
They suggest swapping out big ones, but keeping
small ones together.

40
Address Space Switches (Cont)

Folklore also considers address-space switches as
costly
The real cost of address-space switching is
related to the TLB architecture
If the TLB architecture requires a flush whenever
the address space switches, it can be a critical
problem
Fortunately, on PowerPC we can control TLB flushes
and make the TLB behave like a tagged TLB

41
Address Space Switches (Cont)

But things are not so easy on the Pentium or
the 486
So additional techniques are needed, and they
carry some overhead

42
Address Space Switches (Cont)

Properly constructed address-space switches are
not very expensive
Expensive context switching in some cases is due
to the implementation and is not caused by inherent
problems with the µ-kernel concept

43
Thread Switches/IPC (Cont)

Ousterhout [1990] measured context switching in
some Unix systems by echoing one byte back and
forth through pipes between two processes
Most results are between 400 and 800 µs
All existing µ-kernels are at least 2 times faster
But it has been shown that an RPC 40 to 80 times
faster is achievable
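To see what these factors mean in absolute terms, the figures can be divided out. This is only a back-of-the-envelope sketch: pairing the low speedup with the low Unix figure and the high speedup with the high figure is an assumption, and the variable names are made up.

```python
unix_round_trip_us = (400, 800)   # range reported for the Unix pipe echo
existing_speedup = 2              # "at least 2 times faster"
claimed_speedup = (40, 80)        # "40 to 80 times faster"

# Upper bounds for existing µ-kernels, in microseconds.
existing_upper_bound_us = [t / existing_speedup for t in unix_round_trip_us]

# RPC time implied by the claimed speedups (assumed pairing: 400/40, 800/80).
implied_rpc_us = [t / s for t, s in zip(unix_round_trip_us, claimed_speedup)]

print(existing_upper_bound_us)  # [200.0, 400.0]
print(implied_rpc_us)           # [10.0, 10.0]
```

Under that pairing, both ends of the range point to an RPC of about 10 µs.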

44
Thread Switches/IPC (Cont)

IPC can be implemented fast enough to handle
hardware interrupts by this mechanism as well

45
Memory Effects

Chen and Bershad measured MCPI, comparing Mach to
Ultrix.
Found Ultrix significantly better.
Concluded that improving only IPC would not help
microkernels.
Categorized according to type of overhead.
System cache misses, and everything else.
Found that system self-interference due to cache
capacity misses is the greatest problem.
What does this mean? If it was conflict, what are
the implications?
So the kernel is using too much cache. Two
possibilities:
Many routines being used a lot.
A few routines using a lot of cache.
Design principle precludes first. (Why?) Nothing
inherent about the second, based on L3 as an
example.

MCPI: memory cycle overhead per instruction
46
Performance memory effects

Some significant memory degradation is reported
But it is caused solely by high cache consumption
of the µ-kernel
In cache generally,
if the working set is too large, then fewer
processes can be ready at any one time.
If the working set is too small, then additional
requests must be made of the swapping space to
retrieve required pages.

47
Performance memory effects (Cont)

A large cache working set increases the cache
miss ratio
That is, the memory degradation is not a
conceptual problem of µ-kernels
A properly constructed µ-kernel will avoid the
memory degradation

48
Nonportability
49
Non-portability

µ-kernel structure is influenced by processor-specific
features, especially by the memory model
µ-kernels form the link between a minimal "µ"-set
of abstractions and the bare processor
µ-kernels are inherently not portable
They are the processor-dependent basis for
portable operating systems

50
Non-Portability

Microkernels are the lowest layer. We should
accept that they are non-portable.
Not only superficially non-portable, but even the
algorithms, data structures, etc.

51
Pentium/486 AS Switching

For 486, better to just switch the page table and
flush the TLB.
For Pentium, better to switch segment registers
Segment register loads are faster (3 vs. 9 cycles)
Reduced cache associativity (2-way on Pentium vs.
4-way on 486)
TLB misses result in more cache misses. (Why?)
Much bigger TLB (3 times). So?
In all, half of the kernel modules are affected.

52
Pentium/486 IPC

The Pentium cache is 2-way set-associative; the 486
cache is 4-way
IPC accesses both TCBs and the kernel stack
TCBs are page-aligned
Corresponding data from different TCBs map to the
same cache set.
Okay on 486 since it is 4-way
Not okay on Pentium, since it is only 2-way
Solution is to align TCBs on 1K boundaries.
75% chance that two selected TCBs don't compete.
(Why?)
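One way to arrive at the 75% figure is to enumerate the cache offsets that a TCB's start address can take. The sketch below assumes an 8 KB, 2-way cache, so that each way covers 4 KB (`WAY_SIZE`), and assumes two TCBs are picked independently and uniformly; both the constant and the function name are part of the illustration, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

WAY_SIZE = 4096  # assumed: 8 KB cache / 2 ways = 4 KB covered per way


def no_conflict_probability(alignment, way_size=WAY_SIZE):
    """Probability that two TCBs start at different offsets within a way."""
    offsets = range(0, way_size, alignment)   # possible start offsets
    pairs = list(product(offsets, repeat=2))  # all ordered choices of two
    distinct = sum(1 for a, b in pairs if a != b)
    return Fraction(distinct, len(pairs))


print(no_conflict_probability(4096))  # 0
print(no_conflict_probability(1024))  # 3/4
```

With page alignment every TCB starts at the same cache offset, so corresponding fields always collide. With 1K alignment there are four possible offsets, and two TCBs share one in only 1 of 4 cases.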
Lucky in this case since Pentium optimization is
benign on 486.

53
Incompatible Processors

Should expect isolated timing differences not
related to overall processor performance. (Why?)
Different architectures require
processor-specific optimization techniques that
even affect the global microkernel structure.
Mandatory 486 TLB flush requires minimizing
subsequent TLB misses.
Put heavily used stuff in one page, don't spread it
across pages (no unmapped memory).
Lazy scheduling (temporal grouping of accesses).
IPC requires moving two threads to/from queues.
Solution?
R2000 unmapped memory and tagged TLB. (Why
unmapped memory helps?)

54
L4 Microkernel

Developed by Jochen Liedtke in 1995.
German National Research Center for IT
Assumed that micro-kernels were processor
dependent.
Developed from scratch!!!

55
L4 Abstractions

Address Spaces
Map, Grant, Unmap (Flush)
Threads
IPC
Short message passing
Copying Large Data Messages
Lazy Scheduling

56
L4 Abstractions (IPC)

Passing Short Messages
Transfers short IPC messages in registers.
Copying Large Data Messages
Allow single-copy transfers by sharing the target
region with the sender.
Lazy Scheduling
Delay moving threads between scheduler queues
until a queue is queried.
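A rough sketch of the lazy-scheduling idea, assuming a single round-robin ready queue: blocking a thread only flips a flag in its control block, and stale entries are removed when the scheduler next scans the queue. The classes `Thread` and `LazyReadyQueue` and the counter `queue_operations` are hypothetical and exist only to make the saved work visible.

```python
from collections import deque


class Thread:
    def __init__(self, name):
        self.name = name
        self.ready = False   # state kept in the thread control block
        self.queued = False  # whether an entry sits in the ready queue


class LazyReadyQueue:
    def __init__(self):
        self.queue = deque()
        self.queue_operations = 0  # counts real insertions and removals

    def make_ready(self, thread):
        thread.ready = True
        if not thread.queued:      # only touch the queue if necessary
            self.queue.append(thread)
            thread.queued = True
            self.queue_operations += 1

    def block(self, thread):
        thread.ready = False       # lazy: leave the queue entry in place

    def pick_next(self):
        """Return the next ready thread, dropping stale entries on the way."""
        while self.queue:
            thread = self.queue[0]
            if thread.ready:
                self.queue.rotate(-1)  # round-robin: move it to the back
                return thread
            self.queue.popleft()       # stale entry, removed only now
            thread.queued = False
            self.queue_operations += 1
        return None
```

In an IPC ping-pong a thread blocks and becomes ready again many times between scheduler runs; with this scheme those transitions cost a flag update each instead of two queue manipulations.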

57
Conclusion 1

µ-kernel can provide higher layers with a minimal
set of appropriate abstractions
µ-kernels must be constructed per processor and
are inherently not portable
It is possible to achieve well performing
µ-kernels through processor specific
implementations of processor-independent
abstractions