Microkernels Liedtke Papers - PowerPoint PPT Presentation
1
Microkernels (Liedtke Papers)
  • Kenneth Chiu

2
Thesis?
  • Microkernels can have adequate performance if
    they are architecture-dependent.
  • They increase portability, though they themselves
    are non-portable. (Why would non-portable be faster?)
  • In this sense, they can be seen as a hardware
    abstraction layer.
  • Previous microkernels perform poorly because they
    derive from monolithic kernels. (Why does this
    make a difference?)

3
Previous Critiques
  • A number of previous critiques have shown
    microkernels to be inefficient.
  • However, they measure performance without
    analyzing the reason. (Why is this bad?)
  • Liedtke contends that this is a crucial error.
  • The question is: is it inherent, or is it the
    implementation?

4
Inherent or Implementation?
  • Issue pervades much of CS research involving
    performance.
  • Is XML inherently slow?
  • How can it be ameliorated?
  • What about algorithm theory?
  • Constant factors matter in the real world.
  • In what areas or kinds of papers is it not a
    problem?
  • Abstraction
  • Design

5
What Goes Into the Kernel?
  • Only determined by function, not performance.
  • Key issue is protection.
  • Principle of independence
  • Servers can't stomp on each other.
  • Principle of integrity
  • Communication channels can't be interfered with.
  • Usually the following go into the kernel
  • address spaces (Why?)
  • IPC (Why?)
  • basic scheduling (Why?)
  • Distinguish between kernel and trusted
  • Can you trust a user-level process?
  • Can you have untrusted kernel code?
  • Suppose you had an architecture that did not
    allow I/O except in supervisor mode. Could you
    put device drivers in the kernel?
  • Radical differences in hardware may have
    pervasive impacts.

6
Concepts
7
Address Spaces
  • An AS σ is a mapping from virtual addresses to
    either other virtual addresses or real addresses.
  • Initial AS is defined as σ₀: V → R ∪ {∅}.
  • Further ones are defined as σ: V → (Σ × V) ∪ {∅},
    where Σ is the set of address spaces. (Why include
    the AS?)
  • Ability to map to other virtual addresses makes
    them recursive.
  • Interpret σ as a table; σv is an element of the
    table.

8
Address Spaces (diagram: recursive mappings down to
real memory r; figure omitted)
9
AS Operations
  • ∅ means unmapped.
  • flush(σ, v): σv := ∅
  • grant
  • Gives page directly to grantee. Original mapping
    is gone.
  • map
  • Creates a (soft) link. May be revoked at any time.
  • flush defined recursively
  • Does it terminate?
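
The grant/map/flush semantics above can be sketched as a toy table model. All names here (`struct as_entry`, `resolve`, the global `spaces` registry) are mine, not Liedtke's; a real kernel would keep a mapping database rather than scanning every space:

```c
#include <stddef.h>

#define PAGES 16
#define MAX_SPACES 8

/* Toy model of Liedtke's recursive address spaces: each entry of a
 * space is empty, maps to a real frame, or maps into another address
 * space (sigma-v is an element of the table). */
struct as;
struct as_entry {
    enum { EMPTY, REAL, VIRTUAL } kind;
    int real_frame;      /* valid when kind == REAL */
    struct as *space;    /* valid when kind == VIRTUAL */
    int vpage;           /* valid when kind == VIRTUAL */
};
struct as { struct as_entry table[PAGES]; };

/* Registry so flush can find dependent mappings (a stand-in for the
 * kernel's mapping database). */
struct as *spaces[MAX_SPACES];
int nspaces;

/* Follow virtual mappings down to a real frame; -1 if unmapped. */
int resolve(const struct as *s, int v)
{
    const struct as_entry *e = &s->table[v];
    if (e->kind == REAL)    return e->real_frame;
    if (e->kind == VIRTUAL) return resolve(e->space, e->vpage);
    return -1;
}

/* map: dst gets a (soft) link to (src, sv); src keeps its mapping. */
void map(struct as *src, int sv, struct as *dst, int dv)
{
    dst->table[dv] = (struct as_entry){ VIRTUAL, 0, src, sv };
}

/* grant: dst takes over src's mapping; the original is gone. */
void grant(struct as *src, int sv, struct as *dst, int dv)
{
    dst->table[dv] = src->table[sv];
    src->table[sv] = (struct as_entry){ EMPTY, 0, NULL, 0 };
}

/* flush(s, v): recursively withdraw every mapping derived from (s, v),
 * then set sigma-v := empty. It terminates because map and grant only
 * ever build a derivation tree, never a cycle. */
void flush(struct as *s, int v)
{
    for (int i = 0; i < nspaces; i++)
        for (int p = 0; p < PAGES; p++) {
            struct as_entry *e = &spaces[i]->table[p];
            if (e->kind == VIRTUAL && e->space == s && e->vpage == v)
                flush(spaces[i], p);
        }
    s->table[v].kind = EMPTY;
}
```

A granted page resolves through its new owner, while flushing the source page withdraws it from every space downstream.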

10
Implementation Efficiency?
  • Physical address never changes. (Why?)

11
I/O
  • Recall that I/O can be done either with special
    ports, or memory-mapped.
  • Incorporated into the address space
  • Natural for memory-mapped I/O
  • PowerPC
  • Also works on I/O ports
  • x86 permits control per port, but no mapping
  • Don't confuse memory-mapped I/O with
    memory-mapped files.

12
Threads
  • AS is associated with a thread.
  • How is this different from previous definition of
    process?
  • Changes to AS handled by the kernel. (Why?)

13
Interrupts
  • Hardware devices are modeled as a set of special threads.
  • Interrupts modeled as empty messages.
  • driver thread:
      do
        wait for (msg, sender)
        if sender = my hardware interrupt
          then read/write I/O ports
               reset hardware interrupt
          else
        fi
      od
  • Not sure why the sender needs to be checked.
  • If interrupt cannot be reset from user-mode,
    kernel will do it automagically as a special case.

14
Flexibility
15
Memory Manager
  • Server managing the initial AS s0 is classical
    main memory manager.
  • Memory managers can be stacked
  • M0 maps or grants one part to M1, other parts to
    M2.
  • What might be the benefit of a user-level memory
    manager?

16
Pager
  • Good for hard/soft real-time. (Why?)
  • Prefetching
  • Pinning
  • Application-level pagers. (Why useful?)
  • Profile directed optimization
  • How does it affect security?
  • Remember that user-level code can also be trusted.

17
Device Driver
  • Last week we said that I/O should be in
    supervisor mode, why are we changing that?
  • Windows XP stability

18
L2 and TLB
  • Handling misses
  • Improve hit ratios
  • Why first-level TLB in kernel?
  • Is this a violation of making decisions based on
    function only, not performance?

19
Remote Communications
  • Can be done outside of kernel.
  • If desire zero-copy, can make communication
    server a special pager for the client.

20
UNIX Server
  • Kernel interface implemented by server.
  • Similar to Windows NT/2K/XP.
  • NT is actually supposed to be POSIX compliant.

21
Performance
22
Cache
  • Physical memory viewed as a sequence of bytes.
  • RAM is slow, so we use a cache.
  • Unit of caching is a cache line.
  • Direct mapped.
  • Fully associative.
  • Set associative.
  • Victim cache.
  • Misses
  • Compulsory
  • Capacity
  • Conflict
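
The organizations above differ only in how an address selects its candidate lines. A minimal sketch (toy helper of mine, not any particular CPU):

```c
#include <stdint.h>

/* For a cache with line_size-byte lines and num_sets sets, an address
 * selects exactly one set; associativity only decides how many lines
 * live in that set. Direct-mapped = 1 line per set; fully associative
 * = one set holding every line. */
unsigned cache_set(uint32_t addr, unsigned line_size, unsigned num_sets)
{
    return (unsigned)(addr / line_size) % num_sets;
}
```

With 32-byte lines, an 8K direct-mapped cache has 256 sets, so addresses 8K apart collide; a 2-way 8K cache has only 128 sets, so addresses 4K apart share a set. That geometry is what the diagrams on the next slides illustrate.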

23
Direct Mapped
(diagram: addresses 0, 4K, 8K mapping into a
direct-mapped cache; figure omitted)
24
Fully Associative
25
Set Associative
(diagram: addresses 0, 4K, 8K mapping into a
set-associative cache; figure omitted)
What is 1-way set associative?
26
Victim Cache
(diagram: a small victim cache alongside the main
cache; figure omitted)
Why might this be better?
27
Paging
  • Virtual address space is larger than RAM.
  • Use tables to map from VA to physical address.
  • Tables are big, and so are also in RAM.
  • RAM is slow. Solution?
  • Yet another kind of caching: the TLB.
  • Untagged TLBs just map VA to physical pages.
  • Tagged TLBs have an additional ID.
  • TLBs may also be fully associative, etc.
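
The untagged case above can be sketched as a toy direct-mapped TLB (sizes and names are mine, for illustration only):

```c
#include <stdint.h>

/* A toy untagged TLB: it caches vpage -> frame with no address-space
 * ID, so every entry must be flushed on an address-space switch. A
 * tagged TLB would add an ASID field and skip that flush. */
#define TLB_SLOTS 8

struct tlb_entry { int valid; uint32_t vpage; uint32_t frame; };
struct tlb_entry tlb[TLB_SLOTS];

/* Split the VA into page number and offset, then probe one slot. */
int tlb_lookup(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpage = vaddr >> 12, off = vaddr & 0xfff;
    struct tlb_entry *e = &tlb[vpage % TLB_SLOTS];  /* direct-mapped */
    if (e->valid && e->vpage == vpage) {
        *paddr = (e->frame << 12) | off;
        return 1;                                   /* hit */
    }
    return 0;           /* miss: walk the page tables in RAM instead */
}

void tlb_fill(uint32_t vpage, uint32_t frame)
{
    tlb[vpage % TLB_SLOTS] = (struct tlb_entry){ 1, vpage, frame };
}

void tlb_flush(void)    /* required on every AS switch when untagged */
{
    for (int i = 0; i < TLB_SLOTS; i++) tlb[i].valid = 0;
}
```

The cost of `tlb_flush` on every address-space switch is exactly what the later slides on AS switching and optimizations are about.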

28
Kernel-User Switches
  • Chen and Bershad measured 18 microseconds per
    Mach call.
  • On a 486 at 50 MHz, that is about 900 cycles.
  • Bare machine instruction is 71 cycles, followed
    by 36 cycles for returning to user mode.
  • Switch stacks
  • Push/pop flag register and PC
  • Thus 107 cycles is the lower bound. What is going on
    during the other 800 cycles? Is it necessary?
  • L3 has 15-57 cycle overhead. (Why variation?)
  • What is the conclusion? Inherent or
    implementation?
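
The cycle figures above are simple arithmetic (helper names are mine):

```c
/* The slide's arithmetic: 18 us per Mach kernel call on a 50 MHz 486
 * is 18 * 50 = 900 cycles; the bare trap costs 71 cycles to enter plus
 * 36 to return, so 107 cycles is the hardware minimum and the rest is
 * kernel software overhead. */
int us_to_cycles(int us, int mhz) { return us * mhz; }
int overhead_cycles(void)         { return us_to_cycles(18, 50) - (71 + 36); }
```

That leaves roughly 800 cycles unaccounted for by the hardware, which is the gap Liedtke attributes to implementation rather than to anything inherent.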

29
Address Space Switches
  • Cache
  • Most are physical, so do not need to be
    invalidated. (Why?)
  • Page table switching
  • Fast, usually just a register.
  • TLB
  • Untagged, need to flush. (Why?)
  • Tagged, no need to flush so fast.

30
AS Optimizations
  • Most processors use untagged TLBs. How can we
    avoid flushing?
  • What's the problem here?
  • PowerPC: huge 2^52 logical address space, with
    segments.
  • Just use segment register to distinguish ASs.
  • Switch is just reloading the segment register.
  • Pentium, 32-bit space.
  • Make ASs share this space.
  • Need some kind of management policy. (Put outside
    of kernel?)
  • They suggest swapping out big ones, but keep
    together small ones.

31
Thread Switches/IPC
  • Essentially depends on user-kernel and address
    space switching.
  • Architecture-dependent implementations can be much
    faster (10-80×).
  • Conclusion is that it's fast enough for
    interrupts.

32
Memory Effects
  • Chen and Bershad measured MCPI, comparing Mach to
    Ultrix.
  • Found Ultrix significantly better.
  • Concluded that improving IPC would not help
    microkernels.
  • Categorized according to type of overhead.
  • System cache misses, and everything else.
  • Found that system self-interference due to
    capacity misses is the greatest problem.
  • What does this mean? If it was conflict, what are
    the implications?
  • So the kernel is using too much cache. Two
    possibilities:
  • Many routines being used a lot.
  • A few routines using a lot of cache.
  • Design principle precludes first. (Why?) Nothing
    inherent about the second, based on L3 as an
    example.

33
Nonportability
34
Non-Portability
  • Microkernels are the lowest layer. We should
    accept that they are non-portable.
  • Not only superficially non-portable, but even the
    algorithms, data structures, etc.

35
Pentium/486 AS Switching
  • For 486, better to just switch page table and
    flush the cache.
  • For Pentium, better to switch segment register
  • Segment register loads are faster (3 vs. 9 cycles)
  • Reduced associativity (4-way vs. 2-way)
  • TLB misses result in more cache misses. (Why?)
  • Much bigger TLB (3 times larger). So?
  • In all, half of kernel modules affected.

36
Pentium/486 IPC
  • Pentium cache is 2-way, 486 is 4-way
  • IPC accesses both TCBs and kernel stack
  • TCB are page aligned
  • Corresponding data from different TCBs map to the
    same cache line.
  • Okay on 486 since it is 4-way
  • Not okay on Pentium, since it is only 2-way
  • Solution is to align on 1K boundaries.
  • 75% chance that two randomly selected TCBs don't
    compete. (Why?)
  • Lucky in this case since Pentium optimization is
    benign on 486.
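
The 75% figure falls out of the cache geometry. A sketch, using the sizes from the slide (the helper name is mine):

```c
/* The Pentium's 8K 2-way data cache covers 4K per way, so a 1K-aligned
 * TCB can land in 4096/1024 = 4 distinct cache "colors". Two TCBs
 * compete for the same sets only when their colors match (probability
 * 1/4), so 3 out of 4 pairs don't compete. */
int noncompeting_percent(int way_bytes, int align_bytes)
{
    int colors = way_bytes / align_bytes;
    return 100 * (colors - 1) / colors;
}
```

Page-aligned TCBs, by contrast, all share one color per way, which is why they always collided on the 2-way Pentium.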

37
Incompatible Processors
  • Should expect isolated timing differences not
    related to overall processor performance. (Why?)
  • Different architectures require
    processor-specific optimization techniques that
    even affect the global microkernel structure.
  • Mandatory 486 TLB flush requires minimizing
    subsequent TLB misses.
  • Put heavily used stuff in one page, don't spread
    across pages (no unmapped memory).
  • Lazy scheduling (temporal grouping of accesses).
  • IPC requires moving two threads to/from queues.
    Solution?
  • R2000: unmapped memory and tagged TLB. (Why does
    unmapped memory help?)

38
Conclusions?
  • Do you buy it?
  • Research OSs have a hard time moving into the
    real world.