I would hate user space locking if it weren't that sexy... fusyn RTNPTL: the making of a realtime sy - PowerPoint PPT Presentation

1 / 34
About This Presentation
Title:

I would hate user space locking if it weren't that sexy... fusyn RTNPTL: the making of a realtime sy

Description:

fusyn RTNPTL: the making of a real-time synchronization. infrastructure for Linux. http: ... fulocks: chubby kernel mutexes. I would hate user space locking... – PowerPoint PPT presentation

Number of Views:103
Avg rating:3.0/5.0
Slides: 35
Provided by: devresour
Category:

less

Transcript and Presenter's Notes

Title: I would hate user space locking if it weren't that sexy... fusyn RTNPTL: the making of a realtime sy


1
I would hate user space locking if it weren't
that sexy...fusynRTNPTL the making of a
real-time synchronization infrastructure for
Linuxhttp//developer.osdl.org/dev/robustmutexe
sInaky Perez-Gonzalez tel.com Boris Hu Salwan
Searty Adam Li
David P. Howell
Linux OS Technology
TeamIntel Corporation
2
So this is...
  • fusynRTNPTL steams off the OSDL CGL working
    group as a way to fill some of the POSIX
    real-time support gaps that Telecom manufacturers
    need to implement and port fault-proof systems
  • mutexes and condvars with futexes are FIFO
  • when a mutex owner dies, waiters wait forever
  • no priority inversion protection
  • Provides a synchronization infrastructure
    (mutexes, conditional variables) that is
  • POSIX soft Real-Time compatible
  • Robust
  • Safe from / mitigates priority inversion.
  • Available in the kernel and in user space

I would hate user space locking... page
2 Intel CorporationLinux OS Technology
Team7/20/04
3
The basic kernel stuff
spinlock queue ops
  • It all starts with a simple queue
  • queued waiters are sorted by priority.
  • queue operations (addition, removal, location of
    highest priority node) are O(1) to insure
    scalability and minimize latency.
  • if we change the priority of a waiter, it
    repositions in the queue.
  • This cannot be done with futexes without
    scalability issues.
  • To allow extensibility some functionality
    implemented in a fuqueue operations pointer.

() Who wants to pop the joke about the name?
I would hate user space locking... page
3 Intel CorporationLinux OS Technology
Team7/20/04
4
The fuqueue interface
include struct fuqueue fq
fuqueue_INIT (fq) fuqueue_wait (fq,
timeout) fuqueue_wake (fq, 9,
-EAGAIN) _wait() will block on fqit will
return the code passed to _wake(), -EINTR if
interrupted by signals, -ETIMEDOUT if too much
time went by. fuqueue_wait_fcb (fq, timeout,
flags, cb) fuqueue_wake_state (fq, 9, -EAGAIN,
TASK_STOPPED, 1
/sync/) Pathetic attemp to mimic some of the
waitqueue interfaces.
I would hate user space locking... page
4 Intel CorporationLinux OS Technology
Team7/20/04
5
fuqueue changing priority
Hook in priority/policy manipulation interfaces
struct task_struct ... struct fuqueue
fuqueue_wait struct fuqueue_waiter
fuqueue_waiter spinlock_t fuqueue_wait_lock .
.. fuqueue_waiter_chprio (task,
old_prio) Follow the link back to the fuqueue
the task is waiting for, reposition on the list
with queue chprio operation optionally recurse
depending on queue op (for priority inheritance).
I would hate user space locking... page
5 Intel CorporationLinux OS Technology
Team7/20/04
6
fuqueue cancelling wait
Same hook up fuqueue_waiter_cancel() in a bunch
of places process_timeout(), signalling,
__wake_up_common()should be happening in
try_to_wake_up(). Safely acquires locks and
determines which fuqueue the task is waiting for,
calls fuqueue-specific waiter_cancel() op from
fuqueue-ops and cleans task_struct
back-pointers. The spinlock protecting the
fuqueue is an IRQ spinlockcancels, as well as
wakes can be called from IRQ contexts (eg
timeouts). It is necessary correctly implementing
priority inheritance that the waker unqueues the
wakee and avoids re-taking a spinlock.
I would hate user space locking... page
6 Intel CorporationLinux OS Technology
Team7/20/04
7
fuqueue ops
struct fuqueue_ops void ( get) (struct
fuqueue ) void ( put) (struct fuqueue
) unsigned ( waiter_cancel) (struct fuqueue
, struct fuqueue_waiter ) struct
task_struct ( waiter_chprio) ( struct
task_struct , struct fuqueue , struct
fuqueue_waiter ) Provides flexibilitythe
fuqueue layer provides basic operations
fuqueue_op_() that other layers (ufuqueues,
fulocks, ufulocks) use. As well, most interface
functions are broken up for easy reuse.
I would hate user space locking... page
7 Intel CorporationLinux OS Technology
Team7/20/04
8
fulocks chubby kernel mutexes
Take a fuqueue, add the concept of an owner, a
node for a per-task ownership list, some flags
and you have a fulock.
fuqueue node flags owner
fuqueue node flags owner
fuqueue node flags owner
I would hate user space locking... page
8 Intel CorporationLinux OS Technology
Team7/20/04
9
fulocks locking
include struct fulock fl
fulock_INIT (fl) fl.flags FULOCK_FL_PI
FULOCK_FL_RM ... int result fulock_lock (fl,
timeout) After checking validity, take it if
unlocked, set ownership to current and exit.
Otherwise, use the fuqueue internal interface
to queue a fuqueue_waiter in the fulock's fuqueue
and wait on it. Returns 0 if you get the lock,
-EOWNERDEAD if you got it, but some previous
owner died, or other errors (-ETIMEDOUT on
timeout, -EINTR on signal, etc...).
I would hate user space locking... page
9 Intel CorporationLinux OS Technology
Team7/20/04
10
fulocks unlocking
  • include
  • ...
  • result fulock_unlock (fl, FULOCK_UNLOCK_AUTO)
  • If there is an owner, reset ownership, get the
    first waiter from the fulock' s fuqueue (if any),
    decide how to unlock (type), unqueue (and maybe
    transfer ownership) and wake it up.
  • Unlock type
  • serial first waiter is made the new owner,
    ownership transfer
  • parallel first waiter is woken up and
    recontends for the lock
  • auto serial if the first waiter is a real-time
    task, parallel if not.
  • Lock primitives automatically recontend for the
    lock in the parallel case.

I would hate user space locking... page
10 Intel CorporationLinux OS Technology
Team7/20/04
11
fulocks operations
struct fulock_ops struct fuqueue_ops
fuqueue unsigned ( owner_set) (fulock,
task) unsigned ( owner_reset) (fulock)
enum fulock_unlock_type ( unlock_type) (
code, fulock, fuqueue_waiter,
unlock_type, priv) void ( unqueue) (fulock,
fuqueue_waiter, unlock_type,
priv) void ( exit) (fulock) Provide
flexibility to build on top. fuqueue_ops are
reused.
I would hate user space locking... page
11 Intel CorporationLinux OS Technology
Team7/20/04
12
fulocks tracking ownership
struct task_struct ... struct plist
fulock_olist spinlock_t fulock_olist_lock
__fulock_op_owner_set (struct fulock ,
struct task_struct
) __fulock_op_owner_reset (struct fulock
) Add fulock to task-fulock_olist, set
fulock-owner...or reset fulock-owner and remove
from the list. The per-task ownership list is
sorted by fulock priority.
I would hate user space locking... page
12 Intel CorporationLinux OS Technology
Team7/20/04
13
fulocks robustness
Dies miserably
I would hate user space locking... page
13 Intel CorporationLinux OS Technology
Team7/20/04
14
fulocks robustness how
kernel/exit.c do_exit() ...
exit_fulocks() ... For each owned fulock in
current-fulock_olist, call ops-exit(). The
basic fulock_op_exit() adds the FULOCK_FL_DEAD
flag and unlocks serially. Waiter (or first
future contender) will acquire it with
-EOWNERDEAD. The key here is that the kernel
knows at all times who is owning which fulocks.
I would hate user space locking... page
14 Intel CorporationLinux OS Technology
Team7/20/04
15
fulocks priority inheritance

I would hate user space locking... page
15 Intel CorporationLinux OS Technology
Team7/20/04
16
fulocks priority protection

I would hate user space locking... page
16 Intel CorporationLinux OS Technology
Team7/20/04
17
PI PP the generic guts
Generic priority boosting facility in the
scheduler struct task_struct ... int
boost_prio ... ... unsigned
__prio_boost(task_t p, int prio) Whenever we
update the dynamic prio, do task-prio
min(new_dyn_prio, task-boost_prio) This is a
single hook into the effective_prio() hotpath,
two other hooks in __setscheduler() and
set_user_nice().
I would hate user space locking... page
17 Intel CorporationLinux OS Technology
Team7/20/04
18
PI PP the gluetwo rules
  • Priority inheritance everytime we manipulate
    any list, propagate if the list top priority
    changeboost every task we hit while propagating
    to it's fulock_olist's priority.
    __fulock_prio_update(), fuqueue_waiter_chprio()
    and __fulock_op_waiter_chprio().
  • Priority protection every time a task acquires
    a fulock, the olist_node of that fulock has the
    priority ceiling of the fulock. Goto 1).
  • That's it.
  • For this to happen without deadlocks, the
    unlocker has to unqueue the new locker and let it
    loose.

I would hate user space locking... page
18 Intel CorporationLinux OS Technology
Team7/20/04
19
PI PP simple case

I would hate user space locking... page
19 Intel CorporationLinux OS Technology
Team7/20/04
20
PI PP ownership-wait chain

Priority propagation is worst case O(n) on the
depth of the chainany other operation is O(1).
Wait here!
I would hate user space locking... page
20 Intel CorporationLinux OS Technology
Team7/20/04
21
vlocator, ufu vfuhabemus userspace
Need to provide a way for userspace to use this
infrastructure. Use the futex method, generic
way vlocator associate uaddr with kernel object
through hash table, manage its life cycle,
dispose it when its use count drops to zero.
vl_find() in hash table
vl_key_create()
vl_find_or_create() if not found, call
vl_ops-alloc() vl_ops-create(), add to hash,
return it. vl_dispose() detach from hash table,
destroy when refcount 0
I would hate user space locking... page
21 Intel CorporationLinux OS Technology
Team7/20/04
22
v,ufuqueue the futex wannabe
  • Associates a vfuqueue (representation in user
    space) to a ufuqueue that is really a fuqueue.
  • int sys_ufuqueue_wait (vfuqueue, val, timeout)
  • int sys_ufuqueue_wake (vfuqueue, howmany, code)
  • int sys_ufuqueue_ctl (vfuqueue, FUQUEUE_CTL_)
  • Imitates the futex behaviour, except for
  • extra error codes
  • wake up is scheduling policy based
  • responds to priority changes
  • TODO implement requeue

I would hate user space locking... page
22 Intel CorporationLinux OS Technology
Team7/20/04
23
v,ufulock the mutex
  • int sys_ufulock_lock (vfulock, flags, timeout)
  • int sys_ufulock_unlock (vfulock, flags, type)
  • int sys_ufulock_requeue (vfuqueue, val,
    vfulock,
  • flags)
  • int sys_ufulock_ctl (vfulock, flags,FULOCK_CTL)
  • Two modes of operation
  • KCO Kernel Controlled Ownership slow, all
    arches
  • Fast-path waay faster, needs atomic cmpxchg

I would hate user space locking... page
23 Intel CorporationLinux OS Technology
Team7/20/04
24
Fast-path why bother the kernel?
As in with futexes, there should be no need to go
to the kernel if there is no contention int
vfulock_lock (vfulock, flags, pid, timeout)
unsigned old VFULOCK_UNLOCKED if
(cmpxchg(vfulock,old,pid) ! old) return 0
return SYSCALL(ufulock_lock,3,vfulock,flags,to)
That's your userspace lock operation. The
vfulock state vfulock 0
UNLOCKED vfulock with PID vfulock VFULOCK_WP,DEAD Kernel
rules like KCO vfulock VFULOCK_NR Not
recoverable abort...
I would hate user space locking... page
24 Intel CorporationLinux OS Technology
Team7/20/04
25
Slow-path syncing up
If fast-path enabled, need to find_or_create() a
ufulock, sync it up and pass the operation down
to the fulock layer. State machine
vfulock_sync_nonkco()finds out what the kernel
knows about the ufulock, what user space thinks
about it and makes sure they are in sync. Fugly,
but it works )
No, ufulock is synced, owner known by kernel,
waiters present
Found task
Yes
Unlocked
Not found task died dead-owner!
I would hate user space locking... page
25 Intel CorporationLinux OS Technology
Team7/20/04
26
syncing miscelanea
  • Any sys_ufulock_() function needs to sync first
    thing.
  • sync state machine can detect and fix
    inconsistencies (when the kernel knows about the
    ufulock, it is easy, when not, it is a best
    guess).
  • Helps preventing lock stealing that causes
    priority inversion.
  • PIDs are reusedyou can miss some process deaths
    and assign ownerships wrong. Workarounds/solutions
    for that and false positives
  • Increase your PID space / tasks ratio default is
    32k pids
  • PIDX hash PID task-start_time kind of a lame
    task UUIDuse that for the vfulock and task
    lookup.
  • Look around task-mm, check if current is sharing
    the vfulock with that task.

I would hate user space locking... page
26 Intel CorporationLinux OS Technology
Team7/20/04
27
Back to fast-path unlocking
Unlocking might be more complex int
vfulock_unlock (vfulock, flags, unlock_type)
unsigned old_val vfulock while (old_val VFULOCK_WP) if (cmpxchg(vfulock,old_val,VFUL
OCK_UNLOCKED) old_val) return
0 old_val vfulock return
SYSCALL(ufulock_unlock,3,vfulock,
flags,unlock_type) Same thing try the fast
pathif it fails, let the kernel handle it.
I would hate user space locking... page
27 Intel CorporationLinux OS Technology
Team7/20/04
28
RTNPTL the POSIX glue
  • Modifications to NPTL to have mutexes and
    condvars take advantage of fusyn.
  • Implements POSIX tags TPI, TPP (soon), robustness
    (non-POSIX extension).
  • Improves compliance in other areas (unlock order
    by scheduling policy, for example).
  • Injected at the lll_() layer of glibc/NPTL.
  • Defaults to the fastest combination of unlock
    mode and fulock type (parallel, fast-path
    enabled).
  • Non-POSIX extensions allow to tweak fusyn
    specific parameters to suit user tastes.

I would hate user space locking... page
28 Intel CorporationLinux OS Technology
Team7/20/04
29
Example Priority Inheritance

I would hate user space locking... page
29 Intel CorporationLinux OS Technology
Team7/20/04
30
Performance Volanomark
Dual P3 933 MHz 256 KB cache, 512MB RAM Fedora2,
gcc 3.3.3 20040412 (RH Linux 3.3.3-7) glibc
2004-07-20 RTNPTL 2.3 kernel 2.6.5 fusyn
2.3 JVM IBMJava2-141 (-Xms64m -Xmx256m)

I would hate user space locking... page
30 Intel CorporationLinux OS Technology
Team7/20/04
31
Performance general
  • Considering the extra overhead and that we take
    one too many spinlocks with IRQs down, we are
    doing fine...
  • On a system identical to Volanomark's with
    slightly older snaphosts, we saw
  • serialized unlock()?lock() slow-path latency,
    6010µs (idle system), 11010µs (heavy network
    traffic) and 13010µs (heavy network traffic
    IDE traffic).
  • on some informal tests, jitter increases over
    NPTL less or equal to 1µs however, this was
    measured with the HRT patch, that gives a 10µs
    resolution. In fact, non-conclusive, or very
    small if you want to believe.
  • General operation is about 10 slower than NPTL,
    depending on the stress case. We need to solve
    that.

I would hate user space locking... page
31 Intel CorporationLinux OS Technology
Team7/20/04
32
Backup

I would hate user space locking... page
32 Intel CorporationLinux OS Technology
Team7/20/04
33
KCO the slow mode
ufulocks use the vfulock to maintain part of the
health state normal, dead-owner,
not-recoverable of the fulock (because it might
be wiped in the kernel when refcount drops to
zero). Step 1 synchronize what the kernel knows
of the fulock with what user space says it should
vfulock_sync_kco(). Step 2 operate on the
fulock normally (calling the fulock
layer)maintaining consistency with the vfulock
value. All transactions go through the
kernel. PP fulocks are always KCO they
manipulate task prio on every transitiononly the
kernel can do that.
I would hate user space locking... page
33 Intel CorporationLinux OS Technology
Team7/20/04
34
Plans
  • Cleanups optimizations especially RTNPTL
  • If it is wanted and my manager is convinced, rw
    fulocks with robustness, PI and PP. Unlikely,
    this is completely not POSIX and presents many
    more challenges.
  • Eliminate the hash-table lookup in the vlocator
    to avoid the last O(n) path...try different
    ideas.
  • ... fix bugs ...

I would hate user space locking... page
34 Intel CorporationLinux OS Technology
Team7/20/04
Write a Comment
User Comments (0)
About PowerShow.com