Title: Multithreading patterns
Multithreading patterns
Cristian Nicola, Development Manager, Net Evidence (SLM) Ltd
http://www.tonicola.com
cn_at_net-evidence.com
1. Introduction to multithreading
2. Multithreading patterns
1. Introduction to multithreading
In this section
- Why do multi-threading?
- When and when not to use threads?
- Multithreading basic structures (critical sections, mutexes, events, semaphores and timers)
- Multithreading problems (atomic operations, race conditions, priority inversion, deadlocks, livelocks, boxcar / lock convoys / thundering herd)
Why multi-threading?
- Multi-core / multi-CPU machines are now standard
- Makes programming more fun
When to use threads?
- The work-tasks are clearly defined, and the work-tasks are long enough
- Data needed to complete the work-tasks does not overlap (or only a little)
- Generally, UI interaction is not needed (background tasks)
When NOT to use threads?
- Work-tasks are not clearly defined
- There is a lot of shared data between the tasks
- UI interaction is a requirement
- Work-tasks are small
- You do not have a good reason to use it
Multithreading structures
Jobs, processes, threads, fibers
[Diagram: a job contains processes 1..N, each process contains threads 1..M, and each thread can host fibers 1..X]
What we need: a way to
- avoid simultaneous access to a common resource (mutexes, critical sections)
- signal an occurrence or an action (events)
- restrict/throttle the access to some shared resources (semaphores)
- signal a due time, sometimes periodically (timers)
Critical sections
- User object - lightweight
- Their number is limited only by memory
- Re-entrant
- Very fast when there are no collisions (tens of instructions)
- Downgrades to a kernel object when contended
- No time-out
Mutexes
- Kernel object
- Can be named for inter-process communication
- Can have security flags
- Can be inherited by child processes
- Can be acquired/released
Events
- Kernel object
- Can be named for inter-process communication
- Can have security flags
- Can be inherited by child processes
- Holds a state: signalled / non-signalled
- Can be auto-reset
- PulseEvent exists, but should not be used
- Auto-reset events are NOT re-entrant
Semaphores
- Kernel object
- Can be named for inter-process communication
- Can have security flags
- Can be inherited by child processes
- Have a count property, but it cannot be interrogated
- Signalled when count > 0
Timers
- Kernel object
- Can be named for inter-process communication
- Can have security flags
- Can be inherited by child processes
- Can be auto-reset
Kernel-land / user-land
- Kernel transitions are expensive
- User-mode transitions are fast
- Avoid kernel transitions when possible (system calls, usage of kernel objects, unneeded thread creation or destruction)
Multithreading problems
Atomic operations
- A set of operations that must be executed as a whole, so that they appear to the rest of the system to be a single operation
- There can be 2 outcomes:
  - success
  - failure
Atomic operations
- For example, the code I = J + 1
- can be compiled as:
  MOV EAX, [EBP-10]
  (possible task switch here)
  INC EAX
  (possible task switch here)
  MOV [EBP-0C], EAX
- Solution: Lock; I = J + 1; Unlock
Race conditions
- A task switch can occur any time
Race conditions
- When 2 threads race to change the same data
- Problem:
  - Unpredictable result
Race conditions
Example: 2 threads incrementing a variable by 1. If we start from 1, the expected result would be 3.
Input: A = 1
Thread 1: read A (1) into a register; increment the register; write the register (2) into A in memory
Thread 2: read A (1) into a register; increment the register; write the register (2) into A in memory
Output: A = 2
Priority inversion
- A thread with a higher priority waits for a resource held by a thread with a lower priority
- Problem:
  - The high-priority thread is executed less often than a lower-priority thread
Priority inversion
Example: 2 threads accessing the same file
Thread 1 (low priority): lock a file for usage; write some data into it; do some more work with the file; release the file
Thread 2 (high priority): wait for the file to be available; use the file
Out of 3 switches: low priority gets 2, high priority gets 1
Deadlock
- 2 or more actions depend on each other for completion, and as a result none finishes
- Problem:
  - One or more threads stop working for an indefinite amount of time
Deadlock conditions
1. Mutual exclusion locking of resources
2. Resources are held while others are waited for
3. Pre-emption while holding resources is permitted
4. A circular wait condition exists
Deadlock
Example: 2 threads accessing the same resources
Thread 1: lock resource A; wait for resource B to be available
Thread 2: lock resource B; wait for resource A to be available
Both threads are now stopped, with no way to wake up
Livelock
- Same as deadlock, except that the deadlock detection/prevention wakes up the threads without them making progress
- Problem:
  - One or more threads do not progress; they spin
- Analogy: 2 people travelling in opposite directions; each is polite and moves aside to make space, but neither can pass as they both keep moving from side to side
Boxcar / lock convoys / thundering herd
- Can carry a serious performance penalty
- The application still works correctly
- A certain flag wakes up many threads, however only the first one has work to do
- Problem:
  - Threads wake up, wait on a resource, and then have no work to do
Boxcar / lock convoys / thundering herd
Example: 2 threads wake up to use the same resource when a flag is signalled
Thread 1: sleep waiting for the event; lock data; use data; unlock data; go back to sleep
Thread 2: sleep waiting for the event; wait for the data lock to be available; lock data; nothing to do; unlock data; go back to sleep
2. Multithreading patterns
In this section
- What is a design pattern?
- Groups of patterns (control-flow patterns, data patterns, resource patterns, exception/error patterns)
- Multithreading pattern sources
What is a design pattern?
- A design pattern is a reusable solution to a recurring problem in the context of object-oriented development
- Patterns can be about other topics as well
Groups of patterns
- Control-flow: aspects related to control and flow dependencies between various threads (e.g. parallelism, choice, synchronization)
- Data perspective: passing of information, scoping of variables, etc.
- Resource perspective: resource-to-thread allocation, delegation, etc.
- Exception handling: the various causes of exceptions and the actions that need to be taken when exceptions occur
Control-flow patterns
Worker threads
- Sometimes referred to as Active Object, Cyclic Executive or Concurrency Pattern
- Generic threads doing some work without being aware of what kind of work they do
- They share a common work queue
- Very useful in highly parallel systems
Worker threads
- Windows Vista/Server 2008 have API support for creating thread pools (CreateThreadpool)
- Use a semaphore to limit the number of active threads to a number proportional to the CPU count (usually 2 x CPUs)
Worker threads - variants
- Background Worker Pattern: notifies when the thread completes, and provides updates on the status of the operation
  - May need to support cancelling the operation
- Asynchronous Results Pattern: you are more interested in the result than in the actual status of the operation
Worker threads - termination
- Implicit termination
  - the worker has finished its work and can end
- Explicit termination
  - the worker is asked to terminate
Scheduler
- Explicitly controls when threads may execute single-threaded code (sequences waiting threads)
- An independent mechanism to implement a scheduling policy
- A read/write lock is usually implemented using the scheduler pattern to ensure fairness in scheduling
- Adds significant overhead
Thread pool
- A number of threads are created to perform a number of tasks, usually organized in a queue
- There are many more tasks than threads
- When a thread completes its task:
  - if there are more tasks -> it requests the next task from the queue
  - if there are no more tasks -> it terminates, or sleeps
- The number of threads used is a parameter that can be tuned, and can be dynamic based on the number of waiting tasks
Thread pool
- The thread creation/destruction algorithm impacts overall performance:
  - create too many threads: resources and time are wasted
  - destroy too many threads: time is spent re-creating them
  - create threads too slowly: poor client performance
  - destroy threads too slowly: other processes are starved of resources
- Amortizes thread creation and destruction overhead
- Better performance and better system stability
Thread pool - triggers
- Transient trigger
  - offers the capability to signal currently running threads
  - lost if not acted upon right away
- Persistent trigger
  - generally results in pool actions
  - persisted, and eventually handled
Message queuing
- Asynchronous communication, implemented via queued messages
- Simple, without mutual exclusion problems
- No resource is shared by reference
- The shared information is passed by value
Interrupt
- Occurs when the event of interest occurs
- Executes very quickly and with little overhead
- Provides a means for timely response to urgent needs
- There are circumstances when its use can lead to system failure
- On Windows: asynchronous procedure calls (APC)
Guarded Call
- Used when it may not be possible to wait for an asynchronous rendezvous
- Calling a method of the appropriate object in the other thread can lead to mutual exclusion problems if the called object is currently busy doing something else
- The Guarded Call pattern handles this case through the use of a mutual exclusion semaphore
Rendezvous
- Concerned with modelling the preconditions for synchronization or rendezvous of threads
- Each ready thread registers with the Rendezvous class, then blocks until the Rendezvous class releases it to run
- Builds a collaboration structure that allows an arbitrary set of preconditions to be met for thread synchronization
- Independent of task phasings, scheduling policies, and priorities
Data patterns
Thread-Specific Storage
- Also called thread-local storage (TLS)
- TLS is allocated per thread; any function in that thread will get the same value
- Similar to global storage, but unlike global storage, functions in another thread will not get the same value
- Thread-specific storage sometimes refers to the private virtual address space of a running task
Static Allocation
- Dynamic memory problems:
  - nondeterministic timing of memory allocation and de-allocation
  - memory fragmentation
- A simple approach to solving both these problems: disallow dynamic memory allocation
- Only used in simple systems with highly predictable and consistent loads
- All objects are allocated during system initialization (the system takes longer to initialize, but it operates well during execution)
Pool Allocation
- Involves creating pools of objects at start-up
- Doesn't address needs for dynamic memory
- The pools are not necessarily initialized at start-up
- The pooled objects are available upon request
Fixed Sized Buffer
- Memory fragmentation occurs when:
  - the order of allocation is independent of the release order
  - memory is allocated in various sizes from the heap
- Used when we cannot tolerate dynamic allocation problems such as fragmentation
- Fragmentation-free dynamic memory allocation, at the cost of less optimal memory usage
- Similar to dynamic allocation, but only allows fixed pre-defined sizes to be allocated
Garbage Collection
- Solves memory leaks and dangling pointers
- Does not address memory fragmentation
- Takes the programmer out of the loop
- Adds run-time overhead
- Reduces execution predictability
Garbage Compactor
- Removes memory fragmentation
- Maintains two memory segments in the heap
- Moves live objects from one segment to the other
- The free memory in one of the segments is then a contiguous block
Resource patterns
Locked structures
- Structures that use a locking mechanism
- Easy to implement, easy to debug
- Can deadlock
- Do not scale well
Lock-free structures
- They do not need to lock
- They need hardware support (e.g. compare-and-swap instructions)
- They can burn CPU
- Hard to implement and debug
Wait-free structures
- Same as lock-free structures, but with a guarantee that they finish in a bounded number of steps
- All wait-free structures are lock-free
- Very difficult to implement
- Very few real-life applications
Single writer / multiple readers
- A special kind of lock that allows multiple threads read access to the data, but only a single writer (exclusive write access)
- Problems when promoting from read to write (reader starvation, writer starvation) - see the Scheduler pattern
Double-checked locking
- Also known as the "Double-Checked Locking Optimization"
- Reduces the overhead of acquiring a lock
- Used for implementing "lazy initialization" in a multi-threaded environment:
  - if the check failed, then
    - lock
    - if the check failed again, then
      - initialize
    - unlock
Shared Memory
- A common memory area addressable by multiple processors
- Almost always involves a combined hardware/software solution
- If the data to be shared is read-only, then concurrency protection mechanisms may not be required
- Used when responses to messages and events are not desired or are too slow
Simultaneous Locking
- Avoids deadlocks
- Works in an all-or-none fashion
- Prevents the condition of holding some resources while requesting others
- Allows higher-priority tasks to run if they don't need any of the locked resources
Ordered Locking
- Eliminates deadlocks
- Orders resources and enforces a policy in which resources must be allocated in that order
- If enforced, no circular wait condition can ever occur
- Resources are explicitly locked and released
- The potential for neglecting to unlock a resource exists
Exception/error patterns
Exceptions/errors
- Causes:
  - work failure
  - deadline expiry
  - resource unavailability
  - external trigger
  - constraint violation
- Handling:
  - continue
  - remove the work item
  - remove all items
- Recovery:
  - no action
  - rollback
  - compensate
Balking
- Executes an action on an object only when the object is in a particular state
- An attempt to use the object out of its legal state results in an "illegal state" exception
Triple Modular Redundancy
- Used when there is no fail-safe state
- Based on an odd number of channels operating in parallel
- The computational results or resulting actuation signals are compared, and if there is a disagreement, then a two-out-of-three majority wins
- The deviating computation of the third channel is discarded
Watchdog
- Lightweight and inexpensive
- Minimal coverage
- Watches over the processing of another component
- Usually checks against a computation time base
- Or ensures that computation steps are proceeding in a predefined order
Multithreading pattern sources
- http://www.workflowpatterns.com
- "Real-Time Design Patterns: Robust Scalable Architecture for Real-Time Systems" by Bruce Powel Douglass
Questions?
Big thank you!