Multiprocessors, Threads and Microkernels

1
Multiprocessors, Threads and Microkernels

Fred Kuhns

2
Motivation for Multiprocessors

Enhanced Performance -
Concurrent execution of tasks for increased
throughput (between processes)
Exploit Concurrency in Tasks (Parallelism within
process)
Fault Tolerance -
graceful degradation in the face of failures

3
Basic MP Architectures

Single Instruction Single Data (SISD)
conventional uniprocessor designs.
Single Instruction Multiple Data (SIMD)
Vector and Array Processors
Multiple Instruction Single Data (MISD)
Not Implemented.
Multiple Instruction Multiple Data (MIMD)
conventional MP designs

4
MIMD Classifications

Tightly Coupled System - all processors share the
same global memory and the same address
space (typical SMP system).
Main memory for IPC and Synchronization.
Loosely Coupled System - memory is partitioned
and attached to each processor. Hypercube,
Clusters (Multi-Computer).
Message passing for IPC and synchronization.

5
MP Block Diagram

6
Memory Access Schemes

Uniform Memory Access (UMA)
Centrally located
All processors are equidistant (access times)
Non-Uniform Memory Access (NUMA)
physically partitioned but accessible by all processors
processors share the same address space
NO Remote Memory Access (NORMA)
physically partitioned, not accessible by all processors
processors have their own address spaces

7
Other Details of MP

Interconnection technology
Bus
Cross-Bar switch
Multistage Interconnect Network
Caching - Cache Coherence Problem!
Write-update
Write-invalidate
bus snooping
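
The two coherence policies can be contrasted with a toy model. This is a sketch only: the `Bus` and `Cache` classes are hypothetical names, memory is treated as write-through for simplicity, and real protocols track more per-line state than this. Each cache snoops writes broadcast on the shared bus and either drops its stale copy (write-invalidate) or refreshes it in place (write-update).

```python
class Bus:
    """Shared bus: every cache snoops every write broadcast on it."""

    def __init__(self, policy):
        assert policy in ("invalidate", "update")
        self.policy = policy
        self.memory = {}   # address -> value (main memory)
        self.caches = []

    def broadcast_write(self, writer, addr, value):
        self.memory[addr] = value            # write-through (simplifying assumption)
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(addr, value, self.policy)


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # address -> cached value
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch from main memory
            self.misses += 1
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def snoop(self, addr, value, policy):
        if addr not in self.lines:
            return                           # not cached here, nothing to do
        if policy == "invalidate":
            del self.lines[addr]             # drop stale copy; next read misses
        else:
            self.lines[addr] = value         # refresh the copy in place


def run(policy):
    bus = Bus(policy)
    c0, c1 = Cache(bus), Cache(bus)
    c1.read(0x10)          # c1 caches address 0x10 (first miss)
    c0.write(0x10, 42)     # c0 writes; c1 snoops the bus
    value = c1.read(0x10)  # c1 must not see the stale value
    return value, c1.misses


print(run("invalidate"))   # (42, 2): the second read misses and refetches
print(run("update"))       # (42, 1): the copy was refreshed, so the read hits
```

Both policies keep the reader from seeing stale data; they differ in whether the cost is paid as an extra miss later or as extra bus traffic at write time.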

8
MP OS Structure - 1

Separate Supervisor -
all processors have own copy of the kernel.
some data is shared for interaction
dedicated I/O devices and file systems
good fault tolerance but bad for concurrency
Master/Slave Configuration
Master monitors status and assigns work
Slaves form a schedulable pool of resources
master can be bottleneck
poor fault tolerance

9
MP OS Structure - 2

Symmetric Configuration - Most Flexible.
all processors are autonomous, treated equally
one copy of the kernel executed concurrently
across all processors
Synchronized access to shared data structures
Lock entire OS - Floating Master
Mitigated by dividing OS into segments that
normally have little interaction
multithread kernel and control access to
resources (continuum)

10
MP Overview

MultiProcessor
  SIMD
  MIMD
    Shared Memory (tightly coupled)
      Master/Slave
      Symmetric (SMP)
    Distributed Memory (loosely coupled)
      Clusters

11
SMP OS Design Issues

Threads - effectiveness of parallelism depends on
performance of primitives used to express and
control concurrency.
Process Synchronization - disabling interrupts is
not sufficient.
Process Scheduling - efficient, policy-controlled
task scheduling. Issues:
Global versus Local (per CPU)
Task affinity for a particular CPU
resource accounting
inter-thread dependencies

12
SMP OS design issues - cont.

Memory Management - complication of shared main
memory.
cache coherence
memory access synchronization
balancing overhead with increased concurrency
Reliability and Fault Tolerance - degrade
gracefully in the event of failures

13
Typical SMP System

[Diagram: four CPUs, each with its own cache and MMU, attached to a shared System/Memory Bus. A Bridge connects the bus to the I/O subsystem (ether, scsi, video), INT, and System Functions (timer, BIOS, reset). Timing labels on the diagram: 500MHz, 50ns.]

Issues
Memory contention
Limited bus BW
I/O contention
Cache coherence

Typical I/O Bus
33MHz/32bit (132MB/s)
66MHz/64bit (528MB/s)
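
The I/O bus figures on this slide follow from clock rate times bus width. A small worked check, assuming one full-width transfer per clock cycle and 1 MB = 10^6 bytes (the function name is hypothetical):

```python
def peak_bandwidth_mb_per_s(clock_mhz, width_bits):
    """Peak transfer rate, assuming one transfer of width_bits per clock cycle."""
    bytes_per_transfer = width_bits // 8
    return clock_mhz * bytes_per_transfer


print(peak_bandwidth_mb_per_s(33, 32))   # 132
print(peak_bandwidth_mb_per_s(66, 64))   # 528
```

These are peak rates; contention on the bus reduces what any one device actually sees.
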
14
Some Useful Definitions

Parallelism - degree to which a multiprocessor
application achieves parallel execution
Concurrency - maximum parallelism an application
can achieve with unlimited processors
System Concurrency - kernel recognizes multiple
threads of control in a program
User Concurrency - user space threads (coroutines)
provide a natural programming model for
concurrent applications.

15
Introduction to Threads

[Diagram: Single-Threaded Process Model (Process Control Block, User Address Space, User Stack, Kernel Stack) compared with Multithreaded Process Model (one Process Control Block and User Address Space shared by three threads, each with its own Thread Control Block, User Stack and Kernel Stack).]

16
Process Concept Embodies

Unit of Resource ownership - process is allocated
a virtual address space to hold the process image
Unit of Dispatching - process is an execution
path through one or more programs
execution may be interleaved with other processes
These two characteristics are treated
independently by the operating system

17
Threads

Effectiveness of parallel computing depends on
the performance of the primitives used to express
and control parallelism
Separate notion of execution from Process
abstraction
Useful for expressing the intrinsic concurrency
of a program regardless of resulting performance
We will discuss three examples of threading
User threads,
Kernel threads and
Scheduler Activations

18
Threads cont.

Thread - dynamic object representing an execution
path and computational state.
One or more threads per process, each having
Execution state (running, ready, etc.)
Saved thread context when not running
Execution stack
Per-thread static storage for local variables
Shared access to process resources
all threads of a process share a common address
space.
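
The split between per-thread storage and shared process resources can be sketched with Python's standard `threading` module. The names `shared`, `per_thread`, `worker` and `run` are illustrative only. Every thread appends to the same list, while attributes on a `threading.local()` object are private to the thread that set them.

```python
import threading

shared = []                      # one object, visible to every thread in the process
per_thread = threading.local()   # each thread sees only its own attributes
lock = threading.Lock()


def worker(name, results):
    per_thread.name = name       # private to this thread
    with lock:
        shared.append(name)      # visible to all threads
        results[name] = per_thread.name


def run():
    shared.clear()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(f"t{i}", results))
        for i in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The calling thread never set per_thread.name, so it has no such attribute.
    caller_sees_private = hasattr(per_thread, "name")
    return sorted(shared), results, caller_sees_private
```

Calling `run()` from a thread that has not itself set `per_thread.name` returns all three names in the shared list and `False` for the last element, since the workers' private values are invisible to the caller.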

19
Thread States

Primary states
Running, Ready and Blocked.
Operations to change state
Spawn - new thread is provided a register context and
stack pointer.
Block - wait for an event; save user registers, PC and
stack pointer.
Unblock - thread is moved to the ready state.
Finish - deallocate register context and stacks.
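
The states and operations above can be written down as a small state machine. This is a minimal sketch: the `Thread` class and `TRANSITIONS` table are hypothetical, and the `dispatch` and `preempt` operations plus the `FINISHED` state are assumptions added so the model is complete; the slide itself names only Spawn, Block, Unblock and Finish.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"
    FINISHED = "finished"   # assumption: terminal state after Finish


# operation -> (state required before, state after)
TRANSITIONS = {
    "dispatch": (State.READY, State.RUNNING),    # assumption, not on the slide
    "preempt": (State.RUNNING, State.READY),     # assumption, not on the slide
    "block": (State.RUNNING, State.BLOCKED),
    "unblock": (State.BLOCKED, State.READY),
    "finish": (State.RUNNING, State.FINISHED),
}


class Thread:
    def __init__(self, name):
        self.name = name
        self.state = State.READY       # spawn: a new thread starts out ready
        self.saved_context = None

    def apply(self, op, context=None):
        before, after = TRANSITIONS[op]
        if self.state is not before:
            raise ValueError(f"cannot {op} a thread in state {self.state.name}")
        if op == "block":
            self.saved_context = context   # save registers, PC and stack pointer
        elif op == "finish":
            self.saved_context = None      # deallocate the saved context
        self.state = after
        return self.state


t = Thread("worker")
t.apply("dispatch")
t.apply("block", context={"pc": 100, "sp": 4096})
t.apply("unblock")
print(t.state)   # State.READY
```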

20
User Level Threads

User level threads - supported by user level
threads libraries
Examples
POSIX Pthreads, Mach C-threads, Solaris threads
Benefits
no modifications required to kernel
flexible and low cost
Drawbacks
cannot block without blocking entire process
no parallelism (not recognized by kernel)
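
A user-level threads library can be approximated with generators: the scheduler below lives entirely in user space and the kernel sees a single thread of control. This is a sketch, not how any particular library is implemented; `worker` and `run_all` are made-up names.

```python
from collections import deque


def worker(name, steps, log):
    """A 'user thread': does a unit of work, then yields to the scheduler."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                      # voluntarily hand control back


def run_all(threads):
    """Round-robin scheduler implemented without any kernel support."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)           # resume the thread until its next yield
            ready.append(thread)   # still runnable: back of the queue
        except StopIteration:
            pass                   # thread finished


log = []
run_all([worker("A", 2, log), worker("B", 3, log)])
print(log)   # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Both drawbacks listed above show up here: a worker that made a blocking call would stall the whole scheduler, and only one worker runs at a time no matter how many CPUs are available.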

21
Kernel Level Threads

Kernel level threads - directly supported by
kernel, thread is the basic scheduling entity
Examples
Windows 95/98/NT/2000, Solaris, Tru64 UNIX, BeOS,
Linux
Benefits
coordination between scheduling and
synchronization
less overhead than a process
suitable for parallel application
Drawbacks
more expensive than user-level threads
generality leads to greater overhead

22
Scheduler Activations

Attempt to combine benefits of both user and
kernel threading support
blocking system call should not block whole
process
user space library should make scheduling
decisions
efficiency by avoiding unnecessary user/kernel
mode switches.
Kernel assigns a set of virtual processors to
each process. User library then schedules
threads on these virtual processors.

23
Scheduler Activations

An activation
execution context for running thread
Kernel passes new activation to library when
upcall is performed.
Library schedules user threads on activations.
space for kernel to save processor context of
current user thread when stopped by kernel
upcall performed when one of the following occurs
user thread performs blocking system call
blocked thread belonging to process becomes runnable
again; its library is then notified, allowing it to either
schedule a new thread or resume the preempted
thread.

24
Pthreads

a POSIX standard (IEEE 1003.1c) API for thread
creation and synchronization.
API specifies behavior of the thread library,
implementation is up to the developers of the
library.
Common in UNIX operating systems.

25
UNIX Support for Threading

BSD
process model only. 4.4 BSD enhancements.
Solaris
user threads, kernel threads, LWPs and in 2.6
Scheduler Activations
Mach
kernel threads and tasks. Thread libraries
provide semantics of user threads, LWPs and
kernel threads.
Digital UNIX - extends MACH to provide usual UNIX
semantics.
Pthreads library.

26
Solaris Threads

Supports
user threads (uthreads) via libthread and
libpthread
LWPs, abstraction that acts as a virtual CPU for
user threads.
LWP is bound to a kthread.
kernel threads (kthread), every LWP is associated
with one kthread, however a kthread may not have
an LWP
interrupts as threads

27
Solaris kthreads

Fundamental scheduling/dispatching object
all kthreads share same virtual address space
(the kernel's) - cheap context switch
System threads - example STREAMS, callout
kthread_t, /usr/include/sys/thread.h
scheduling info, pointers for scheduler or sleep
queues, pointer to klwp_t and proc_t

28
Solaris LWP

Kernel provided mechanism to allow for both user
and kernel thread implementation on one platform.
Bound to a kthread
LWP data (see /usr/include/sys/klwp.h)
user-level registers, system call params,
resource usage, pointer to kthread_t and proc_t
All LWPs in a process share
signal handlers
Each may have its own
signal mask
alternate stack for signal handling
No global name space for LWPs

29
Solaris User Threads

Implemented in user libraries
library provides synchronization and scheduling
facilities
threads may be bound to LWPs
unbound threads compete for available LWPs
Manage thread specific info
thread id, saved register state, user stack,
signal mask, priority, thread local storage
Solaris provides two libraries: libthread and
libpthread.
Try man thread or man pthreads

30
Solaris Thread Data Structures

[Diagram: proc_t (field p_tlist), kthread_t (fields t_procp, t_lwp, t_forw) and klwp_t (fields lwp_thread, lwp_procp), linked to one another through these pointers.]

31
Solaris Threading Model (Combined)

[Diagram: Process 1 and Process 2 drawn across the user, kernel and hardware layers, including an interrupt kthread (Int kthr).]

32
Solaris User Level Threads

[State diagram. States: Runnable, Active, Sleeping, Stopped. Transitions: Dispatch, Preempt, Sleep, Wakeup, Stop, Continue.]

33
Solaris Lightweight Processes

[State diagram. States: Runnable, Running, Blocked, Stopped. Transitions: Dispatch, Timeslice or Preempt, Blocking System Call, Wakeup, Stop, Continue.]

34
Solaris Interrupts

One system wide clock kthread
pool of 9 partially initialized kthreads per CPU
for interrupts
interrupt thread can block
interrupted thread is pinned to the CPU

35
Solaris Signals and Fork

Signals divided into traps (synchronous) and interrupts
(asynchronous)
each thread has its own signal mask, global set
of signal handlers
Each LWP can specify alternate stack
fork replicates all LWPs
fork1 replicates only the invoking LWP/thread

36
Mach

Two abstractions
Task - static object, address space and system
resources called port rights.
Thread - fundamental execution unit and runs in
context of a task.
Zero or more threads per task,
kernel schedulable
kernel stack
computational state
Processor sets - available processors divided
into non-intersecting sets.
permits dedicating processor sets to tasks

37
Mach c-thread Implementations

Coroutine-based - multiplexes multiple user threads onto a
single-threaded task
Thread-based - one-to-one mapping from c-threads
to Mach threads. Default.
Task-based - One Mach Task per c-thread.

38
Digital UNIX

Based on Mach 2.5 kernel
Provides complete UNIX programmer's interface
4.3BSD code and ULTRIX code ported to Mach
u-area replaced by utask and uthread
proc structure retained
Threads
Signals divided into synchronous and asynchronous
global signal mask
each thread can define its own handlers for
synchronous signals
global handlers for asynchronous signals

39
Windows 2000 Threads

Implements the one-to-one mapping.
Each thread contains
- a thread id
- register set
- separate user and kernel stacks
- private data storage area

40
Linux Threads

Linux refers to them as tasks rather than
threads.
Thread creation is done through clone() system
call.
clone() allows a child task to share the address
space of the parent task (process)

41
4.4 BSD UNIX

Initial support for threads implemented but not
enabled in distribution
Proc structure and u-area reorganized
All threads have a unique ID
How are the proc and u areas reorganized to
support threads?

42
Microkernel

Transition to Microkernel discussion

43
Microkernel

Small operating system core
Contains only essential operating system
functions
Many services traditionally included in the
operating system are now external subsystems
device drivers
file systems
virtual memory manager
windowing system and security services

44
Microkernel Benefits

Portability
isolate port specific code to microkernel
Reliability
modular design, small microkernel, simpler
validation
Uniform interface
all services are provided by means of message
passing
Extensibility
allows the addition of new services
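
The uniform message-passing interface can be illustrated with a toy service. In this sketch a `queue.Queue` stands in for the kernel's IPC channel and a thread stands in for an external subsystem; `file_server`, `call` and `demo` are hypothetical names, and a real microkernel would deliver messages across address spaces rather than between threads.

```python
import queue
import threading


def file_server(inbox):
    """A user-level 'service': handles request messages until told to stop."""
    files = {"motd": "hello"}
    while True:
        msg = inbox.get()
        if msg["op"] == "stop":
            break
        if msg["op"] == "read":
            reply = files.get(msg["name"])   # None if the file is unknown
        else:
            reply = None                     # unsupported operation
        msg["reply_to"].put(reply)


def call(inbox, op, **fields):
    """Client side: send a request message and wait for the reply message."""
    reply_box = queue.Queue()
    inbox.put({"op": op, "reply_to": reply_box, **fields})
    return reply_box.get(timeout=5)


def demo():
    inbox = queue.Queue()
    server = threading.Thread(target=file_server, args=(inbox,))
    server.start()
    found = call(inbox, "read", name="motd")
    missing = call(inbox, "read", name="nope")
    inbox.put({"op": "stop"})
    server.join()
    return found, missing


print(demo())   # ('hello', None)
```

Because the client only ever sends and receives messages, a new service can be added by starting another server with its own inbox, without changing the client-side `call` helper.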

45
Microkernel Benefits

Flexibility
existing features can be subtracted
Distributed system support
messages are sent without knowing what the target
machine is or where it is located
Object-oriented operating system
components are objects with clearly defined
interfaces that can be interconnected to form
software