Multiprocessors, Threads and Microkernels

1
Multiprocessors, Threads and Microkernels
  • Fred Kuhns

2
Motivation for Multiprocessors
  • Enhanced Performance
    • Concurrent execution of tasks for increased
      throughput (between processes)
    • Exploit concurrency in tasks (parallelism within
      a process)
  • Fault Tolerance
    • Graceful degradation in the face of failures

3
Basic MP Architectures
  • Single Instruction Single Data (SISD)
    • conventional uniprocessor designs
  • Single Instruction Multiple Data (SIMD)
    • vector and array processors
  • Multiple Instruction Single Data (MISD)
    • not implemented
  • Multiple Instruction Multiple Data (MIMD)
    • conventional MP designs
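The four classes differ only in how many instruction streams and data streams are active at once. A minimal sketch of that classification follows; the function name `flynn_class` is purely illustrative and not part of the original slides.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the four-letter class for the given stream counts (each >= 1)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"   # single vs. multiple instruction
    data = "S" if data_streams == 1 else "M"           # single vs. multiple data
    return f"{instr}I{data}D"


uniprocessor = flynn_class(1, 1)      # one instruction stream, one data stream
vector_unit = flynn_class(1, 8)       # one instruction applied to many data items
multiprocessor = flynn_class(4, 4)    # independent CPUs, independent data
```

A conventional multiprocessor, with each CPU running its own instruction stream on its own data, falls in the MIMD class.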

4
MIMD Classifications
  • Tightly Coupled System - all processors share the
    same global memory and the same address space
    (typical SMP system).
    • Main memory is used for IPC and synchronization.
  • Loosely Coupled System - memory is partitioned
    and attached to each processor: hypercubes,
    clusters (multi-computers).
    • Message passing is used for IPC and synchronization.

5
MP Block Diagram
6
Memory Access Schemes
  • Uniform Memory Access (UMA)
    • memory is centrally located
    • all processors are equidistant (equal access times)
  • Non-Uniform Memory Access (NUMA)
    • memory is physically partitioned but accessible by all
    • processors share the same address space
  • No Remote Memory Access (NORMA)
    • memory is physically partitioned, not accessible by all
    • each processor has its own address space

7
Other Details of MP
  • Interconnection technology
  • Bus
  • Cross-Bar switch
  • Multistage Interconnect Network
  • Caching - Cache Coherence Problem!
  • Write-update
  • Write-invalidate
  • bus snooping
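To make the cache coherence problem concrete, here is a toy model of write-invalidate with bus snooping. It is a sketch under stated assumptions, not a description of any real protocol: the classes `Bus` and `Cache` are hypothetical, writes are assumed to be write-through, and cache lines hold a single value.

```python
class Bus:
    """Shared bus: every write is broadcast so the other caches can snoop it."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def write(self, writer, addr, value):
        self.memory[addr] = value              # assumption: write-through to memory
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_invalidate(addr)   # other caches drop their copy


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                        # address -> cached value
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:             # miss: fetch from main memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)

    def snoop_invalidate(self, addr):
        self.lines.pop(addr, None)             # discard the now-stale copy


bus = Bus()
cpu0, cpu1 = Cache(bus), Cache(bus)
cpu0.write(0x10, 1)
first = cpu1.read(0x10)     # cpu1 now caches the value written by cpu0
cpu0.write(0x10, 2)         # the snooped write invalidates cpu1's copy
second = cpu1.read(0x10)    # miss, so cpu1 refetches the new value
```

Without the invalidation step, the second read on `cpu1` would return the stale cached value. A write-update scheme would instead push the new value into the other caches rather than removing their copies.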

8
MP OS Structure - 1
  • Separate Supervisor -
  • all processors have own copy of the kernel.
  • Some share data for interaction
  • dedicated I/O devices and file systems
  • good fault tolerance but bad for concurrency
  • Master/Slave Configuration
  • Master monitors status and assigns work
  • Slaves form a schedulable pool of resources
  • master can be bottleneck
  • poor fault tolerance

9
MP OS Structure - 2
  • Symmetric Configuration - Most Flexible.
  • all processors are autonomous, treated equal
  • one copy of the kernel executed concurrently
    across all processors
  • Synchronized access to shared data structures
  • Lock entire OS - Floating Master
  • Mitigated by dividing OS into segments that
    normally have little interaction
  • multithread kernel and control access to
    resources (continuum)

10
MP Overview
[Figure: taxonomy tree with the nodes MultiProcessor, SIMD, MIMD, Shared Memory (tightly coupled), Distributed Memory (loosely coupled), Symmetric (SMP), Clusters and Master/Slave.]

11
SMP OS Design Issues
  • Threads - effectiveness of parallelism depends on
    performance of primitives used to express and
    control concurrency.
  • Process Synchronization - disabling interrupts is
    not sufficient.
  • Process Scheduling - efficient, policy
    controlled task scheduling. Issues:
  • Global versus Local (per CPU)
  • Task affinity for a particular CPU
  • resource accounting
  • inter-thread dependencies
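Disabling interrupts only stops preemption on the local CPU; code on another processor can still touch the same data, so shared structures need an explicit lock. The sketch below shows the locking pattern with Python threads. All names are illustrative, and under CPython the global interpreter lock prevents true parallel execution, so treat it as a picture of the pattern rather than a multiprocessor benchmark.

```python
import threading

counter = 0
counter_lock = threading.Lock()   # protects `counter` against concurrent updates


def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        with counter_lock:        # only one thread at a time may update
            counter += 1


threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held around each read-modify-write, the final count equals the total number of increments regardless of how the threads interleave.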

12
SMP OS design issues - cont.
  • Memory Management - complication of shared main
    memory.
  • cache coherence
  • memory access synchronization
  • balancing overhead with increased concurrency
  • Reliability and fault Tolerance - degrade
    gracefully in the event of failures

13
Typical SMP System
[Figure: block diagram of a typical SMP system. Labels: four CPUs (500MHz), a cache and MMU per CPU, System/Memory Bus, 50ns, Bridge, I/O subsystem, INT, ether, scsi, video, System Functions (timer, BIOS, reset).]
  • Issues
    • Memory contention
    • Limited bus BW
    • I/O contention
    • Cache coherence
  • Typical I/O Bus
    • 33MHz/32bit (132MB/s)
    • 66MHz/64bit (528MB/s)

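The I/O bus figures on this slide follow from clock rate times bus width. A small worked example, assuming one transfer per clock cycle and 8 bits per byte (the helper name is hypothetical):

```python
def peak_bus_bandwidth_mb_s(clock_mhz: float, width_bits: int) -> float:
    """Peak transfer rate in MB/s, assuming one transfer per clock cycle."""
    return clock_mhz * width_bits / 8


narrow_bus = peak_bus_bandwidth_mb_s(33, 32)   # 33 MHz x 4 bytes per transfer
wide_bus = peak_bus_bandwidth_mb_s(66, 64)     # 66 MHz x 8 bytes per transfer
```

Doubling both the clock and the width quadruples the peak rate, which is why the second figure is four times the first. These are upper bounds; arbitration and contention reduce what a device actually sees.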
14
Some Useful Definitions
  • Parallelism: the degree to which a multiprocessor
    application achieves parallel execution.
  • Concurrency: the maximum parallelism an application
    can achieve with unlimited processors.
  • System Concurrency: the kernel recognizes multiple
    threads of control in a program.
  • User Concurrency: user space threads (coroutines)
    provide a natural programming model for
    concurrent applications.

15
Introduction to Threads
[Figure: Single-Threaded Process Model vs. Multithreaded Process Model. Single-threaded: one Process Control Block, User Address Space, User Stack and Kernel Stack. Multithreaded: one Process Control Block and User Address Space shared by three threads, each with its own Thread Control Block, User Stack and Kernel Stack.]

16
Process Concept Embodies
  • Unit of Resource ownership - process is allocated
    a virtual address space to hold the process image
  • Unit of Dispatching - process is an execution
    path through one or more programs
  • execution may be interleaved with other processes
  • These two characteristics are treated
    independently by the operating system

17
Threads
  • Effectiveness of parallel computing depends on
    the performance of the primitives used to express
    and control parallelism
  • Separate notion of execution from Process
    abstraction
  • Useful for expressing the intrinsic concurrency
    of a program regardless of resulting performance
  • We will discuss three examples of threading
  • User threads,
  • Kernel threads and
  • Scheduler Activations

18
Threads cont.
  • Thread: dynamic object representing an execution
    path and computational state.
  • One or more threads per process, each having:
    • Execution state (running, ready, etc.)
    • Saved thread context when not running
    • Execution stack
    • Per-thread static storage for local variables
    • Shared access to process resources
  • All threads of a process share a common address
    space.
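The split between shared process resources and per-thread storage can be seen in a short sketch. It uses Python's `threading.local` as a stand-in for per-thread storage; the variable names are illustrative only.

```python
import threading

shared_log = []                    # lives in the common address space
log_lock = threading.Lock()
per_thread = threading.local()     # each thread sees its own attributes


def worker(name: str) -> None:
    per_thread.name = name         # private to the calling thread
    with log_lock:
        shared_log.append(per_thread.name)   # visible to every thread


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set `name`, so it does not see the workers' values.
main_sees_private = hasattr(per_thread, "name")
```

Every worker's entry ends up in the shared list, while the per-thread attribute stays invisible outside the thread that set it.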

19
Thread States
  • Primary states:
    • Running, Ready and Blocked.
  • Operations to change state:
    • Spawn: new thread is provided with a register
      context and stack pointer.
    • Block: wait for an event; save user registers, PC
      and stack pointer.
    • Unblock: thread is moved to the ready state.
    • Finish: deallocate register context and stacks.
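The states and operations can be written down as a small transition table. This is a sketch only: the `dispatch` and `preempt` operations and the `FINISHED` state are assumptions added to make the machine complete, and the class name `SimThread` is hypothetical.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"
    FINISHED = "finished"          # assumption: terminal state after Finish


# (current state, operation) -> next state
TRANSITIONS = {
    (State.READY, "dispatch"): State.RUNNING,    # assumption: scheduler picks it
    (State.RUNNING, "preempt"): State.READY,     # assumption: time slice ends
    (State.RUNNING, "block"): State.BLOCKED,     # wait for an event
    (State.BLOCKED, "unblock"): State.READY,     # event occurred
    (State.RUNNING, "finish"): State.FINISHED,   # context and stacks released
}


class SimThread:
    def __init__(self):
        self.state = State.READY   # Spawn: a new thread starts out ready

    def apply(self, operation: str) -> State:
        key = (self.state, operation)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {operation} while {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state


t = SimThread()
history = [t.apply(op).value for op in ("dispatch", "block", "unblock")]
```

Note that Unblock leads to Ready rather than Running: the thread still has to be dispatched before it executes again.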

20
User Level Threads
  • User level threads - supported by user level
    threads libraries
  • Examples
  • POSIX Pthreads, Mach C-threads, Solaris threads
  • Benefits
  • no modifications required to kernel
  • flexible and low cost
  • Drawbacks
  • cannot block without blocking entire process
  • no parallelism (not recognized by kernel)
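A user-level thread library can be approximated with coroutines and a scheduler that lives entirely in user space. The sketch below uses Python generators; `user_thread` and `run` are hypothetical names, and the round-robin policy is an assumption chosen for simplicity.

```python
from collections import deque


def user_thread(name, steps, trace):
    for i in range(steps):
        trace.append(f"{name}:{i}")
        yield                      # voluntarily hand control to the scheduler


def run(threads):
    """Round-robin scheduler implemented without any kernel involvement."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)           # resume the thread until its next yield
            ready.append(thread)   # still alive: back of the ready queue
        except StopIteration:
            pass                   # thread finished, drop it


trace = []
run([user_thread("A", 2, trace), user_thread("B", 3, trace)])
```

Both drawbacks listed above are visible here: the kernel sees a single flow of control, so only one user thread runs at a time, and a blocking call inside any `user_thread` would stall the scheduler and every other thread with it.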

21
Kernel Level Threads
  • Kernel level threads - directly supported by
    kernel, thread is the basic scheduling entity
  • Examples
  • Windows 95/98/NT/2000, Solaris, Tru64 UNIX, BeOS,
    Linux
  • Benefits
  • coordination between scheduling and
    synchronization
  • less overhead than a process
  • suitable for parallel application
  • Drawbacks
  • more expensive than user-level threads
  • generality leads to greater overhead

22
Scheduler Activations
  • Attempt to combine benefits of both user and
    kernel threading support
  • blocking system call should not block whole
    process
  • user space library should make scheduling
    decisions
  • efficiency by avoiding unnecessary user/kernel
    mode switches.
  • Kernel assigns a set of virtual processors to
    each process. User library then schedules
    threads on these virtual processors.

23
Scheduler Activations
  • An activation provides
    • an execution context for a running thread
    • space for the kernel to save the processor context
      of the current user thread when stopped by the kernel
  • Kernel passes a new activation to the library when
    an upcall is performed.
  • Library schedules user threads on activations.
  • Upcall is performed when one of the following occurs:
    • user thread performs a blocking system call
    • blocked thread belonging to process, then its
      library is notified allowing it to either
      schedule a new thread or resume the preempted
      thread.

24
Pthreads
  • a POSIX standard (IEEE 1003.1c) API for thread
    creation and synchronization.
  • API specifies behavior of the thread library;
    implementation is up to the developers of the
    library.
  • Common in UNIX operating systems.

25
UNIX Support for Threading
  • BSD
  • process model only. 4.4 BSD enhancements.
  • Solaris
  • user threads, kernel threads, LWPs and in 2.6
    Scheduler Activations
  • Mach
  • kernel threads and tasks. Thread libraries
    provide semantics of user threads, LWPs and
    kernel threads.
  • Digital UNIX - extends MACH to provide usual UNIX
    semantics.
  • Pthreads library.

26
Solaris Threads
  • Supports
  • user threads (uthreads) via libthread and
    libpthread
  • LWPs, abstraction that acts as a virtual CPU for
    user threads.
  • LWP is bound to a kthread.
  • kernel threads (kthread), every LWP is associated
    with one kthread, however a kthread may not have
    an LWP
  • interrupts as threads

27
Solaris kthreads
  • Fundamental scheduling/dispatching object
  • all kthreads share the same virtual address space
    (the kernel's) - cheap context switch
  • System threads - example STREAMS, callout
  • kthread_t, /usr/include/sys/thread.h
  • scheduling info, pointers for scheduler or sleep
    queues, pointer to klwp_t and proc_t

28
Solaris LWP
  • Kernel provided mechanism to allow for both user
    and kernel thread implementation on one platform.
  • Bound to a kthread
  • LWP data (see /usr/include/sys/klwp.h)
  • user-level registers, system call params,
    resource usage, pointer to kthread_t and proc_t
  • All LWPs in a process share
  • signal handlers
  • Each may have its own
  • signal mask
  • alternate stack for signal handling
  • No global name space for LWPs

29
Solaris User Threads
  • Implemented in user libraries
  • library provides synchronization and scheduling
    facilities
  • threads may be bound to LWPs
  • unbound threads compete for available LWPs
  • Manage thread specific info
  • thread id, saved register state, user stack,
    signal mask, priority, thread local storage
  • Solaris provides two libraries: libthread and
    libpthread.
  • Try man thread or man pthreads

30
Solaris Thread Data Structures
[Figure: linked data structures. proc_t (field p_tlist), kthread_t (fields t_procp, t_lwp, t_forw) and klwp_t (fields lwp_thread, lwp_procp).]

31
Solaris Threading Model (Combined)
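[Figure: combined threading model showing Process 1 and Process 2 across the user, kernel and hardware layers, including an interrupt kthread (Int kthr).]
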
32
Solaris User Level Threads
[Figure: state diagram. States: Runnable, Active, Sleeping, Stopped. Transitions: Dispatch, Preempt, Sleep, Wakeup, Stop, Continue.]

33
Solaris Lightweight Processes
[Figure: state diagram. States: Running, Runnable, Blocked, Stopped. Transitions: Dispatch, Timeslice or Preempt, Blocking System Call, Wakeup, Stop, Continue.]

34
Solaris Interrupts
  • One system wide clock kthread
  • pool of 9 partially initialized kthreads per CPU
    for interrupts
  • interrupt thread can block
  • interrupted thread is pinned to the CPU

35
Solaris Signals and Fork
  • Divided into Traps (synchronous) and interrupts
    (asynchronous)
  • each thread has its own signal mask, global set
    of signal handlers
  • Each LWP can specify alternate stack
  • fork replicates all LWPs
  • fork1 replicates only the invoking LWP/thread

36
Mach
  • Two abstractions
  • Task - static object, address space and system
    resources called port rights.
  • Thread - fundamental execution unit and runs in
    context of a task.
  • Zero or more threads per task,
  • kernel schedulable
  • kernel stack
  • computational state
  • Processor sets - available processors divided
    into non-intersecting sets.
  • permits dedicating processor sets to tasks

37
Mach c-thread Implementations
  • Coroutine-based - multiplexes multiple user threads
    onto a single-threaded task
  • Thread-based - one-to-one mapping from c-threads
    to Mach threads. Default.
  • Task-based - One Mach Task per c-thread.

38
Digital UNIX
  • Based on Mach 2.5 kernel
  • Provides complete UNIX programmers interface
  • 4.3BSD code and ULTRIX code ported to Mach
  • u-area replaced by utask and uthread
  • proc structure retained
  • Threads
  • Signals divided into synchronous and asynchronous
  • global signal mask
  • each thread can define its own handlers for
    synchronous signals
  • global handlers for asynchronous signals

39
Windows 2000 Threads
  • Implements the one-to-one mapping.
  • Each thread contains
  • a thread id
  • register set
  • separate user and kernel stacks
  • private data storage area

40
Linux Threads
  • Linux refers to them as tasks rather than
    threads.
  • Thread creation is done through clone() system
    call.
  • clone() allows a child task to share the address
    space of the parent task (process)
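The task model can be observed from user space without calling clone() directly. The sketch below assumes Python 3.8 or later, where `threading.get_native_id()` reports the kernel-assigned thread identifier; on Linux, thread libraries are generally built on clone(), but that is background knowledge rather than something this code demonstrates.

```python
import os
import threading

info = {}


def record() -> None:
    info["child_pid"] = os.getpid()                  # process the thread belongs to
    info["child_tid"] = threading.get_native_id()    # kernel's id for this thread


t = threading.Thread(target=record)
t.start()
t.join()

main_pid = os.getpid()
main_tid = threading.get_native_id()
```

Both threads report the same process ID because they share one address space, while the kernel gives each its own identifier as a separately schedulable entity.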

41
4.4 BSD UNIX
  • Initial support for threads implemented but not
    enabled in distribution
  • Proc structure and u-area reorganized
  • All threads have a unique ID
  • How are the proc and u areas reorganized to
    support threads?

42
Microkernel
  • Transition to Microkernel discussion

43
Microkernel
  • Small operating system core
  • Contains only essential operating system
    functions
  • Many services traditionally included in the
    operating system are now external subsystems
  • device drivers
  • file systems
  • virtual memory manager
  • windowing system and security services

44
Microkernel Benefits
  • Portability
  • isolate port specific code to microkernel
  • Reliability
  • modular design, small microkernel, simpler
    validation
  • Uniform interface
  • all services are provided by means of message
    passing
  • Extensibility
  • allows the addition of new services
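The uniform message-passing interface can be sketched with a service that runs outside the "kernel" and is reached only through messages. Everything here is hypothetical: the `file_server` function, the message tuple layout and the use of `queue.Queue` as a port are stand-ins chosen for illustration.

```python
import queue
import threading


def file_server(requests: queue.Queue) -> None:
    """Hypothetical user-level file service: handles one message at a time."""
    files = {"motd": "hello"}
    while True:
        op, name, reply = requests.get()
        if op == "shutdown":
            break
        if op == "read":
            reply.put(files.get(name))     # reply message; None if no such file


def send(port: queue.Queue, op: str, name: str):
    """Client side: send a request message and wait for the reply message."""
    reply = queue.Queue(maxsize=1)
    port.put((op, name, reply))
    return reply.get()


port = queue.Queue()
server = threading.Thread(target=file_server, args=(port,))
server.start()

content = send(port, "read", "motd")
missing = send(port, "read", "nope")

port.put(("shutdown", "", None))
server.join()
```

The client never calls into the service directly. Adding a new service means starting another server with its own port, which is the extensibility argument in miniature.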

45
Microkernel Benefits
  • Flexibility
  • existing features can be subtracted
  • Distributed system support
  • messages are sent without knowing what the target
    machine is or where it is located
  • Object-oriented operating system
  • components are objects with clearly defined
    interfaces that can be interconnected to form
    software

46
Microkernel Design
  • Primitive memory management
  • mapping each virtual page to a physical page
    frame; the operations are grant, map and flush.
  • Inter-process communication
  • I/O and interrupt management
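The three memory operations can be modelled in a few lines. The slide only names them, so the semantics below are an assumption based on the usual L4-style reading: map shares a page with another address space, grant hands it over so that the granter loses access, and flush revokes every mapping derived from the flusher's page. The classes `Mapping` and `AddressSpace` are hypothetical.

```python
class Mapping:
    """One page-table entry: a virtual page backed by a physical frame."""

    def __init__(self, space, vpage, frame):
        self.space, self.vpage, self.frame = space, vpage, frame
        self.children = []          # mappings derived from this one via map()


class AddressSpace:
    def __init__(self, name):
        self.name = name
        self.pages = {}             # virtual page number -> Mapping

    def add_frame(self, vpage, frame):
        self.pages[vpage] = Mapping(self, vpage, frame)

    def map(self, vpage, target, target_vpage):
        """Share the page: both spaces can access the frame afterwards."""
        parent = self.pages[vpage]
        child = Mapping(target, target_vpage, parent.frame)
        parent.children.append(child)
        target.pages[target_vpage] = child

    def grant(self, vpage, target, target_vpage):
        """Give the page away: this space loses access to it."""
        node = self.pages.pop(vpage)
        node.space, node.vpage = target, target_vpage
        target.pages[target_vpage] = node

    def flush(self, vpage):
        """Revoke all derived mappings; the flusher keeps its own page."""
        node = self.pages[vpage]
        for child in node.children:
            child.space.flush(child.vpage)           # revoke recursively
            del child.space.pages[child.vpage]
        node.children = []


pager, a, b = AddressSpace("pager"), AddressSpace("a"), AddressSpace("b")
pager.add_frame(0, frame=7)
pager.map(0, a, 5)                       # pager shares frame 7 with a
a.grant(5, b, 9)                         # a hands its mapping over to b
frame_seen_by_b = b.pages[9].frame
a_lost_access = 5 not in a.pages
pager.flush(0)                           # revokes the mapping now held by b
```

Because the granted mapping stays linked to the mapping it was derived from, the pager's flush still reaches it after it has moved from `a` to `b`.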