Title: Concurrent Processes and Programming
1Concurrent Processes and Programming
2Outline
- Processes and Threads
- Graph Models for Process Representation
- The Client/Server Model
- Time Services
- Language Mechanism for Synchronization
- Object Model Resource Servers
- Concurrent Programming Languages
- Distributed and Network Programming
3Introduction
- Processes and threads: the most essential elements in an OS
- Management functions for processes
- Communication and synchronization (Chapters 3 and 4): strongly related
- Scheduling (Chapter 5)
4Processes and Threads
5Process Characteristics
- Process: a program in execution. A process includes (process image):
- Program code (binary executable), user data
- Current activities: PC, registers, open files
- Process context: stored in the Process Control Block (PCB)
- Stack: temporary data (parameters, local variables)
- A process can be viewed as:
- Unit of resource ownership: the process is allocated
- A virtual address space to hold the process image
- Control of some resources (files, I/O devices...)
- Unit of dispatching: the process is a single execution path determined by the PC
- Dispatch unit of the short-term scheduler
- Execution may be interleaved with other processes
- The process has an execution state and a dispatching priority
6Process Characteristics (Cont.)
- The two characteristics are treated independently by some modern OSes
- The unit of dispatching is usually referred to as a thread or a lightweight process
- The unit of resource ownership is usually referred to as a process or task
7Threads and Processes
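- MS-DOS: one process, one thread
- JVM: one process, multiple threads
- Traditional UNIX: multiple processes, one thread per process
- NT, Solaris, Mach, OS/2: multiple processes, multiple threads per process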
8A Very Very Simple Example
Single thread:
    for (i = 0; i < 9; i++)
        for (j = 0; j < 9; j++)
            number[i][j] = (i + 1) * (j + 1);
The only thread is scheduled and dispatched by the CPU scheduler. If 9 other processes are ready to run, this thread may get only 1/10 of the CPU time.
Multiple threads:
    Thread 1: for (j = 0; j < 9; j++) number[0][j] = 1 * (j + 1);
    Thread 2: for (j = 0; j < 9; j++) number[1][j] = 2 * (j + 1);
    (and so on, one thread per row, through Thread 9)
The 9 threads are scheduled and dispatched by the CPU scheduler and together may get 9/(9+9) of the CPU time.
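The multiple-thread version of the example can be sketched in Python as follows. This is only an illustration: the names `fill_row` and `fill_table_with_threads` are hypothetical, and in CPython the threads are interleaved rather than run truly in parallel for CPU-bound work.

```python
import threading

N = 9
number = [[0] * N for _ in range(N)]

def fill_row(i):
    """Thread body: fill row i of the 9x9 multiplication table."""
    for j in range(N):
        number[i][j] = (i + 1) * (j + 1)

def fill_table_with_threads():
    """Create one thread per row, start them all, and wait for completion."""
    threads = [threading.Thread(target=fill_row, args=(i,)) for i in range(N)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return number

fill_table_with_threads()
```

Each thread writes to a different row, so no locking is needed for this particular example.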
9Threads
Process context switch vs. thread context switch
Process scheduling vs. thread scheduling
- Has an execution state (running, ready, etc.)
- Scheduling and dispatching are done on a thread basis
- Saves thread context when not running
- Independent PC, CPU registers (thread control block)
- Has an execution stack and some per-thread static storage for local variables
- Has shared access to the memory address space and resources of its process
- All threads of a process share this
- When one thread alters a (non-private) memory item, all other threads (of the process) see that
- A file opened by one thread is available to the others
10Single Threaded and Multithreaded Process Models
Thread Control Block contains a register image,
thread priority and thread state information
11Single and Multithreaded Processes
Separate Thread ID, PC, register set and stack
12Two-Level Concurrency of Processes and Threads
Native Computer System
Single Thread Process
Multiple Thread Processes
Thread Run-Time Library Support
Native Operating System
13Thread Applications
[Figure: three thread applications. Terminal server: identical static threads writing to a shared buffer. Client: concurrent and asynchronous requests. File server: a main thread dispatching requests to dynamic threads.]
14Thread Applications (Cont.)
- Terminal servers (TS)
- Without multiple threads, the TS needs to poll terminal inputs
- Each thread is responsible for one particular input
- Threads use common reentrant code and separate local stacks
- Accesses to the common buffer need to be mutually exclusive
- By shared-memory synchronization methods (semaphore or monitor)
- Threads can be created statically and can run indefinitely
- File server
- Performs different operations upon client request
- A main thread acts as a work dispatcher
- Threads are created and destroyed dynamically
- A thread can be created for each operation, and control is returned to the main thread so that it can accept new requests
- A thread is destroyed when its work is completed
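The file-server pattern (a main dispatcher thread plus dynamically created worker threads) might look like the sketch below. The names `handle_request` and `dispatch` are hypothetical, and upper-casing the request is only a stand-in for a real file operation.

```python
import threading

def handle_request(request, results, lock):
    """Worker thread: perform one operation, record the reply, then end."""
    reply = request.upper()          # placeholder for a real file operation
    with lock:                       # the shared result list needs mutual exclusion
        results.append(reply)

def dispatch(requests):
    """Main thread: create one worker per request and move on to the next."""
    results = []
    lock = threading.Lock()
    workers = []
    for request in requests:
        worker = threading.Thread(target=handle_request,
                                  args=(request, results, lock))
        worker.start()               # control returns to the dispatcher at once
        workers.append(worker)
    for worker in workers:
        worker.join()                # each worker is gone once its work is done
    return sorted(results)
```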
15Thread Applications (Cont.)
- Clients
- Can make multiple requests to different servers
- Concurrent update of replicated file copies managed by multiple file servers
- Application scenarios
- Multithreaded Web browser
- Multithreaded Web server
- Multithreaded multiple-window applications
16Benefits of Threads Over Processes
- Less time to create a new thread than a process
- Less time to terminate a thread than a process
- Less time to switch between two threads within the same process
- Since threads within the same process share memory and files, they can communicate with each other without invoking the kernel
If there is an application that should be
implemented as a set of related units of
execution, it is far more efficient to do so as a
collection of threads rather than a collection of
separate processes.
17Benefits of Threads
- Responsiveness
- Resource Sharing
- Economy
- More economical to create and context switch threads than processes
- In Solaris, process vs. thread:
- Creating a process is about 30 times slower than creating a thread
- Context switching between processes is about 5 times slower than between threads
- Utilization of MP Architectures
- Each thread may be running in parallel on a
different processor
18Remote Procedure Call Using Single Thread
19Remote Procedure Call Using Threads
20Thread Synchronization Problem
- 3 variables A, B, C are shared by thread T1 and thread T2
- T1 computes C = A + B
- T2 transfers amount X from A to B
- T2 must do A = A - X and B = B + X (so that A + B is unchanged)
- But if T1 computes A + B after T2 has done A = A - X but before B = B + X
- then T1 will not obtain the correct result for C = A + B
Concurrent access to shared data needs to be
mutually exclusive (by semaphores and monitors)
Similar situation will occur for cooperating
processes
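A minimal sketch of the fix, assuming a single lock that guards A and B (the class `Accounts` and function `run_demo` are hypothetical names): T2's two updates and T1's read of A + B each happen while holding the lock, so T1 can never observe the half-finished transfer.

```python
import threading

class Accounts:
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.lock = threading.Lock()

    def transfer(self, x):
        """T2: A = A - X and B = B + X as one mutually exclusive step."""
        with self.lock:
            self.a -= x
            self.b += x

    def total(self):
        """T1: compute C = A + B while no transfer is in progress."""
        with self.lock:
            return self.a + self.b

def run_demo(rounds=1000):
    acc = Accounts(100, 50)
    observed = []

    def t1():
        for _ in range(rounds):
            observed.append(acc.total())

    def t2():
        for _ in range(rounds):
            acc.transfer(1)

    threads = [threading.Thread(target=t1), threading.Thread(target=t2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed
```

Without the lock in `total` and `transfer`, some observed values of C could differ from 150.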
21User Space Threads
Emulated by the thread library to give the illusion of a separate thread ID, PC, register set, and stack
- Managed by a user-level run-time threads library
- No support from the kernel (the kernel is not aware of the existence of threads)
- Fast to create and manage, flexible scheduling
- A blocking system call blocks all threads if the kernel is single threaded
- Cannot take advantage of multiprocessing
- Scheduling is normally non-preemptive, FCFS + priority
- Examples: POSIX Pthreads, Mach C-threads, Solaris threads
- Thread primitives included in the user-level run-time library
- Thread management for thread creation, suspension, termination
- Assignment of priority and other thread attributes
- Synchronization and communication support such as semaphores, monitors, and message passing
22Kernel Space Threads
- Supported by the Kernel
- Kernel maintains context information for the process and the threads
- Scheduling is done on a thread basis
- Examples
- Windows 95/98/NT
- Solaris
- Digital UNIX
Thread is the basic unit for CPU scheduling
23User Space Thread VS. Kernel Space Thread
- Efficiency of thread creation, destruction, and switching
- Share of CPU time
- Blocking system call
- Scheduling decision and priority
24Multithreading Models
- How to map user threads to kernel threads
- Many-to-One Model
- One-to-One Model
- Many-to-Many Model
25Many-to-One Model
- Many user-level threads mapped to a single kernel thread
- Thread management is done in user space
- Blocking problem
- No support for running in parallel on MP
- Used on systems that do not support kernel
threads.
Thread Library
26One-to-One Model
- Each user-level thread maps to a kernel thread
- Creating a user thread requires creating the corresponding kernel thread
- Overhead
- Restricts the number of threads supported by the OS
- Examples
- Windows 95/98/NT
- OS/2
27Many-to-Many Model
- Multiplexes many user-level threads onto a smaller or equal number of kernel threads
- Many-to-One: no true concurrency
- One-to-One: be careful not to create too many threads within an application
- Examples
- Solaris
- IRIX
- Digital UNIX
28Solaris 2 Threads
Many-to-Many
29Solaris 2 Threads (Cont.)
- Solaris Threads
- Three-level concurrency of a preemptive multithreaded kernel, as mentioned in Figure 3.3
- User-level threads: API library for thread management
- Lightweight process (LWP): intermediate level
- Each process contains at least one LWP
- The thread library multiplexes user threads on the pool of LWPs for the process
- Only user-level threads currently connected to an LWP accomplish work
- The thread library dynamically adjusts the number of LWPs
30Solaris 2 Threads (Cont.)
- Solaris Threads (Cont.)
- Kernel-level threads
- A kernel-level thread for each LWP
- Some kernel threads for kernel tasks (no LWP)
- Kernel threads are the only objects scheduled within the system
- Bound and unbound user-level threads
- Bound: permanently attached to an LWP
- For processes that require quick response time
- Unbound: multiplexed
31Solaris 2 Threads (Cont.)
- User-level thread
- Thread ID, register set, stack, and priority
- User-level data structures
- LWP
- A register set for the user-level thread it is running
- Memory and accounting information
- Kernel data structure
- Kernel thread
- Kernel registers, pointer to the LWP, priority
and scheduling information, and a stack
32Graph Models for Process Representation
- Use graphs to model the synchronization/
communication among processes
33Synchronous Process Graph
- Directed Acyclic Graph (DAG)
- Shows explicit precedence relationships and a partial ordering of a set of processes
- Can be used to analyze the makespan (total completion time) of a set of cooperating processes
- Directed edges: synchronous communication of sent or received messages
- Results produced by a process are passed to a successor process as input
- A communication transaction happens and is synchronized only at the completion of a process and the beginning of its successor processes
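As an illustration of makespan analysis on a synchronous process graph, the sketch below computes the longest path through a DAG. It assumes each process has a known duration, communication is free, and enough processors are available; the function name `makespan` and the sample graph are hypothetical.

```python
def makespan(durations, edges):
    """Return the total completion time of a process DAG.

    durations: dict mapping process name -> execution time
    edges: list of (predecessor, successor) precedence pairs
    """
    preds = {p: [] for p in durations}
    for src, dst in edges:
        preds[dst].append(src)

    finish = {}

    def finish_time(p):
        # A process starts only after all of its predecessors have completed.
        if p not in finish:
            start = max((finish_time(q) for q in preds[p]), default=0)
            finish[p] = start + durations[p]
        return finish[p]

    return max(finish_time(p) for p in durations)

durations = {"P1": 2, "P2": 3, "P3": 1, "P4": 4}
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")]
result = makespan(durations, edges)
```

Here the critical path is P1, P2, P4, so `result` is 2 + 3 + 4 = 9.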
34Asynchronous Process Graph
- Indicates the existence of communication paths
- Can be used to study processor allocation for optimizing interprocessor communication overhead
- Nothing is said about how and when communication occurs
- Communication scenarios
- One-way: send messages and expect no reply (e.g., broadcast)
- Two-way: make a request and receive a reply
- Client/server, master/slave
- Peer: symmetrical two-way exchange of messages
35Graph Models for Process Interaction
36Time-Space Model for Interacting Processes
- Communication paths and the precedence relations between the events and actual communication are explicit
- Can derive the graph models for process interaction
37Client/Server Model
- A programming paradigm that represents the interactions between processes and the structure of the system
- Server: processes that provide services
- Client: processes that request services
- A client and a server interact through a sequence of requests and responses
- A client requests a service from the server and blocks itself
- The server receives the request, performs the operation, and returns a reply message to the client
- The client then resumes its execution
- Only underlying assumption: synchronous request/reply exchange of information
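The synchronous request/reply exchange can be imitated inside one program with threads and queues. This is a toy sketch, not a networked implementation: `server`, `call`, and `run_demo` are hypothetical names, and doubling the payload stands in for the requested service.

```python
import queue
import threading

def server(requests):
    """Serve requests until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        payload, reply_box = item
        reply_box.put(payload * 2)   # perform the operation, return the reply

def call(requests, payload):
    """Client side: send a request, then block until the reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    requests.put((payload, reply_box))
    return reply_box.get()           # the client resumes only after the reply

def run_demo():
    requests = queue.Queue()
    t = threading.Thread(target=server, args=(requests,))
    t.start()
    replies = [call(requests, n) for n in (1, 2, 3)]
    requests.put(None)               # tell the server to stop
    t.join()
    return replies
```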
38Client/Server Model (Cont.)
- Messages are not interpreted by the system
- High-level communication protocols between client and server can be built on top of the request and reply messages
- Service-oriented communication model
39Client/Server Model (Cont.)
Client and Server Communication Model
RPC Communication
Message Passing Communication
Connection-oriented or connectionless Transport
service
40Client/Server Model (Cont.)
- System service categories: primitive, system, value-added
- Implement system services as service processes and move them out of the kernel whenever possible
- Kernel size can be reduced, which makes the kernel easier to port
- Processes need only a single type of system call to the kernel (send/receive)
- Simple and uniform
- How to locate servers: binder or agent servers
- Identified by name or service function
- Horizontal or vertical partition of servers
- Horizontal: group disjoint servers
- Vertical: allow servers to call on other servers that are sitting immediately below them
41Time Services
42Overview
- Time: a relative measure of a point in time
- A clock is used to represent time
- Timer: an absolute measure of a time interval
- Used to describe occurrences of events in three ways
- When an event occurs
- How long it takes
- Which event occurs first
- Physical clock and logical clock
- Physical clock: an approximation of real time that measures both points and intervals of time
- Logical clock: preserves the ordering of events
43Overview (Cont.)
- In a distributed system, events are recorded w.r.t. each process's own clock time
- The clocks of separate machines run at their own paces
- Given two events, which one happens earlier than the other?
A simple example of why global time consensus is necessary: make
44Physical Clocks
- Distributed time service architecture
- Time clerk (TC)
- Time server (TS)
- provide time service
- Maintain up-to-date clock information
- Coordinated Universal Time (UTC)
- NIST (http://www.boulder.nist.gov/timefreq/service/time-computer.html)
- ACTS modem service
- WWV short wave radio
- GPS
- Issues
- Compensating Delay
- Calibrating Discrepancy
45Physical Clocks (Cont.)
46Physical Clocks (Cont.) -- Compensating Delay
- TS and UTC
- The delay can be calculated accurately if the distance to the UTC source and the signal propagation speed are known
- TS and TS, TC and TS
- Time servers may exchange time information, so they are more consistent than clients
- Network communication delay (larger than signal propagation delay)
- Assume symmetric network traffic
- Use T(receive) - T(sent) - T(process) = <round-trip delay> to adjust the time
- Adjusted time = UTC_received + 0.5 * <round-trip delay>
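The delay compensation above amounts to two lines of arithmetic. In this sketch the function name `adjusted_time` and the use of integer milliseconds are assumptions for illustration; symmetric network delay is assumed, as on the slide.

```python
def adjusted_time(utc_received, t_sent, t_receive, t_process):
    """Compensate a received UTC value for network delay.

    t_sent, t_receive: local clock readings when the request was sent
                       and when the reply arrived
    t_process:         time the server spent handling the request
    """
    round_trip = t_receive - t_sent - t_process
    # With symmetric traffic, the reply spent half the round trip in transit.
    return utc_received + 0.5 * round_trip

# Request sent at 1000 ms, reply received at 1030 ms, 10 ms of server processing.
new_time = adjusted_time(utc_received=500000, t_sent=1000,
                         t_receive=1030, t_process=10)
```

The round-trip delay is 20 ms, so the received UTC value is advanced by 10 ms.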
47Physical Clocks (Cont.) -- Calibrating Discrepancy
- Discrepancy may occur
- A time server may exchange time information with other time servers
- A time clerk may receive UTC from many time servers
- Adjust UTC using previously agreed decision criteria
- Based on the maximum, minimum, median, or average of UTCs
- If the average is used, outliers are ignored
- If a time server reports an interval of time, the size of the interval is the statistical indicator of inaccuracy
- See Figure 3.9
- UTC intervals that don't overlap with others are thrown away
- The intersection that includes the most UTC sources is computed
- The new UTC is set to the midpoint of the intersection
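One way to sketch the interval rule is a sweep over interval endpoints that finds the region covered by the most sources and returns its midpoint. The function name `combine_intervals` and the sample intervals are hypothetical; ties between equally well-covered regions are resolved in favor of the earliest one, which is an assumption of this sketch.

```python
def combine_intervals(intervals):
    """Return the midpoint of the region covered by the most intervals.

    intervals: list of (low, high) pairs, one per UTC source.
    """
    events = []
    for low, high in intervals:
        events.append((low, 0))    # 0 marks an interval opening
        events.append((high, 1))   # 1 marks an interval closing
    events.sort()                  # at equal points, openings sort first

    best = count = 0
    best_low = best_high = None
    for idx, (point, kind) in enumerate(events):
        if kind == 0:
            count += 1
            if count > best:
                best = count
                best_low = point
                best_high = events[idx + 1][0]   # region lasts until the next endpoint
        else:
            count -= 1
    return (best_low + best_high) / 2

new_utc = combine_intervals([(10, 14), (12, 16), (13, 18), (30, 32)])
```

In this example the interval (30, 32) overlaps nothing and has no influence; the other three intersect in [13, 14], so `new_utc` is 13.5.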
48Figure 3.9 Averaging UTC Intervals
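[Figure: intervals UTC1 through UTC5 on a time line; the interval that overlaps no other is discarded, and the new UTC is the midpoint of the intersection that includes the most sources]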
49Physical Clocks (Cont.) Other Issue
- Pull vs. push service model
- Pull (passive TS): clients access UTC from a time server
- Push (active TS): the TS broadcasts to clients proactively
- No easy way to determine the network delay
- Suitable for systems with multicast hardware, where message delay time is shorter and more predictable
- Suitable for communication among time servers
- Clock time cannot be set back (or forward) abruptly, since it may contradict the time of some earlier events
50Logical Clock
- In some applications, only the ordering of event execution is of concern
- Physical clocks can be used to tell whether an event happens before another, unless they occur very close together
- Use logical clocks to indicate the ordering information for events
- Lamport's logical clock
- Happens-before relation to synchronize the logical clocks between two events: a → b means a happens before b
- If a and b are events in the same process, and a occurs before b, then a → b is true
- If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a → b is also true
51Logical Clock (Cont.)
- Happens-before (Cont.)
- For every event a, we can assign a time value (logical clock) C(a) on which all processes agree
- If a → b, then C(a) < C(b)
- The logical clock C must always go forward (increasing)
- If a → b within the same process, then C(a) < C(b)
- If a is the sending event of Pi and b is the corresponding receiving event of Pj, then Ci(a) < Cj(b) (sending and receiving messages are considered events)
- If a → b and b → c, then a → c (transitive)
- If two events, x and y, happen in different processes that do not exchange messages (directly or indirectly), then x → y is not true, but neither is y → x
- Concurrent events or disjoint events
- If a → b, events a and b are called causally related
- Lamport's algorithm for assigning times to events (next slide)
52Lamport Timestamps
//my_TS local time stamp of a processor
53Lamport Timestamps (Cont.)
Each message carries the sending time, according to the sender's clock. When a message arrives and the receiver's clock shows a value prior to the time the message was sent, the receiver fast-forwards its clock to be one more than the sending time.
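A minimal sketch of the timestamp rule follows. The class name `LamportClock` is hypothetical, and one detail is an assumption: this variant also advances the receiver's clock by one when it is already ahead of the message, so that the receive event gets its own timestamp.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: the clock must always move forward."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, msg_time):
        """If the local clock is behind the sender's, jump to one past it."""
        self.time = max(self.time, msg_time) + 1
        return self.time

p = LamportClock()
q = LamportClock()
p.tick()                 # local event at p, clock 1
stamp = p.send()         # send event at p, clock 2
q.receive(stamp)         # q was at 0, so it fast-forwards to 3
```

After the exchange, C(send) = 2 < C(receive) = 3, as the happens-before rule requires.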
54Lamport Timestamps (Cont.)
[Figure: time-space diagram with events and their clock values: (a, 40), (b, 45), (c, 60), (d, 20), (e, 50), (f, 60), (g, 50), (h, 75)]
- a → e, e → c, e → h
- Concurrent events: (b, e), (f, h), (f, c)
55Lamport Timestamps (Cont.)
Logical Ordering of Events Using Counters
56Lamport Timestamps (Cont.)
Logical Ordering of Events Using Physical Clocks
Process 1
Process 2
Process 3
Physical EventClock
Physical EventClock
Physical EventClock
10 20 c 40 d
10 e 20 f
10 a 20 b
57Logical Clock (Cont.)
- Happens-before imposes only a partial ordering on the event graph
- For two concurrent events a and b,
- C(a) < C(b) does not imply a → b
- It is possible that C(a) = C(b) (for events in different processes)
- For total ordering of events
- For all events a and b, C(a) ≠ C(b)
- Concatenate the logical clock with a distinct process ID number
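Concatenating the logical clock with the process ID can be pictured as sorting on the pair (clock, process ID). In this sketch the function name `total_order` and the sample events are hypothetical.

```python
def total_order(events):
    """Sort events by Lamport time, breaking ties by process ID.

    events: list of (lamport_time, process_id, label) tuples.
    """
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    return [label for _, _, label in ordered]

# x and y carry the same clock value; the process ID decides their order.
order = total_order([(5, 2, "x"), (5, 1, "y"), (3, 3, "z")])
```

Because process IDs are distinct, no two events ever compare as equal, which gives a total order consistent with happens-before.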
58Lamport TimeStamps with Total Ordering
59Vector Logical Clock
- The preceding logical clock scheme cannot tell whether event a actually happened before event b if C(a) < C(b)
- If a → b then C(a) < C(b): true
- If C(a) < C(b) then a → b: false (they may be concurrent)
- Vector logical clocks differentiate causally related and disjoint events. The vector logical clock for event a at process i is VCi(a) = [TS1, TS2, ..., Ci(a), ..., TSn]
- TSk is the best estimate of the logical clock time for process Pk
- Ci(a) is the logical clock time of event a in Pi
- When sending a message m from Pi (event a) to Pj, the logical timestamp of m, VCi(a), is sent along with m to Pj (event b)
- VCj(b) is updated such that TSk(b) = max(TSk(a), TSk(b))
- The logical clock of Pj is also incremented
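The update rule can be sketched as a small class. `VectorClock` and its method names are hypothetical, and treating a send as a local event that increments the sender's own entry is an assumption of this sketch.

```python
class VectorClock:
    def __init__(self, n, i):
        self.i = i              # index of the owning process Pi
        self.ts = [0] * n       # one entry per process

    def local_event(self):
        """Advance this process's own logical clock entry."""
        self.ts[self.i] += 1
        return list(self.ts)

    def send(self):
        """The returned copy is the timestamp carried by the message."""
        return self.local_event()

    def receive(self, msg_ts):
        """Take the entry-wise maximum, then increment the own entry."""
        self.ts = [max(a, b) for a, b in zip(self.ts, msg_ts)]
        self.ts[self.i] += 1
        return list(self.ts)

p0 = VectorClock(3, 0)
p1 = VectorClock(3, 1)
p1.local_event()              # p1 is now [0, 1, 0]
m = p0.send()                 # message timestamp [1, 0, 0]
after = p1.receive(m)         # merge to [1, 1, 0], then increment own entry
```

The receiver ends at [1, 2, 0]: it has learned about P0's first event and counted its own second event.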
60An Example of a Vector TimeStamp
[Figure: time-space diagram of four processes whose events are labeled with vector timestamps such as (1,0,0,0), (1,2,0,0), and (5,3,1,3)]
61Vector TimeStamp Algorithm
62Comparison of Vector TimeStamps
63Vector Logical Clock (Cont.)
[Figure: time-space diagram of processes P1, P2, and P3 with vector timestamps]
- If a → b, then VCi(a) < VCj(b): TSk(a) <= TSk(b) for every k, and TSj(a) < TSj(b)
- If VCi(a) < VCj(b), then a → b; if neither VCi(a) < VCj(b) nor VCj(b) < VCi(a), then a and b are concurrent
- (a, e, h) are causally related; (b, f) are disjoint
- Extend the vector logical clock to a matrix logical clock (3.4.4)
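Comparing two vector timestamps can be sketched as below. The function names are hypothetical, and the strictness test uses the common formulation "at least one entry strictly smaller" rather than singling out entry j.

```python
def vc_less(u, v):
    """True if u < v: every entry <= and at least one entry strictly <."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))

def relation(u, v):
    """Classify the events that carry vector timestamps u and v."""
    if vc_less(u, v):
        return "before"        # the event with u happened before the one with v
    if vc_less(v, u):
        return "after"
    if u == v:
        return "equal"
    return "concurrent"        # neither vector dominates: disjoint events
```

For example, `relation([1, 0, 0, 0], [1, 2, 0, 0])` is "before", while `relation([0, 0, 1, 1], [1, 2, 0, 0])` is "concurrent".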
64Matrix Logical Clocks
- A matrix clock MCi[k, l] at Pi is an n×n matrix that represents logical time by a vector of vector logical clocks
- MCi[i, 1..n] is the vector logical clock of Pi
- MCi[j, 1..n] is the knowledge Pi has about the vector logical clock of Pj
- Updating rule
- Local event: MCi[i, i] = MCi[i, i] + d
- Sending a message from Pi to Pj: TSi = MCi
- MCj[j, l] = max(MCj[j, l], TSi[i, l]), l = 1..n
- Maintains the causal order of the events
- MCj[k, l] = max(MCj[k, l], TSi[k, l]), k = 1..n, l = 1..n
- Propagates the knowledge about other processes
- Application: garbage collection in replica management
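The updating rule can be sketched as follows. `MatrixClock` and its method names are hypothetical, and counting both send and receive as local events (each adding d to the own diagonal entry) is an assumption of this sketch rather than something the slide states.

```python
class MatrixClock:
    def __init__(self, n, i, d=1):
        self.n, self.i, self.d = n, i, d
        self.mc = [[0] * n for _ in range(n)]

    def local_event(self):
        """MCi[i, i] = MCi[i, i] + d."""
        self.mc[self.i][self.i] += self.d

    def send(self):
        """The message carries a copy of the whole matrix (TSi = MCi)."""
        self.local_event()            # assumption: sending is a local event
        return [row[:] for row in self.mc]

    def receive(self, ts, sender):
        self.local_event()            # assumption: receiving is a local event
        j = self.i
        # Own row: merge with the sender's vector clock (row `sender` of ts).
        for l in range(self.n):
            self.mc[j][l] = max(self.mc[j][l], ts[sender][l])
        # Every row: merge what the sender knows about all processes.
        for k in range(self.n):
            for l in range(self.n):
                self.mc[k][l] = max(self.mc[k][l], ts[k][l])

p0 = MatrixClock(2, 0)
p1 = MatrixClock(2, 1)
stamp = p0.send()                     # [[1, 0], [0, 0]]
p1.receive(stamp, sender=0)
```

After the receive, P1's own row is [1, 1] and its row for P0 is [1, 0], so P1 knows that P0 has seen one event of its own.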
65Matrix Logical Clocks Example (Figure 12.19)
66Clock Application At-Most-Once Message Delivery
- Traditional approach: each message bears a unique ID, and the server stores all the message IDs in a table
- The table is lost if the server crashes. How long should the IDs be kept?
- Using time: messages carry a connection ID and a timestamp. The server records the most recent timestamp it has seen
- If the timestamp of an incoming message for a connection is lower than the timestamp stored for that connection, the message is rejected as a duplicate
- Removing old timestamps: timestamps older than G are removed
- G = CurrentTime - MaxLifeTime - MaxClockSkew
- Once every update period, the current time is written to disk
- When the server crashes, it reloads G from the time stored on disk and increments it by the update period
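The timestamp-based scheme might be sketched as below. The class `AtMostOnceServer` is hypothetical, and two details are assumptions: a message whose timestamp equals the stored one is treated as a duplicate, and a message older than G is rejected because its table entry may already have been removed. Crash recovery from disk is not modeled.

```python
class AtMostOnceServer:
    def __init__(self, max_lifetime, max_clock_skew):
        self.max_lifetime = max_lifetime
        self.max_clock_skew = max_clock_skew
        self.latest = {}     # connection ID -> most recent timestamp seen

    def cutoff(self, now):
        """G = CurrentTime - MaxLifeTime - MaxClockSkew."""
        return now - self.max_lifetime - self.max_clock_skew

    def accept(self, conn_id, timestamp, now):
        g = self.cutoff(now)
        # Timestamps older than G are removed from the table.
        self.latest = {c: t for c, t in self.latest.items() if t >= g}
        if timestamp < g:
            return False     # assumption: too old to be checked safely
        if conn_id in self.latest and timestamp <= self.latest[conn_id]:
            return False     # duplicate (or older) message on this connection
        self.latest[conn_id] = timestamp
        return True

server = AtMostOnceServer(max_lifetime=10, max_clock_skew=2)
first = server.accept("c1", 100, now=101)      # new message
repeat = server.accept("c1", 100, now=102)     # same timestamp again
newer = server.accept("c1", 101, now=102)      # later message, same connection
stale = server.accept("c2", 50, now=102)       # older than G = 90
```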
67Clock Application (Cont.) Clock-Based Cache
Consistency
- Cache consistency in a distributed file system
- Usual solution: distinguish caching for reading from caching for writing
- The server has to ask reading clients to invalidate their copies, even if a copy was made hours ago (extra overhead, which can be reduced by clocks)
- Basic idea: the server gives the client a lease stating how long the copy is valid
- If a lease expires
- Confirm the copy is current
- Ask for a renewal if needed (the copy cannot be used)
- If a client wants to write to the file
- Ask the readers to prematurely terminate their leases
- If a client crashes, the server waits until the lease times out
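On the client side, a lease can be pictured as an expiry time attached to the cached copy. This is a simplified sketch with hypothetical names (`CachedFile`, `renew`); it assumes client and server clocks are close enough to agree on expiry, and it leaves out the server's bookkeeping and the early termination of leases for writers.

```python
class CachedFile:
    def __init__(self, data, granted_at, lease_duration):
        self.data = data
        self.expires_at = granted_at + lease_duration

    def lease_valid(self, now):
        return now < self.expires_at

    def read(self, now):
        """Return the cached data, or None if the lease must be renewed first."""
        return self.data if self.lease_valid(now) else None

    def renew(self, fresh_data, now, lease_duration):
        """Called after the server confirms or refreshes the copy."""
        self.data = fresh_data
        self.expires_at = now + lease_duration

copy = CachedFile("v1", granted_at=0, lease_duration=30)
early = copy.read(now=10)       # lease still valid
expired = copy.read(now=30)     # lease has run out; copy cannot be used
copy.renew("v2", now=30, lease_duration=30)
renewed = copy.read(now=45)
```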
68Clock Application (Cont.) Totally-Ordered
Multicasting
Update 1
Update 2
- The two update operations should have been performed in the same order at each copy
- Requires totally-ordered multicast
- A multicast operation by which all messages are delivered/processed in the same order at each receiver
- Lamport timestamps can be used to implement totally-ordered multicast in a completely distributed fashion (Chap. 4)
Update 1 is performed before Update 2
Update 2 is performed before Update 1
Replicated Database
69Clock Application (Cont.)
- Time-out tickets (tokens) used in distributed system authentication
- Handling commitment in atomic transactions