Title: Presented by Kunmun Garabadu
1Presented by Kunmun Garabadu, Roney Philip
- Real-Time Communication
- Paulo Verissimo
2Real time communication
- To achieve real-time communication
- Real time protocols
- Real time networks - timely and reliable
- Characteristics of real time communication
- Known and bounded msg delivery
- Deterministic behavior in the presence of disturbing factors
- Recognition of latency classes
- Connectivity
3Real time networks
- LAN or MAN
- LAN
- Small scale
- Reliable to very reliable
- Span a few thousand m
- Round trip times 10^-5 to 10^-1 secs
4Reliability Strategies
- Faults lead to
- Lost messages
- Delays
- Corrupted contents
- Solution
- Space redundancy - replicated hardware
- Mandatory for critical systems like flight control
- Time redundancy - message repetition
5Reliability Strategies
- Space redundancy Cons
- High cost of hardware
- Complex
- Time redundancy Cons
- Communication reliability low for real-time applications
- Which methods and techniques to use?
- Ask 2 questions
- 1. Can we reliably obtain real-time behavior out of simplex (non-replicated) networks?
- 2. Which protocols and QoS to use?
6Reliability Strategies
- Solution to 1
- Combination of simplex standard LANs
- Space redundancy in physical layer
- To maintain connectivity
- Protocol time redundancy
- Protocols see only one LAN controller
- Solution to 2
- For reliability of communication
- Error masking
- Error detection and forward recovery
- Error detection and backward recovery
7Error masking
a) space redundancy
b) time redundancy
- Assume bounded number of failures, say k, from a particular component
- Have more than k channels
- Have more than k transmissions
- Mask k failures
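The time-redundancy case above can be illustrated with a minimal sketch: if at most k transmissions can be lost, sending k + 1 copies guarantees that at least one gets through. The helper names `send_with_masking` and `worst_case_channel` are hypothetical and the channel is a toy model, not a real LAN interface.

```python
def send_with_masking(payload, channel, k):
    """Transmit payload k + 1 times; return the copies that got through."""
    delivered = []
    for attempt in range(k + 1):
        if channel(attempt):  # True means this copy was not lost
            delivered.append(payload)
    return delivered


def worst_case_channel(k):
    """Hypothetical channel that drops exactly the first k transmissions."""
    return lambda attempt: attempt >= k


# With k = 2 omissions and k + 1 = 3 transmissions, one copy survives.
copies = send_with_masking("setpoint=42", worst_case_channel(k=2), k=2)
print(len(copies))  # 1
```

If the channel loses more copies than the assumed bound k, nothing is delivered, which is why the failure assumption has to hold for masking to work.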
8Error detection Forward recovery
- For periodic real time communication
- Relationship between consecutive measurements
- Possible to skip a lost msg
- Wait for the next msg
[Figure a) Forward recovery: after a lost message the previous value V(t1) is used until it is refreshed by V(t3); periods 1, 2, 3, ..., k+1 mark the maximum period without refreshing]
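A minimal sketch of forward recovery for periodic traffic: the receiver keeps using the last value when a message is lost, but refuses to use it once it is older than the maximum period without refreshing. The class name `HeldValue` and the choice of (k + 1) periods as that maximum are assumptions made for illustration.

```python
class HeldValue:
    """Holds the last received measurement and bounds its staleness."""

    def __init__(self, period, k):
        # Assumed bound: the value may go unrefreshed for k + 1 periods.
        self.max_age = (k + 1) * period
        self.value = None
        self.stamp = None

    def refresh(self, value, now):
        """Called when a new periodic message arrives."""
        self.value, self.stamp = value, now

    def read(self, now):
        """Return the held value, or fail if it was not refreshed in time."""
        if self.stamp is None or now - self.stamp > self.max_age:
            raise TimeoutError("value not refreshed within the allowed period")
        return self.value


sensor = HeldValue(period=10, k=2)
sensor.refresh(20.5, now=0)
print(sensor.read(now=10))  # message at t=10 was lost, previous value is used
sensor.refresh(21.0, now=20)
print(sensor.read(now=20))
```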
9Error detection Backward recovery
- Ack based protocol
- Restarts when a msg is lost
- Appropriate when msgs cannot be lost
[Figure b) Backward recovery: labels Timeout, k+1]
10Making real-time LANs reliable
- LANs have to display real-time behavior
- Obtained by
- Establishing a model
- Traffic patterns
- Reliability and timeliness requirements
- Failure assumptions
- Service and interface definition
- Dressing the elementary LAN with hardware and
software to comply with requirements
11Abstract LAN Model
- We need LAN interfacing to be LAN independent
- Standardisation bodies achieved this through LLC
- But no service in LLC aims at real-time behavior, reliability, etc.
- So we devise a complete model overcoming these problems
- Using some of the properties of LANs to implement protocols
12Abstract LAN Properties
- An1 Broadcast
- An2 Error Detection
- An3 Network Order
- An4 Full Duplex
- An5 Tightness
- An6 Bounded Transmission Delay
- An7 Bounded Omission Degree
- An8 Bounded Inaccessibility
13Real time communication requirements
- LAN components display following failures
- Timing failures
- Omission failures
- Network partitions
- Definition of reliable real time network
- RT- A reliable real-time network displays
bounded and known message delivery delay, in the
presence of disturbing factors such as overload
or faults
14Real time communication requirements
- Some networks recognize urgency
- Urgency classes
- Critical or hard real-time
- Best-effort or soft real-time
- Background or non real-time
15Solution to real-time communication requirements
- Enforce bounded delay from request to transmission of a frame given the worst case conditions assumed (avoid timing failures)
- Ensure that a message is delivered despite the occurrence of omissions (tolerate omission failures)
- Maintain connectivity (control partitions)
16Enforcing Bounded Transmission Delay
- An6 not guaranteed
- Factors to take into account
- Traffic patterns
- Latency classes
- LAN sizing and parametrising
- User-level load/flow control
17Traffic patterns
- Designer must model the traffic offered to the network
- Aperiodic traffic
- No guarantees about transmission delays
- Cyclic traffic: defined by period
- Sporadic traffic: bursty
18Latency classes
- Traffic separation in latency classes
- Highest criticality traffic should be given lowest latency class
- Should be given certain amount of channel bandwidth to fulfill latency requirements
- Enforce a given transmission time bound for every sender
19LAN sizing and parametrising
- LAN sized and parametrised to comply with aimed bound or vice-versa
- Aimed latency not achievable with offered load
- Consequences
- Latency goes up
- Number of nodes and/or their offered load go down
- Sending node reduces its traffic demands
- Iterative procedure
20User level load/flow control
- Flow based load control delays transmissions
- Role of real-time load control
- Regulate global offered load
- Throttle individual traffic
- Sporadic event class has bound for
- Interarrival rate
- Burst length
- Burst rate
21[Figure: Timing pattern of sporadic events. Labels: burst period, burst length, minimum interarrival time, average interarrival time]
22User level load/flow control
- Rate based flow control
- Calculate average interarrival rate
- Manipulate the rate at which data is sent
- Smooths the bursty nature of the traffic
- Rate should not fall below the average interarrival rate
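Rate-based flow control as outlined above can be sketched as a pacer that spaces out departures. The function `pace` and its parameter `send_interval` are hypothetical names; the sketch only computes departure times and does not touch a real network.

```python
def pace(arrivals, send_interval):
    """Return departure times for messages arriving at the given times.

    Each message leaves no earlier than its arrival and no sooner than
    send_interval after the previous departure, which smooths bursts.
    """
    departures = []
    next_free = float("-inf")
    for t in arrivals:
        depart = max(t, next_free)
        departures.append(depart)
        next_free = depart + send_interval
    return departures


# Two bursts of three events each; average interarrival time is about 10.
burst = [0, 1, 2, 50, 51, 52]
print(pace(burst, send_interval=10))  # [0, 10, 20, 50, 60, 70]
```

The sending interval here is just under the average interarrival time of the input, so the backlog created by each burst drains before the next one. A longer interval would send more slowly than traffic arrives on average and the delay would grow without bound.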
23User level load/flow control
- Load control mechanisms
- Rate control
- Suited for periodic and sporadic traffic
- Matches senders and recipients capabilities
- No discontinuities in traffic flow
- Credit control
- Allocates recipients some credits
- When credit is over, recipient refuses to accept more information
- Improved scheme: look-ahead credit request or supply
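A minimal sketch of credit control, assuming one credit per message. The class `CreditedReceiver` and its methods `offer` and `grant` are hypothetical names used only for illustration.

```python
class CreditedReceiver:
    """Recipient that accepts messages only while it has credits left."""

    def __init__(self, credits):
        self.credits = credits
        self.accepted = []

    def offer(self, msg):
        """Accept msg if a credit is available; otherwise refuse it."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.accepted.append(msg)
        return True

    def grant(self, n):
        """Supply n further credits, e.g. after buffers were freed."""
        self.credits += n


rx = CreditedReceiver(credits=2)
print([rx.offer(m) for m in ("a", "b", "c")])  # [True, True, False]
rx.grant(1)
print(rx.offer("c"))  # True
```

The refusal of "c" is the discontinuity in traffic flow that rate control avoids; supplying credits ahead of time, as in the look-ahead scheme, would hide it.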
24Handling Omission Failures
- Characteristics of omissions in a LAN
- Omissions are rare.
- They can occur in bursts.
- Are usually the result of failure of a single component.
- Omission Degree: the number of consecutive omissions produced by a component.
- An7 Bounded Omission Degree: in a known interval Trd, omission errors may affect at most k transmissions. This feature serves as the foundation of basic error processing protocols with deterministic termination, which is important for real-time operation.
25Transmission-With-Reply
- tries := 0; resp := empty
- do tries < nrTries and resp != full ->
-   resp := empty
-   Tx(data, id)
-   waitRepliesPutInBag(TwaitReply, resp)
-   tries := tries + 1
- od
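The loop above can be sketched in Python as follows. The functions passed in as `tx` and `wait_replies`, the `expected` set of repliers, and the `FakeLan` test double are all assumptions for illustration; the waiting time TwaitReply is not modelled.

```python
def tx_with_reply(data, msg_id, tx, wait_replies, expected, nr_tries):
    """Retransmit until every expected node replied or tries run out."""
    tries = 0
    resp = set()
    while tries < nr_tries and resp != expected:
        tx(data, msg_id)
        resp = wait_replies(msg_id)  # fresh bag of replies for this round
        tries += 1
    return resp == expected, tries


class FakeLan:
    """Toy LAN: drops[n] is how many initial transmissions node n misses."""

    def __init__(self, nodes, drops):
        self.nodes = set(nodes)
        self.drops = drops
        self.sent = 0
        self.heard = set()

    def tx(self, data, msg_id):
        self.sent += 1
        self.heard = {n for n in self.nodes if self.sent > self.drops.get(n, 0)}

    def wait_replies(self, msg_id):
        return set(self.heard)


lan = FakeLan(nodes={"n1", "n2", "n3"}, drops={"n3": 1})
ok, tries = tx_with_reply("cmd", 1, lan.tx, lan.wait_replies, lan.nodes, nr_tries=2)
print(ok, tries)  # True 2
```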
26Diffusion
- tries := 0
- do tries < nrTries ->
-   Tx(data, id)
-   tries := tries + 1
- od
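A minimal Python sketch of the diffusion loop, paired with a receiver that uses the identifier to discard repeated copies. The names `diffuse` and `Receiver` are hypothetical, and the transmit function is simply the receiver's handler rather than a real LAN driver.

```python
def diffuse(data, msg_id, tx, nr_tries):
    """Send the same frame nr_tries times without waiting for replies."""
    for _ in range(nr_tries):
        tx(data, msg_id)


class Receiver:
    """Delivers each message once, using its id to drop duplicates."""

    def __init__(self):
        self.seen = set()
        self.delivered = []

    def on_frame(self, data, msg_id):
        if msg_id in self.seen:
            return  # duplicate copy of an already delivered message
        self.seen.add(msg_id)
        self.delivered.append(data)


rx = Receiver()
diffuse("valve open", 7, rx.on_frame, nr_tries=3)
print(rx.delivered)  # ['valve open']
```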
27Tx-with reply
- Optimal for average case where error rate is expected to be low
- Only one try in absence of errors
- Identifier id allows duplicate messages to be distinguished
- It aims for a completely correct series
- It allows for complete order among competing LAN transmissions.
28Diffusion
- At least one instance of the message reaches every node
- It repeats transmission k+1 times.
- Both algorithms execute within a bounded time in absence of partitions
29Comparison of Algorithms
Features Tx-with Reply Diffusion
Worst-case delivery delay k.TwaitReply + Ttd (k+1).Ttd
No fault delivery delay equal equal
Processing overhead highest
Scalability equal equal
Network load highest
30Comparison of Algorithms
Features Tx-with Reply Diffusion
Total order possible not possible
Failure Detection yes no
Upper layer inform in reply frame possible not applicable
Resilience to lack of coverage high none
Processing overhead highest
31Inaccessibility
- RT Maintain connectivity
- An8 Bounded Inaccessibility: in a known interval Trd, the network may be inaccessible at most i times with a total duration of at most Tina.
- Network is partitioned into subsets of nodes that cannot communicate.
- Causes of partition: bus medium failure, ring disruption, transmitter or receiver defects, token loss, etc.
- Controlling partition: the solution is in knowing how long a partition lasts. This should be sufficiently small so that the service can be carried on effectively.
- Inaccessibility: period of time for which the partition lasts.
32Inaccessibility Control
- How to implement inaccessibility control?
- Instrument the LAN to recover from all conditions leading to partition
- Have a bound for number and duration of inaccessibility periods
- Accommodate inaccessibility in the protocols and timeliness calculations.
- Determine the upper bound for recovery from partitioning
- The upper bound may be dependent on operating situation specific to each LAN.
- If network is properly managed and parameterised, inaccessibility figures can be drastically reduced.
33Inaccessibility in Timeliness Model
- Inaccessibility must be accounted for in the following
- Calculations of real worst-case execution times
- Dimensioning of timeouts
- Synchronous real-time operation of LAN
- Tina has to be added to the real worst-case execution time of protocols
- The protocol may fail if it times out too early while inaccessibility occurs.
- Including Tina in time-outs is a sufficient condition for running synchronous operation
- Tina may be much greater than Ttd, causing timeouts to be undesirably long.
34- Better to take inaccessibility out of the time-outs
- Methods to remove inaccessibility
- Timer Freezing
- Inaccessibility is detected
- All timers used in time-outs are suspended
- Timers are restarted when the network becomes accessible
- Inaccessibility Trapping
- Each inaccessibility period is trapped between two consecutive transmission signals from the LAN. This avoids more than one timeout per inaccessibility period.
- Each inaccessibility occurrence counts as one omission.
- Extra omissions have to be added in the retry count of the low level protocols.
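Timer freezing, the first method listed above, can be sketched with a countdown timer that stops running while the network is inaccessible. The class `FreezableTimer` is a hypothetical illustration; it reads "restarted" as resuming the countdown where it stopped, which is an assumption, and time is passed in explicitly instead of being read from a clock.

```python
class FreezableTimer:
    """Countdown timer that does not advance while frozen."""

    def __init__(self, timeout):
        self.remaining = timeout
        self.frozen = False

    def freeze(self):
        """Called when inaccessibility is detected."""
        self.frozen = True

    def resume(self):
        """Called when the network becomes accessible again."""
        self.frozen = False

    def tick(self, elapsed):
        """Advance by elapsed time units; return True once expired."""
        if not self.frozen:
            self.remaining -= elapsed
        return self.remaining <= 0


timer = FreezableTimer(timeout=5)
timer.tick(3)            # 2 units left
timer.freeze()
print(timer.tick(100))   # False: the inaccessibility period is not counted
timer.resume()
print(timer.tick(2))     # True: the timeout expires after 5 accessible units
```

With this approach the timeout can be dimensioned without adding Tina to it.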
35LAN Redundancy
- Enforcement of bounded omission degree and bounded inaccessibility can be obtained through redundancy in the physical and medium layers
- FDDI has a dual reconfiguring ring capable of surviving just one interruption.
- Token-bus and Ethernet have no standardised redundancy.
- Extra measures have to be implemented to survive multiple failures.
36Dual Media Token Bus LAN
[Figure: Dual Media Token Bus LAN. Blocks: higher-level protocols, medium-access control VLSI, selector state machines, two physical layers]
37Addressing
- Efficient and timely to meet real-time requirements.
- Reception of frames not addressed to anyone in the node has to be avoided
- Frame addressing involves the following
- Construction of the address at frame transmission
- Interpretation of the address of the passing or received frame
- Address formats correspond to (type, addressing mode)
- Type performs the first step in selection: it points to a set of possible filters
- Mode selects the appropriate filter.
38Addressing
- Classification of several addressing modes
- Individual: enables a sender to address a particular station by its physical address.
- Broadcast: enables a frame to be accepted in all nodes.
- Logical: intended to address a given group of nodes identified by an n-bit gate address independent of their location and number.
- Selective: consists of an n-bit binary chain in which each bit represents a node. The association between a station and a bit can be static or dynamic.
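Selective addressing can be sketched as a bitmask with one bit per station. The functions `selective_address` and `accepts` and the static station-to-bit table `bit_of` are hypothetical and do not reproduce any particular frame format.

```python
def selective_address(targets, bit_of):
    """Build the n-bit address with one bit set per target station."""
    mask = 0
    for station in targets:
        mask |= 1 << bit_of[station]
    return mask


def accepts(mask, station, bit_of):
    """A station accepts the frame only if its own bit is set."""
    return bool(mask & (1 << bit_of[station]))


# Static association between stations and bit positions (assumed).
bit_of = {"ST1": 0, "ST2": 1, "ST3": 2, "ST4": 3}
mask = selective_address(["ST1", "ST3"], bit_of)
print(format(mask, "04b"))            # 0101
print(accepts(mask, "ST3", bit_of))   # True
print(accepts(mask, "ST2", bit_of))   # False
```

The filter is a single bit test, so a node can reject frames not addressed to it cheaply.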
39Processor Group Membership
- It provides a map of the nodes belonging to the group.
- It is independent of higher level groupings of processes.
- It maintains an Active Stations Table (AST)
- AST provides the station ordering and a basic mask where stations are marked up or down
ST1 ST2 ST3 ST4
up up up down
40Processor Group Membership
- Categories of events that PGM responds to
- Insert/Delete
- Join/Leave
- Failure
- PGM functions
- Maintenance of AST: responds to insert/delete requests
- Provision of Short Addresses: reference a node by its position in the AST
- Failure and Group Change Handling: acts upon suspicion of failure that may come from a network driver, group communication protocol, etc.
- Information about group members: can respond to a number of requests regarding group members.
41Clockless PGM Protocol
- Delta-4 System
- A GroupChangeEvent for join, leave or failure cases triggers the protocol.
- In case of failure, a component detecting failure issues the check request. The node requests the other members' state.
- The node gets replies and constructs the new AST. It sends it out to members. This is done using Tx-with-Reply to make sure all members install the new table.
- The first message locks the table so that competitors are left out
- With omissions, more than one competitor may lock subsets of the nodes
- Each of them retries, incrementing a lock_level counter, until one of them locks all nodes successfully and then proceeds
42Clockless PGM Protocol
[Figure: Clockless PGM protocol message exchange. Labels: Compute station table, Group change event, GetState (and lock), NewState (unlock), My state, Installed. a) StationTableOps: Insert, Delete, Down, Up]
43Clock-driven PGM Protocol
- AAS System
- Two events trigger the protocol: a request (such as join) or the passage of time
- Periodically membership management is done to ensure changes are detected in bounded time
- Group communication is through diffusion. The only way to detect failures is through such a protocol.
- All processors diffuse an "I'm alive" message so that every processor builds the same view of the processors alive.
44Time-Triggered PGM Protocol
- MARS System
- Periodically all nodes broadcast their message
- Each message is sent twice to overcome omission
- Each processor listens to all transmissions, making a vector of dimension N, where N is the number of nodes. Vu,v is a boolean which is true when processor u saw a valid message from processor v
- Vector V is then sent in the following period's transmission. All processors receive N vectors
- A matrix is built as follows
- Each column u accounts for the messages Pu saw from all others
- Each row v accounts for the messages from Pv seen by all the others
45Time-Triggered PGM Protocol
- This protocol detects failures with one cycle delay at most.
- Matrices may not be equal in all nodes. They are guaranteed to have enough information to deterministically detect a failed processor.
- A failed processor is one that fails to transmit both copies of its message to all or fails to receive both copies of another node's message
- P1 V2,1 V3,1 V4,1
- P2 V1,2 V3,2 V4,2
- P3 V1,3 V2,3 V4,3
- P4 V1,4 V2,4 V3,4
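A minimal sketch of how the matrix can be read, assuming the received vectors are stored as a dictionary indexed first by the observer u and then by the observed processor v. The helpers `row` and `column` are hypothetical, and the final check is only one piece of evidence (a processor nobody saw); it is not the full decision rule of the protocol.

```python
def row(matrix, v):
    """Row v: for each other processor u, whether u saw a message from Pv."""
    return {u: matrix[u][v] for u in matrix if u != v}


def column(matrix, u):
    """Column u: for each other processor v, whether Pu saw a message from Pv."""
    return dict(matrix[u])


# Vectors received in one cycle; here P4's transmissions reached nobody.
vectors = {
    1: {2: True, 3: True, 4: False},
    2: {1: True, 3: True, 4: False},
    3: {1: True, 2: True, 4: False},
    4: {1: True, 2: True, 3: True},
}

print(row(vectors, 4))  # {1: False, 2: False, 3: False}
unseen = [v for v in vectors if not any(row(vectors, v).values())]
print(unseen)  # [4]
```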
46Summary
- Real time communication
- Real time networks
- Real time protocols
- Real time networking and reliability policies
- Making real-time networks reliable and timely
- Bounded transmission delays
- Handling failures
- Inaccessibility
47Summary
- Low level protocols assist high level protocols in attaining
- Transmission reliability
- Selective and logical addressing