1
Performability Modeling and Fault-Tolerant
Communication Systems
MURI Review @ Berkeley, June 2001
  • Dr. Kishor S. Trivedi
  • Dr. Yonghuan Cao
  • Center for Advanced Computing and Communication
    (CACC)
  • Dept. of Electrical and Computer Engineering
  • Duke University
  • Email: kst@ee.duke.edu

2
Agenda
  • Introduction
  • Motivation, Objective and Methodology
  • Accomplishment
  • Performability modeling of wireless mobile
    networks
  • Multicast Logical Information Feedback Tree
    (LIFT)
  • Progress in TCP performance modeling w/ SDE
  • Looking Forward
  • Ongoing and Future work
  • Potential Connections w/ others
  • Conclusion

3
Introduction
  • Overview of research at Duke CACC: our
    motivation, objective, and methodology

4
A Research Triangle
  • Theory: SRN, MRSPN, FSPN, NHCTMC, SDE
  • Applications: Real-time Systems, Fault-tolerant Systems, Computer
    Networks, Wireless Communication
  • Tools: SPNP, SHARPE, SREPT
5
Stochastic Modeling
  • Stochastic Processes
      • Discrete-time Markov Chains (DTMC)
      • Continuous-time Markov Chains (CTMC)
      • Semi-Markov Processes (SMP)
      • Markov Regenerative Processes (MRGP)
  • Stochastic Formalisms
      • Stochastic Reward Nets (SRN)
      • Markov Regenerative Stochastic Reward Nets
        (MRSRN)
      • Fluid Stochastic Petri Nets (FSPN)
      • Stochastic Differential Equations (SDE)
  • Automated Tool
      • Stochastic Petri Net Package (SPNP)

6
Architecture of SPNP
[Figure: SPNP architecture. Stochastic Reward Net (SRN) models cover Markovian, non-Markovian, and fluid stochastic Petri nets (FSPN). A model is solved either by an analytic-numeric method via the reachability graph or by discrete event simulation (DES); each path offers steady-state and transient solvers.]
  • Analytic-numeric, steady-state: SOR, Gauss-Seidel, Power Method
  • Analytic-numeric, transient: Std. Uniformization, Fox-Glynn Method, Stiff Uniformization
  • DES, steady-state: Batch Means, Regenerative Simulation
  • DES, transient: Indep. Replication, Restart
  • DES, fast methods: Importance Sampling, Importance Splitting
7
What Degrades Service?
  • Resource limits (channels, buffer, bandwidth): long waiting time,
    time-out, service blocking (resource FULL)
  • Outage and recovery (failures, upgrades, maintenance, human errors):
    incomplete service, loss of information (resource LOSS)
8
Need Performability Modeling
  • New technologies, services, and standards need new
    models
  • A traditional performance model may not be
    applicable without proper treatment
  • Pure performance modeling is too optimistic!
  • Outage-and-recovery behavior is not considered

Performability modeling: Performance + Availability = Performability.
It gives a more complete and balanced picture.
Both steady-state and transient solutions are informative.
9
Accomplishment-1
  • Performability Modeling of Wireless Mobile Systems

10
Wireless Mobile Challenges
  • Restricted Spectrum
      • Scarce bandwidth (10 Kbps, 100 Kbps, 4 Mbps)
  • Error-prone link
      • Channel fading, multipath, building blocking
  • High mobility
      • Needs mobility management
      • Complex distributed location DBs
      • Further complicated by data services (mobile IP)
  • Service diversity
      • Traditional voice/paging
      • Increasing demand for data services
        (email, stock, www, ...)

11
Topics Studied
  • Performability of control channel protection in
    cellular system
  • Uplink performance of wireless packet-switched
    data (A 2.5G system, GPRS)
  • Performance of wireless downlink scheduling
    policies
  • The performance impact of access delay on
    capacity-on-demand multiple access

12
Performability Modeling and Optimization of
Cellular Systems with Control Channel Failure and
Automatic Protection Switch (APS)
  • Y. Cao, H.-R. Sun and K. S. Trivedi,
    Performability Analysis of TDMA Cellular Systems,
    PQNet2000, Japan, Nov., 2000.
  • H.-R. Sun, Y. Cao, K. S. Trivedi and J. J. Han,
    Method and Apparatus for control channel
    restoration in cellular systems, patent filed,
    2000

13
A TDMA Cellular System
  • Each cell has Nb base repeaters (BR)
  • Each BR provides M TDM channels
  • One control channel resides in one of the BRs

14
Traffic In a Cell
[Figure: traffic in a cell, served from a common channel pool.]
15
Automatic Protection Switch (APS)
  • Upon control_down, the failed control channel is
    automatically switched to a channel on a working
    base repeater.

16
Performance Measures
  • New call blocking probability, Pb
      • Percentage of new calls rejected
  • Handoff call dropping probability, Pd
      • Percentage of calls forcibly terminated
        while crossing cells
  • Channel utilization, Uc
      • Fraction of time in which the available channel
        resource is in use

Pb, Pd, and Uc are determined not only by system
parameters (such as no. of channels, call
admission control scheme, etc.), but also by
incoming traffic characteristics and call
duration distributions.
17
Model of System w/o APS
CTMC state (b, k): b = no. of BRs up, k = no. of
talking channels
18
Model of System w/ APS
CTMC state (b, k): b = no. of BRs up, k = no. of
talking channels
A Segment of the Composite Markov Chain Model
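As a rough illustration of why failures matter, the sketch below combines availability and performance by decomposition rather than by the composite (b, k) Markov chain used on the slides. It assumes that base repeaters fail and are repaired independently, that one channel is always reserved for control, and that blocking for a given number of working BRs follows the Erlang-B formula; handoff traffic and APS are ignored. The function names and all parameter values are hypothetical.

```python
from math import comb

def erlang_b(servers: int, offered_load: float) -> float:
    """Erlang-B blocking probability, computed with the standard recursion."""
    if servers <= 0:
        return 1.0  # no talking channel: every call is blocked
    b = 1.0
    for n in range(1, servers + 1):
        b = offered_load * b / (n + offered_load * b)
    return b

def cell_blocking(nb, m, offered_load, fail_rate, repair_rate):
    """Blocking probability averaged over the number of working BRs.

    Assumption: each of the nb base repeaters is an independent two-state
    (up/down) component, so the number of BRs up is binomial.
    """
    avail = repair_rate / (fail_rate + repair_rate)
    total = 0.0
    for b in range(nb + 1):
        p_b = comb(nb, b) * avail**b * (1.0 - avail) ** (nb - b)
        talking = max(b * m - 1, 0)  # assumption: one channel carries control
        total += p_b * erlang_b(talking, offered_load)
    return total

if __name__ == "__main__":
    pure_performance = cell_blocking(4, 3, 6.0, 0.0, 1.0)
    with_failures = cell_blocking(4, 3, 6.0, 0.01, 1.0)
    print(pure_performance, with_failures)
```

With a zero failure rate the result reduces to the pure performance model; any positive failure rate gives a larger blocking probability, which is the sense in which a pure performance model is optimistic.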
19
Numerical Results
Handoff Call Blocking Probability Improvement
by APS
Unavailability in handoff call dropping
probability
20
Packet-level Performance Analysis of ALOHA
Reservation-based MAC in GPRS under Bursty Data
Traffic
Y. Cao, H.-R. Sun and K. S. Trivedi, Performance
Analysis of Reservation-based Media Access
Protocol with Access Queue and Serving Queue
under Bursty Traffic in GPRS/EGPRS, Wireless
Network (in review), January, 2001.
21
Background
  • GPRS, a 2.5G system, evolves today's TDMA-based
    GSM and tdmaOne toward 3G.
  • Circuit-switched voice and packet-switched data
    services coexist. Voice has higher priority.
  • Capacity-on-demand concept and multi-slot
    capability. Theoretical data rate up to 172 kbps.

22
Uplink Data Transfer
  • Slotted-ALOHA Reservation Protocol
  • Capture capability to reduce collision
  • Access queue (AQ) to alleviate contention
  • Serving queue (SQ)
  • Cross the TDMA frame boundaries, dynamic channel
    allocation
  • Bursty data traffic

23
GPRS/GSM Architecture
[Figure: GPRS/GSM architecture. Labeled elements: PSTN, MSC/VLR, HLR, BSC, SS7, BTS, EIR, MS.]
24
Protocol Stack Segmentation
Application
IP/X.25
SNDCP
LLC
RLC
MAC
GSM RF
Mobile terminal
25
The SRN Model
[Figure: the SRN model. Labeled parts: on-off LLC arrivals, finite buffer, connection, pmf of LLC frame size, the tagged mobile, and the remaining (N-1) mobiles.]
26
Model Accuracy
Simulation: 95% CI, written in C
SRN model: using SPNP
27
Components of Frame Delay
  • Waiting time in access queue dominates delay (due
    to limited channel).
  • Contention delay negligible due to AQ and
    capture.

28
Performance of Queue Length and Channel Quality
Based Wireless Scheduling Policies
Y. Cao, H.-R. Sun and K. S. Trivedi, Performance
of queue length and channel quality based
wireless scheduling policies, CACC Technical
Report, March, 2001.
29
The Problem
[Figure: a scheduling scenario. Sources A, B, and C on the wired network send to mobiles a, b, and c.]
In one time slot, only one of the three downlink
streams (A-a, B-b, C-c) is allowed to transmit!
Which to choose?
30
Another Look
[Figure: incoming traffic reaches the scheduler at the base station, which serves terminals a, b, and c over the wireless link.]
31
Harder Than Wire-line
Wire-line scheduling always assumes error-free
links w/ high bandwidth.
  • Wireless Link
      • High error rates / bursty errors
      • Location-dependent capacity
      • Time-varying link quality
      • Very low bandwidth

Wireless scheduling needs to consider
time-varying channel quality.
32
A Quality-aware Scheduler
  • Two Schedulers
      • Naïve Round Robin (NRR)
      • Best-Quality-First (BQF)

[Figure: link capacity at time t for mobiles a, b, and c, with the slots served by NRR and by BQF and the resulting throughput under backlogged traffic.]
BQF is throughput optimal!
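The comparison between NRR and BQF under backlogged traffic can be sketched in a few lines. The capacity trace below is made up for illustration, and the function names are hypothetical; each row holds the link capacities of the mobiles in one time slot.

```python
def nrr_throughput(capacity):
    """Naive round robin: slot t serves mobile t mod n, whatever its link state."""
    n = len(capacity[0])
    return sum(slot[t % n] for t, slot in enumerate(capacity))

def bqf_throughput(capacity):
    """Best-quality-first: each slot serves the mobile with the best link."""
    return sum(max(slot) for slot in capacity)

if __name__ == "__main__":
    # Hypothetical capacities for mobiles (a, b, c) over four slots.
    trace = [[1, 3, 2], [2, 1, 4], [5, 2, 1], [1, 3, 1]]
    print("NRR:", nrr_throughput(trace))  # 4
    print("BQF:", bqf_throughput(trace))  # 15
```

Because every mobile always has data to send in this setting, picking the best link in each slot can never do worse than a fixed rotation.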
33
Problem with BQF
Starvation may occur to queues with low average
quality.
Good channel
Bad channels
Queues with bad channels blow up.
A scheduler needs to take into account not only
link quality but also queue length.
34
GWQL Scheduling
[Figure: n queues with lengths q1(t), q2(t), ..., qn(t) and link qualities m1(t), m2(t), ..., mn(t).]
Generalized Weighted Queue Length (GWQL) scheduling:
  • Define the score $w_i = z_i\, q_i(t)\, m_i(t)$, with $z_i > 0$.
  • In each time slot, data is transmitted to the
    mobile with the highest score.
  • In case of a tie, one of them is randomly chosen.
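A minimal sketch of the GWQL decision for one time slot follows. The function name and the convention of returning None when every score is zero (all queues empty or no usable link) are assumptions, not part of the slide.

```python
import random

def gwql_pick(z, q, m, rng=random):
    """Return the index of the mobile to serve in this slot, or None.

    z: weights z_i > 0, q: queue lengths q_i(t), m: link qualities m_i(t).
    """
    scores = [zi * qi * mi for zi, qi, mi in zip(z, q, m)]
    best = max(scores)
    if best <= 0:
        return None  # assumption: nothing worth transmitting in this slot
    winners = [i for i, s in enumerate(scores) if s == best]
    return rng.choice(winners)  # ties are broken at random

if __name__ == "__main__":
    # Scores are 5.0, 8.0, and 4.5, so mobile 1 is served.
    print(gwql_pick([1, 1, 1], [5, 2, 9], [1.0, 4.0, 0.5]))
```

With equal weights and equal link qualities the rule picks the longest queue; with equal weights and equal queue lengths it picks the best link.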
35
What GWQL Can Be?
GWQL (Generalized Weighted Queue Length): $w_i = z_i\, q_i(t)\, m_i(t)$
  • $z_i = 1$: WQL (Weighted Queue Length)
      • $m_i(t) = m$ for all mobiles: LQF (Longest-queue-first)
      • $q_i(t) = q$ for all mobiles: BQF (Best-quality-first)
36
What Makes a Good Scheduler?
Properties of an ideal scheduler, and how GWQL fares:
  • Simplicity: Yes. Only queue length and link status!
  • Flow isolation: Depends on traffic pattern and channel variation?
  • Optimal throughput utilization: Yes. Throughput optimal!
    [Tassiulas92, McKeown96, Wasserman97]
  • Heterogeneous QoS guarantee: Need to set zi properly?
37
Need A Performance Model
  • To study the performance impact of traffic
    burstiness and channel variation.
  • To evaluate the capability of satisfying
    heterogeneous QoS requirements.

38
Traffic Model
  • Markov Modulated Poisson Process (MMPP) [FMH92]
      • Able to capture inter-arrival correlations
      • Able to characterize traffic burstiness
      • Yet still analytically tractable!

[Figure: a 2-state MMPP traffic model, with states labeled OFF and ON.]
39
Channel Model
  • Bursty-error Channel Model
      • The well-known Gilbert-Elliott model [Gil60]
      • A two-state Markov chain (Good, Bad)
  • Extension to the GE model
      • Finite-state Markov channel [Wang95]
      • Model parameters can be derived from the channel
        fading distribution and mobile speed.

[Figure: two-state channel model with states Good and Bad; labels a0, a1, m1, and m2.]
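For a two-state channel of this kind, the long-run fraction of time in each state and the mean link rate follow directly from the transition rates. The sketch below assumes that a0 is the Good-to-Bad rate, a1 is the Bad-to-Good rate, and m1 and m2 are the link rates in the Good and Bad states; this reading of the figure labels is an assumption, and the function names are hypothetical.

```python
def ge_stationary(a0, a1):
    """Stationary probabilities (P[Good], P[Bad]) of a two-state Markov chain.

    Assumption: a0 is the Good->Bad rate and a1 is the Bad->Good rate.
    """
    total = a0 + a1
    return a1 / total, a0 / total

def ge_mean_rate(a0, a1, m_good, m_bad):
    """Long-run mean link rate E[m] under the stationary distribution."""
    p_good, p_bad = ge_stationary(a0, a1)
    return p_good * m_good + p_bad * m_bad

if __name__ == "__main__":
    print(ge_stationary(1.0, 3.0))           # (0.75, 0.25)
    print(ge_mean_rate(1.0, 3.0, 2.0, 0.0))  # 1.5
```

Under this reading, the mean duration of a Bad period is 1/a1 and the mean duration of a Good period is 1/a0.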
40
A Stochastic Petri Net Model
Building the Markov chain by hand is tedious and
not necessary. We use stochastic Petri net (SPN).
[Figure: SPN model with a two-state MMPP traffic source, a finite buffer, and a link model; labels include L, a0, a1, s0, l, and m (g).]
GWQL scheduling policy is embodied in the guard
function (g).
41
Measures of Interest
  • Blocking Probability
      • The probability that an arriving packet sees a
        full queue.
  • Packet Delay
      • The response time experienced by a packet
        accepted to the queue.
  • Individual and System Throughput
      • The number of packets transmitted per unit time.

42
Burstiness Effect
[Figure: blocking probability vs. burstiness (IDC). Queue q1(t) has Poisson arrivals and queue q2(t) has MMPP arrivals; links m1(t) and m2(t) are the same channels.]
Burstiness measure of MMPP: Index of Dispersion
for Counts (IDC). IDC = 1 for Poisson; IDC > 1 for
MMPP.
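For a two-state MMPP, the mean arrival rate and the limiting IDC have closed forms, sketched below. The parameter names are assumptions: lam1 and lam2 are the Poisson rates in the two states, r1 is the rate of leaving state 1, and r2 is the rate of leaving state 2. The limit shown is the long-interval value of Var[N(t)]/E[N(t)]; the slides may use the IDC over a finite interval instead.

```python
def mmpp2_mean_rate(lam1, lam2, r1, r2):
    """Mean arrival rate: state 1 is occupied a fraction r2 / (r1 + r2) of the time."""
    return (lam1 * r2 + lam2 * r1) / (r1 + r2)

def mmpp2_idc_limit(lam1, lam2, r1, r2):
    """Limiting index of dispersion for counts of a two-state MMPP."""
    num = 2.0 * r1 * r2 * (lam1 - lam2) ** 2
    den = (r1 + r2) ** 2 * (lam1 * r2 + lam2 * r1)
    return 1.0 + num / den

if __name__ == "__main__":
    print(mmpp2_idc_limit(5.0, 5.0, 1.0, 1.0))   # 1.0: equal rates give a Poisson process
    print(mmpp2_idc_limit(10.0, 0.0, 1.0, 1.0))  # 6.0: an on-off source is burstier
```

Slowing the modulating chain (smaller r1 and r2 at a fixed ratio) keeps the mean rate unchanged but raises the IDC, which is one way to vary burstiness at a constant load.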
43
Burstiness Impact
[Figures: packet delay and throughput vs. burstiness (IDC).]
44
Channel Variation
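[Figure: blocking probability vs. bad duration. Queues q1(t) and q2(t) both have Poisson arrivals; links are m1(t) and m2(t).]
Measure of channel variation: the squared coefficient
of variation, $C_m^2 = \mathrm{Var}[m] / E[m]^2$.
$C_m^2 \propto 1/a_1$ (the bad duration), if $E[m]$ is fixed.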
45
Effect of Channel Variation
[Figures: throughput and packet delay vs. bad duration.]
46
Tuning GWQL
Performance of an individual mobile is bounded:
  • Upper bound: when zi → very large, the queue always
    has the highest priority and is served whenever it
    is not empty.
  • Lower bound: when zi → very small, the queue always
    has the lowest priority and is served only when all
    other queues are empty.
Each bound is also determined by the channel.
47
QoS Tuning Capability
[Figure: blocking probability vs. z2/z1. Queues q1(t) and q2(t) both have Poisson arrivals and identical channels m1(t) and m2(t); their weights are z1 and z2.]
48
GWQL Tuning Capability
[Figures: packet delay and throughput vs. z2/z1.]
49
GWQL Conclusion
  • Traffic burstiness deteriorates not only one
    mobile but also the other mobiles sharing the
    same link.
      • Traffic regulation is needed for flow
        isolation.
  • Large channel variation has a significant negative
    impact on all mobiles.
      • Second-moment channel information may improve performance.
  • Tuning capability is bounded. Performance
    appears sensitive to the values of the zi.
      • The model developed is useful in the search for
        proper zi.

50
The Effect of Access Delay in Capacity-on-demand
over a Wireless Link Under Bursty
Packet-switched Data
Y. Cao, H.-R. Sun and K. S. Trivedi, The Effect of
Access Delay in Capacity-on-demand over a
Wireless Link Under Bursty Packet-switched Data,
Performance Evaluation (submitted), March, 2001.
51
Problem Definition
[Figure: access scenario with mobiles a, b, and c.]
Radio resource (the number of channels) is
limited. A number of mobiles with data to send
compete for radio links. A mobile may experience
access delay. How does access delay affect
individual performance?
52
Capacity-on-demand
Today's common wireless data applications (www,
email, stock, ...)
[Figure: Traffic A and Traffic B over a call (session) duration.]
For Traffic A, it is worth dedicating a channel for
the entire call duration. For Traffic B, that is not a
good idea: it wastes resources in silent periods.
Capacity-on-demand optimizes the utilization
of radio links: establish a connection only when
there is data to send, and release the connection once the data
is emptied.
53
Impact of Access Delay
[Figure: traffic and connection timelines. A = access delay, S = service.]
Packets may be dropped if access takes too long.
Access delay may cause buffer overflow, long
waiting-time, etc.
54
Cause of Access Delay
  • Access delay is determined by a strongly coupled
    system
      • The number of mobiles
      • Traffic pattern on each mobile (user
        behavior)
      • Available radio resource (number of channels)
      • The particular multiple access (MA) mechanism

The distribution of access delay is virtually
unknown and can be arbitrarily general.
55
Objective
Random variable A: access delay. We want to understand:
1. How the distribution (shape) of access delay may affect performance.
2. Whether the mean value E[A] is enough.
3. Whether a simple distribution (such as exponential) can be
used for a good approximation.
56
A Queueing Model
[Figure: MMPP arrivals enter a buffer of size L, served by a general (G) server with activation.]
MMPP/G/1/L with server activation
Note:
1. MMPP arrivals: bursty traffic
2. Service time (G): arriving packets of different sizes
3. Server activation (A): link access delay
57
Exhaustive Principle
Once connection is established, all buffered data
and arrivals during the connection will be
transmitted. Connection is released immediately
after buffer is emptied.
58
Model Analysis
A state (l, s, m):
  • l: no. of packets in buffer (0, 1, ..., L)
  • s: server off/on (0, 1)
  • m: MMPP state (1, 2, ..., M)
State-space based approach
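The size of the state space follows from the ranges of l, s, and m. The sketch below simply enumerates every (l, s, m) triple; the function name is hypothetical, and the list is an upper bound, since some triples may not be reachable in the actual model.

```python
from itertools import product

def state_space(buffer_size, mmpp_states):
    """Enumerate all (l, s, m) triples.

    l: packets in buffer (0..L), s: server off/on (0 or 1), m: MMPP state (1..M).
    """
    return list(product(range(buffer_size + 1), (0, 1), range(1, mmpp_states + 1)))

if __name__ == "__main__":
    states = state_space(3, 2)
    index = {state: i for i, state in enumerate(states)}  # row/column index per state
    print(len(states))  # (L + 1) * 2 * M = 16
```

An index map like this is the usual first step before filling in the transition structure of a state-space model.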
59
2 Types of Transitions
[Figure: the MMPP counting process with the server off and the server on. Exponential transitions and general transitions are drawn with different arrow styles.]
60
An MRGP
In a semi-Markov process (SMP), state does not
change between two consecutive regenerative
points. When a general transition is enabled,
the exponential transitions (of the MMPP counting
process) keep going on and state may change. The
process is more complicated than an SMP. It is a
Markov regenerative process (MRGP).
61
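CTMC vs. SMP vs. MRGP
[Figure: sample paths (state vs. t) for a CTMC, with exponential sojourn times (T ~ Exp); an SMP, with general sojourn times (T ~ Gen); and an MRGP, with general regeneration times (T1 ~ Gen) and exponential transitions in between.]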
62
MRGP Analysis
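Two kernels:
  • Global kernel $K(t) = [K_{ij}(t)]$, with
    $K_{ij}(t) = \Pr\{Y_1 = j,\ T_1 < t \mid Y_0 = i\}$
  • Local kernel $E(t) = [E_{ij}(t)]$, with
    $E_{ij}(t) = \Pr\{Z(t) = j,\ T_1 > t \mid Z(0) = i\}$
Define $V(t) = [V_{ij}(t)]$, with $V_{ij}(t) = \Pr\{Z(t) = j \mid Y_0 = i\}$. Then
$V(t) = E(t) + K * V(t)$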
63
Steady-state Solution
1. Steady-state solution of the embedded DTMC
with $P = K(\infty)$
2. The integral (uniformization method used)
3. The steady-state probability vector
4. Measures of interest can be derived from $\pi$
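As a hedged sketch, the standard textbook form of the MRGP steady-state solution, which steps 1 to 3 appear to follow, is written out below. The symbols $v$, $e$ (a column vector of ones), and $\alpha_{ij}$ are assumed notation, not taken from the slide.

```latex
% Step 1: stationary vector of the embedded DTMC
v = v P, \qquad v e = 1, \qquad P = K(\infty)

% Step 2: mean time spent in state j during a regeneration
% period that starts in state i
\alpha_{ij} = \int_0^\infty E_{ij}(t)\, dt

% Step 3: steady-state probability of state j
\pi_j = \frac{\sum_k v_k\, \alpha_{kj}}{\sum_k v_k \sum_l \alpha_{kl}}
```

The denominator is the mean length of a regeneration period, so $\pi_j$ is the expected fraction of a period spent in state $j$.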
64
Measures of Interest
  • Blocking Probability (Pb)
      • The probability that an arriving packet sees a
        full queue.
  • Packet Delay (t)
      • The response time experienced by a packet
        accepted to the queue.
  • Activation Rate (rA)
      • Number of times that the server needs to set up
        per unit time. Overhead of capacity-on-demand.

65
Effect of Access Delay
1. Traffic model: two-state MMPP
2. Service time: pmf of packet size
  • Distributions of access delay
      • Exponential
      • 2-stage Erlang
      • 3-stage Erlang
      • Deterministic
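These four distributions can share the same mean E[A] while differing in variability, which is what makes them a useful test of sensitivity to the shape of the distribution. The sketch below lists the variance and squared coefficient of variation for a given mean; the function name is hypothetical.

```python
def access_delay_moments(mean):
    """Map each distribution to (mean, variance, squared coefficient of variation)."""
    out = {}
    for name, stages in (("exponential", 1), ("erlang-2", 2), ("erlang-3", 3)):
        variance = mean**2 / stages  # an Erlang-k with this mean has variance mean^2 / k
        out[name] = (mean, variance, variance / mean**2)
    out["deterministic"] = (mean, 0.0, 0.0)
    return out

if __name__ == "__main__":
    for name, (mu, var, scv) in access_delay_moments(2.0).items():
        print(f"{name:14s} mean={mu:.2f} var={var:.3f} scv={scv:.3f}")
```

The squared coefficient of variation falls from 1 (exponential) through 1/2 and 1/3 to 0 (deterministic), so the set spans high to zero variability at a fixed mean.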

66
Blocking Probability
[Figure: blocking probability vs. E[A].]
67
Packet Delay
[Figure: mean packet delay vs. E[A].]
68
Server Busy/Idle Time
[Figures: mean busy time and mean idle time vs. E[A].]
69
Effect of Traffic Pattern
Comparison of Poisson arrivals and MMPP (same average rates)
[Figures: mean packet delay and blocking probability vs. E[A].]
70
Activation Rate
[Figure: activation rate for Poisson and MMPP arrivals.]
71
Access Delay Conclusion
  • A general queueing model with server activation
    is used to study the impact of access delay on
    bursty wireless data applications.
  • An efficient numerical method has been developed to
    solve the model.
  • From the numerical results, steady-state performance
    measures appear not very sensitive to the
    distribution of access delay.
  • Good news for further system-level evaluation.

72
Accomplishment-2
  • Logical Information Feedback Tree (LIFT) for
    Many-to-many Distribution

A. Rodriguez, LIFT: Logical Information Feedback
Tree for Information Dissemination in Wide-Area
Networks, Master's Thesis, May, 2001.
73
The Problem
  • Solutions to native IP multicast problems: router
    replacement (too costly, ✗) or network overlays
    (✓)?
  • Overlay nodes should be organized in a
    hierarchical fashion (such as a tree) to limit
    the maximum overlay width and thus reduce
    propagation delay.
  • Applications: Content Distribution Networks
    (CDN), application layer multicast, reliable
    multicast protocols, etc.

74
Logical Information Feedback Tree
  • Basis of LIFT
      • Hyper-Chromatic Tree, a distributed parallel
        version of the Red-Black Tree (RBT) [Messeguer &
        Valles 98]
  • LIFT beyond Hyper-Chromatic Tree
      • Node synchronization to allow the tree to resemble the
        underlying substrate network topology
  • Properties
      • Balanced: height bounded by O(log n).
      • Decentralized: no centralized control/info.
        needed
      • Robust: dynamic node insertion/deletion, etc.

75
Preliminary Results
  • A proposal of the LIFT protocol.
  • LIFT Protocol simulator (based on ns-2).
  • Studied --
  • control packet overhead,
  • tree convergence rate,
  • tree transient behavior under network dynamics
    (reaction to node deletion, insertion, etc.),
  • capability of matching substrate network
    topology, etc.

76
Accomplishment-3
  • TCP Performance Analysis using Stochastic
    Differential Equations (SDE)

77
TCP Performance (Throughput and Goodput) Study
using Stochastic Differential Equations (SDE)
Yiguang Hong, Yonghuan Cao and K. S. Trivedi, A
Note on TCP Throughput and Goodput, Submitted to
IEEE Communication Letters, April, 2001.
78
TCP Goodput and Throughput
  • Previous studies assume a backlogged source (may
    not be true for interactive connections).
  • Previous studies focus more on throughput, but
    goodput may be more important to a user.

[Figure: a single TCP connection over an unreliable link. The sender (TX) sends traffic X(t) through the network to the receiver (RX), with packet loss probability p.]
79
An SDE Formulation
  • TCP source traffic X(t) input to buffer Q(t)
  • Packet loss is Poisson (modeled as a Poisson
    counter, N)
  • TCP window size W(t) increase/decrease due to
    packet loss

R = Round Trip Time (RTT)
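To make the window dynamics concrete, the sketch below simulates one common SDE form of additive-increase, multiplicative-decrease: dW = dt/R - (W/2) dN, where N is a Poisson counter. This form, the constant loss rate, and all names are assumptions for illustration; the formulation used by the authors may differ, for example by tying the loss rate to the sending rate or by including the buffer Q(t).

```python
import random

def mean_window(rtt, loss_rate, horizon, seed=0, w0=1.0):
    """Time-average window size under dW = dt/rtt - (W/2) dN.

    Between losses the window grows linearly with slope 1/rtt; at each
    event of the Poisson counter N (rate loss_rate) the window is halved.
    """
    rng = random.Random(seed)
    t, w, area = 0.0, w0, 0.0
    while True:
        gap = rng.expovariate(loss_rate)  # time until the next loss
        if t + gap >= horizon:
            gap = horizon - t
            area += w * gap + 0.5 * gap * gap / rtt
            break
        area += w * gap + 0.5 * gap * gap / rtt  # integral of W over the gap
        w = 0.5 * (w + gap / rtt)                # grow, then halve at the loss
        t += gap
    return area / horizon

if __name__ == "__main__":
    rtt, loss_rate = 0.1, 1.0
    print(mean_window(rtt, loss_rate, horizon=20000.0, seed=1))
    print(2.0 / (loss_rate * rtt))  # stationary mean implied by the SDE
```

Taking expectations of the assumed SDE gives dE[W]/dt = 1/R - (loss rate) E[W]/2, so the stationary mean is 2/(loss rate × R); the simulated average should be close to that value.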
80
The Critical Point
  • There exists a critical point of loss probability
    (p*), before and after which TCP performs
    differently.
  • Goodput = E[X] when p < p*.
  • The well-known formula is only valid after the
    critical point (p*).
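The slide does not name the formula. Assuming it is the widely cited square-root throughput formula for a backlogged TCP source, roughly sqrt(3/(2p)) packets per round trip, a minimal sketch is given below; the function name is hypothetical.

```python
from math import sqrt

def sqrt_formula_throughput(rtt, p):
    """Square-root formula: throughput in packets per second.

    Assumes a backlogged source, loss probability p > 0, and round trip time rtt.
    """
    return sqrt(1.5 / p) / rtt

if __name__ == "__main__":
    for p in (0.1, 0.015, 0.001):
        print(p, sqrt_formula_throughput(0.1, p))
```

The predicted rate grows without bound as p approaches zero, so the formula cannot describe a source whose own mean rate is finite once losses are rare enough.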

81
Looking Forward
  • Ongoing/future work and potential connections
    with others

82
Wireless Data Networks
Ongoing Work
  • A joint analysis of multiple layers in wireless
    networks has been under way.
  • The accomplished work so far is the basis for
    joint analysis. (GPRS from PHY to LLC)

83
Why Joint Analysis?
  • High availability and high performance (HA/HP)
    at a low layer does not necessarily mean HA/HP at
    higher layers.
  • For example, even a Gb/s network will crash if there is no
    effective routing protocol to evenly distribute
    traffic across the network.
  • Joint analysis can achieve optimal protocol
    design.

84
Challenges
  • Stiffness
      • Different time scales: connection level, burst
        level, packet level, etc.
  • Largeness
      • Different layers, different states; state space
        explosion with a large number of users
  • Potential solutions
      • Hierarchical decomposition, state aggregation,
        layer abstraction, etc.

85
Potential Connections
  • A joint research with Mostafa (Gatech, QoS team)?

86
Performance Modeling of Dynamic Voting Systems
Ongoing Work
  • Performance modeling was ignored in early studies
    due to complexity.
  • The few performance studies of dynamic voting
    systems are oversimplified.
  • We build a comprehensive model with the help of a
    stochastic modeling tool (SPNP) to study
    different realistic dynamic voting schemes.
  • Probabilistic nature of link/node failures,
    delays, etc.

87
Potential Connection
  • Joint research with Nancy's group
    (MIT, Group Communication Service)?

88
Extension to TCP Performance
Future Work
  • Analytical performance modeling (using FSPN/SDE)
    from single connection to multiple connections
  • Congestion control mechanisms: AQM (Active Queue
    Management), RED, etc.
  • Emphasis on fairness, stability analysis, and
    optimization.

89
Extensions to LIFT
Future Work
  • Fault-tolerance in LIFT
  • LIFT is inherently fault-tolerant in detection of
    node disappearance, but need to consider failure
    during a rebalancing operation.
  • Simulation conducted, but need analytical model
    to study performance.
  • Need to study LIFT-based congestion control.

90
Potential Connections
  • Joint research with ?
      • Dr. A. Zakhor: real-time multicast video over
        packet-switched networks [Tan & Zakhor]
      • Dr. K. Shin: feedback synchronization for multicast
        ABR flow control [Zhang & Shin]

91
The End. Thank you!