Title: SIP Server Overload Control: Design and Evaluation
1. SIP Server Overload Control: Design and Evaluation
- Charles Shen and Henning Schulzrinne
- Columbia University
- Erich Nahum
- IBM T.J. Watson Research Center
2. Session Initiation Protocol (SIP)
- Application layer signaling protocol for managing sessions in the Internet
- Runs on top of the transport layer, e.g. UDP, TCP and SCTP
- Typical usage: voice over IP call setup, instant messaging, presence, conferencing
3. SIP Server Overload Problem
- Many causes of an excessive number of messages overwhelming the server
  - Natural disasters and emergency-induced call volume, e.g. an earthquake
  - Predictable special events: Mother's Day
  - Flash crowds: American Idol, "free tickets to the third caller"
  - Denial of service attacks
- Simply dropping requests on overload?
  - SIP has retransmission timers to recover from message loss, especially over UDP
  - E.g., Timer A for INVITE retransmission
    - T1 = 500 ms, increases exponentially until the total timeout period exceeds 32 s
  - Simple message dropping induces more messages due to retransmission!
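To see how dropping amplifies load, the retransmission schedule can be worked out. The following is a minimal sketch, assuming the RFC 3261 defaults over UDP: Timer A starts at T1 and doubles after each retransmission, and the INVITE transaction gives up after 64 x T1 (32 s for T1 = 500 ms). The function name and parameters are illustrative only.

```python
def invite_retransmit_times(t1=0.5, timeout_factor=64):
    """Times (seconds after the first send) at which an unanswered INVITE is resent."""
    times = []
    interval = t1                      # Timer A starts at T1
    elapsed = interval
    deadline = timeout_factor * t1     # assumed transaction timeout: 64 * T1
    while elapsed < deadline:
        times.append(elapsed)
        interval *= 2                  # exponential backoff
        elapsed += interval
    return times


if __name__ == "__main__":
    print(invite_retransmit_times())
```

Under these assumptions a single dropped INVITE can be sent up to seven times (the original plus six retransmissions), which is why silently dropping requests adds load instead of shedding it.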
4. SIP Server Overload Problem (Cont.)
- Rejecting excessive requests upon overload?
  - SIP 503 (Service Unavailable) response code is used to reject an individual request
  - Individual sessions are rejected but the overall sending rate is not reduced
  - Even worse: rejecting a request takes CPU cycles comparable to accepting it!
- 503 (Service Unavailable) with Retry-After?
  - Client is completely shut off during the period specified
  - Reduces the rate in an on/off pattern, which may cause oscillation
- Trying an alternative server?
  - Alternative server may soon be overloaded too -> cascading failure!
- Feedback-based SIP overload control
  - Sender is instructed by the receiver not to send more requests than the receiver can accept in the first place!
5. Feedback-based SIP Overload Control
- Absolute rate feedback
  - RE (receiving entity) estimates a target controlled load (λ') and feeds it back to the SEs (sending entities)
  - SE throttles the offered load (λ) by a percentage Pb = 1 - λ'/λ, so the actual load to the RE conforms to the target load
  - Key is accurate controlled load estimation
- Relative rate feedback (loss-based feedback)
  - RE estimates a load throttle percentage Pb, based on a target metric (e.g. CPU utilization, queue length), and feeds it back to the SEs
  - SE throttles the offered load by Pb to conform to the target controlled load
  - Key is the target metric and the throttle percentage adjustment algorithm
- Window feedback
  - RE estimates a window size that indicates the currently acceptable number of new calls and feeds it back to the SEs
  - SE throttles any new call arrivals while no window slot is available, thus limiting the offered load (λ) to the target controlled load
  - Key is the maximum window setup and the dynamic window adjustment algorithm
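The percentage throttle used by the rate-based schemes above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the random per-call admission test is one possible way to enforce a throttle percentage, not something the slides prescribe.

```python
import random


def throttle_fraction(offered_load, target_load):
    """Fraction of new calls to block so offered load conforms to the target load."""
    if offered_load <= 0:
        return 0.0
    # Pb = 1 - target/offered, never negative when the offered load is below target
    return max(0.0, 1.0 - target_load / offered_load)


def admit_call(pb, rng=random.random):
    """Admit a new call with probability 1 - Pb (assumed enforcement method)."""
    return rng() >= pb


if __name__ == "__main__":
    pb = throttle_fraction(offered_load=100.0, target_load=72.0)
    print(round(pb, 2))
```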
6. SIP Overload Feedback Control Design Considerations: Control Unit
- What is a control unit: a SIP message or a SIP session?
  - Although the signaling is message based, not all messages carry equal weight
  - A typical SIP call contains one INVITE followed by six additional messages
  - A new INVITE is much more expensive than other messages
  - A job, or control unit, is defined as a whole SIP session (e.g. a SIP call)
- How to characterize the end of a SIP session?
  - Can we always expect a BYE as the end of a session?
  - Easier if we can: full session check approach
  - Otherwise, use a dynamic start session check approach
    - Under normal working conditions, the actual session acceptance rate is roughly equal to the session service rate
    - Estimated session service rate is the number of INVITEs accepted per measurement interval
    - Standard smoothing functions can be applied
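The start session check estimate above can be sketched as a per-interval sample with smoothing. The slides do not name a smoothing function; an exponentially weighted moving average is assumed here, and `alpha` is a hypothetical tuning parameter.

```python
def update_service_rate(prev_rate, invites_accepted, interval_s, alpha=0.2):
    """Smoothed session service rate (sessions per second).

    prev_rate: previous smoothed estimate, or None for the first interval.
    invites_accepted: INVITEs accepted during the last measurement interval.
    alpha: EWMA weight of the newest sample (assumed value).
    """
    sample = invites_accepted / interval_s
    if prev_rate is None:
        return sample
    return (1 - alpha) * prev_rate + alpha * sample
```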
7. SIP Overload Feedback Control Design Considerations: Dynamic Session Estimation
- Often need to know the current number of sessions in the server system
  - NOT equal to the number of INVITE messages in the system
  - Non-INVITE messages must also be accounted for!
- Proposed Dynamic Session Estimation Algorithm (DSEA)
  - Nsess = Ninv + Nnoninv / (Lsess - 1)
  - Where Lsess is the estimated session size (number of messages per session)
  - Ninv is the number of INVITE messages in the system
  - Nnoninv is the number of non-INVITE messages in the system
- DSEA holds for both the full session check and start session check approaches
  - They differ in how the Lsess parameter is obtained
  - Full session check: checking the start and end of each individual SIP session
  - Start session check: number of messages processed over number of sessions accepted per unit time
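The DSEA formula above translates directly into code. A minimal sketch follows; the function names are illustrative, and the fallback session size of 7 is an assumption taken from the seven-message call flow used elsewhere in these slides.

```python
def estimate_sessions(n_inv, n_noninv, l_sess):
    """Nsess = Ninv + Nnoninv / (Lsess - 1)."""
    if l_sess <= 1:
        raise ValueError("session size must exceed one message")
    return n_inv + n_noninv / (l_sess - 1)


def estimate_session_size(messages_processed, sessions_accepted, default=7.0):
    """Start session check: messages processed per session accepted in an interval."""
    if sessions_accepted == 0:
        return default  # assumed fallback when no session was accepted
    return messages_processed / sessions_accepted


if __name__ == "__main__":
    l_sess = estimate_session_size(700, 100)
    print(estimate_sessions(n_inv=10, n_noninv=30, l_sess=l_sess))
```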
8. SIP Overload Feedback Control Design Considerations: Active Source Estimation and Feedback Communication
- RE may wish to know the number of active sources, e.g. to explicitly allocate its total capacity among multiple SEs
  - Directly track and maintain a table entry for each currently active SE
  - Each entry has an expiration timer set to one second
- Feedback communication
  - For SIP overload between servers, in-band feedback is appropriate
  - Any feedback information is piggybacked in the next SIP message sent to the corresponding next hop
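The active source table described above could look like the sketch below. The class and method names are hypothetical; the only detail taken from the slide is the one-second expiration per entry. The injectable clock is there to make the behavior easy to check.

```python
import time


class ActiveSourceTable:
    """Tracks which upstream SEs have sent traffic within the expiry period."""

    def __init__(self, expiry_s=1.0, clock=time.monotonic):
        self._expiry_s = expiry_s
        self._clock = clock
        self._last_seen = {}

    def touch(self, source_id):
        """Record a message from source_id, restarting its expiration timer."""
        self._last_seen[source_id] = self._clock()

    def active_count(self):
        """Drop expired entries and return the number of active sources."""
        now = self._clock()
        self._last_seen = {
            src: seen for src, seen in self._last_seen.items()
            if now - seen < self._expiry_s
        }
        return len(self._last_seen)
```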
9. Win-disc Window Control Algorithm
- Principle: estimate and adjust the number of acceptable sessions every control interval
- Decrease window upon new session arrival
- Adjust window every control interval Tc
  - New available window (W) is the total allowed number of sessions in the next interval minus the existing backlog
  - W = µ Tc + µ DB - Nsess
  - µ: current session service rate
  - DB: budget queuing delay (should be smaller than the INVITE timer)
  - Nsess = Ninv + Nnoninv / (Lsess - 1) is the current number of sessions in the system
- Initial window: suggested W0 = µeng Tc, where µeng is the engineered server capacity
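The win-disc update can be sketched as two small functions. The names are illustrative, and the numbers in the example are made up for easy arithmetic; they are not simulation settings from the slides.

```python
def win_disc_initial_window(mu_eng, tc):
    """Suggested initial window: W0 = mu_eng * Tc."""
    return mu_eng * tc


def win_disc_window(mu, tc, db, n_sess):
    """Window for the next control interval: W = mu*Tc + mu*DB - Nsess.

    mu: current session service rate (sessions/s)
    tc: control interval (s)
    db: budget queuing delay (s)
    n_sess: estimated sessions currently in the system (backlog)
    """
    return mu * tc + mu * db - n_sess


if __name__ == "__main__":
    w = win_disc_initial_window(mu_eng=80, tc=0.25)
    w -= 1  # one new session arrives during the interval
    w = win_disc_window(mu=64, tc=0.25, db=0.25, n_sess=10)
    print(w)
```

A negative result means the backlog already exceeds what the delay budget allows; how to treat that case (for example clamping at zero) is left open here.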
10. Win-cont Window Control Algorithm
- Principle: continuously keep the estimated number of existing sessions in the system below a target number
- Decrease window size upon new session arrival (enqueueing INVITE)
- Increase available window size (W) when the currently estimated number of existing sessions is smaller than the maximum allowed number of jobs
  - W = µ DB - Nsess
  - µ DB is equal to the maximum allowed number of sessions in the system (max window size)
  - Nsess = Ninv + Nnoninv / (Lsess - 1) is the current number of sessions in the system
- Initial window: suggested W0 = µeng Tc, where µeng is the engineered server capacity
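The win-cont window computation can be sketched as below. This is one reading of the slide: window is made available only while the estimated session count is below the maximum µ DB, so the result is clamped at zero. That clamp and the function name are assumptions.

```python
def win_cont_available(mu, db, n_sess):
    """Available window: W = mu*DB - Nsess, or 0 when Nsess has reached mu*DB.

    mu: current session service rate (sessions/s)
    db: budget queuing delay (s); mu*db is the maximum window size
    n_sess: estimated sessions currently in the system
    """
    max_sessions = mu * db
    return max(0.0, max_sessions - n_sess)
```

Unlike win-disc, no control interval appears in the update, so the computation can run continuously, for instance each time a message is processed.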
11. Win-auto Window Control Algorithm
- Principle: simple window adaptation that automatically slows down when the system is congested
- Decrease window size by one upon new session arrival (receiving INVITE)
- Increase window by one upon dequeueing a NEW INVITE (not a retransmission)
  - Therefore, window increase is slower than window decrease
  - System adapts itself to a steady state with a fairly low dynamic available window
- Initial window: suggested W0 is a reasonably large positive value; the exact value is not important
- Biggest advantage: simple
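A minimal sketch of the win-auto bookkeeping follows. It assumes the window shrinks on every INVITE received, retransmissions included, and grows only when a new INVITE is dequeued; this reading is what makes the increase slower than the decrease under congestion. The class name and the default W0 of 100 are hypothetical.

```python
from collections import deque


class WinAuto:
    """Self-adjusting window: -1 per INVITE received, +1 per NEW INVITE dequeued."""

    def __init__(self, w0=100):
        self.window = w0          # assumed "reasonably large" initial value
        self.queue = deque()

    def on_invite_received(self, call_id, is_retransmission=False):
        self.window -= 1          # every received INVITE consumes a slot
        self.queue.append((call_id, is_retransmission))

    def on_dequeue(self):
        call_id, is_retransmission = self.queue.popleft()
        if not is_retransmission:
            self.window += 1      # only new INVITEs give the slot back
        return call_id
```

When queuing delay grows large enough to trigger retransmissions, each retransmission permanently removes a slot, so the window shrinks without any explicit measurement.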
12. rate-abs: Absolute Rate Based Control
- During every control interval Tc, the RE notifies the SE of the new target load λ'
  - λ' = µ [1 - (dq - DB) / Tc]
  - µ: the current estimated service rate
  - dq = Nsess / µ: queuing delay at the last measurement interval, where Nsess is the current number of sessions in the server, obtained using our Dynamic Session Estimation Algorithm
- The SE does percentage throttle to limit the offered load to the RE within the feedback assignment for each control interval
Algorithm proposed by Hosein et al.
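The rate-abs target load can be sketched as a single function. The name is illustrative, and clamping the result at zero is an assumption added so that a very long queue never yields a negative rate; the example numbers are chosen for easy arithmetic.

```python
def rate_abs_target(mu, n_sess, db, tc):
    """Target load for the next interval: mu * (1 - (dq - DB) / Tc).

    mu: current estimated service rate (sessions/s)
    n_sess: estimated sessions in the server (e.g. from DSEA)
    db: budget queuing delay (s)
    tc: control interval (s)
    """
    dq = n_sess / mu                        # current queuing delay
    target = mu * (1.0 - (dq - db) / tc)
    return max(0.0, target)                 # assumed clamp, not from the slides


if __name__ == "__main__":
    # Queue delay equal to the budget: target equals the service rate.
    print(rate_abs_target(mu=80, n_sess=20, db=0.25, tc=0.25))
```

When the queuing delay is above budget the target drops below µ so the backlog can drain; when it is below budget the target rises above µ.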
13. rate-occ: Relative Rate Based Control
- During every control interval Tc, the RE notifies the SE of an acceptance ratio f
- Adjustment of f is based on the measured processor occupancy compared to a budget processor occupancy ρB
  - fk and fk+1 are the acceptance ratios of the current and next control interval
  - φk = min(ρB / ρk, φmax), where ρk is the current processor occupancy
  - fmin: a non-zero minimal acceptance ratio
  - φmax: maximum multiplicative increase factor between two consecutive Tc
  - In this paper φmax = 5 and fmin = 0.02
Algorithm proposed by Cyr et al.
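The update rule for the acceptance ratio did not survive in the slide text. The sketch below assumes a common form of occupancy-based control: scale the current ratio by the multiplicative factor and clamp the result between the minimal ratio and 1. Treat it as an illustration of that assumed form; the function name, parameter names, and the idle-processor guard are hypothetical. The default values are the ones quoted in the slides.

```python
def rate_occ_next_ratio(f_k, rho_k, rho_b=0.85, phi_max=5.0, f_min=0.02):
    """Assumed update: f_next = clamp(phi_k * f_k, f_min, 1).

    f_k: acceptance ratio in the current control interval
    rho_k: measured processor occupancy in the current interval (0..1)
    rho_b: budget processor occupancy
    phi_max: largest allowed increase factor between consecutive intervals
    f_min: non-zero floor on the acceptance ratio
    """
    if rho_k <= 0:
        phi_k = phi_max                     # idle processor: increase as fast as allowed
    else:
        phi_k = min(rho_b / rho_k, phi_max)
    return min(1.0, max(f_min, phi_k * f_k))
```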
14. Simulation Assumptions and Metrics
- Simulator: RFC 3261 compatible simulator built on OPNET
- Node model
  - Each UA represents an infinite number of callers/callees
  - UAs and SEs have infinite capacity
  - RE server configuration: service capacity 72 cps, rejecting rate 3000 cps
- Traffic model
  - Calls from callers on the left to callees on the right
  - Exponential interarrival times and call holding times
  - Standard seven-message call flow
- Transport and network model
  - UDP transport -> all SIP timers active
  - No link delay or loss is assumed
- Feedback method: piggybacked in the next available message to the particular next hop
15. SIP Overload Performance without Any Feedback Control
- Simple Drop scenario
  - Message dropped when the queue is full
- Threshold Rejection scenario
  - Queue length configured with a high and a low threshold value
  - When the queue length exceeds the high threshold
    - New INVITE requests are rejected but other messages are still processed
  - When the queue length falls below the low threshold
    - INVITE processing is restored
- Similar congestion collapse but for DIFFERENT reasons
  - Simple Drop
    - One third of INVITEs arrive at the callee
    - All 180 RINGING and most of the 200 OK are also dropped due to queue overflow
  - Threshold Rejection
    - No INVITE reaches the callee
    - RE is only sending rejection messages
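The Threshold Rejection scenario above is a two-threshold hysteresis on queue length. A minimal sketch, with hypothetical class and parameter names and made-up threshold values:

```python
class ThresholdRejector:
    """Reject new INVITEs between crossing the high and the low queue thresholds."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.rejecting = False

    def should_reject_invite(self, queue_len):
        if queue_len > self.high:
            self.rejecting = True       # start rejecting new INVITEs
        elif queue_len < self.low:
            self.rejecting = False      # restore INVITE processing
        return self.rejecting           # unchanged between the thresholds
```

Only new INVITEs are subject to this check; other messages continue to be processed regardless of the state.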
16. Summary and Comparison of Feedback Algorithm Parameters
Algorithm   Binding   Control Interval   Measurement Interval   Additional Parameters
Rate-abs    DB        Tc                 Tm
Rate-occ    ρB        Tc                 Tm                     fmin and φ
Win-disc    DB        Tc                 Tm
Win-cont    DB        N/A                Tm
Win-auto    N/A       N/A                N/A
- Most algorithms have a binding parameter
  - Three use budget queuing delay DB
  - One uses budget CPU occupancy ρB
- All three discrete time control algorithms need Tc
- Tm is used by four of the five algorithms for service rate and CPU occupancy, where applicable
  - Tm = min(100 ms, Tc) found to be a reasonable choice
  - Queue length is measured instantly
- DB: budget queuing delay
- ρB: budget CPU occupancy
- Tc: discrete time feedback control interval
- Tm: discrete time measurement interval for selected server metric (Tm ≤ Tc)
- fmin: minimal acceptance fraction
- φ: multiplicative factor
- DB recommended for robustness, although a fixed binding window size can also be used
- Optionally DB may be applied for corner cases
17. Sensitivity of Budget Queuing Delay and Control Interval
- Sensitivity of budget queuing delay
  - Small queuing delay (< ½ T1 timer) avoids timeout and gives best results
  - Example results for win-disc
    - Unit goodput when DB < 200 ms and Tc = 200 ms
    - Goodput degraded by 25% at DB = 500 ms
  - Results for win-cont and rate-abs show a similar shape, with slightly different sensitivity
  - In general, a positive DB value centered at around 200 ms is sufficient for all
- Sensitivity of control interval
  - The smaller the Tc, the better
  - Example results for win-disc
    - At DB = 200 ms, Tc < 200 ms is sufficient to achieve unit goodput in our scenario
All load and goodput values are normalized over server capacity
18. Impact of Control Interval across Algorithms
- Comparing Tc for win-disc, rate-abs and rate-occ at DB = 200 ms
- For both win-disc and rate-abs
  - Close to unit goodput except at Tc = 1 s with heavy load
  - win-disc is more sensitive to Tc than rate-abs -> burstier traffic results from window throttle
  - Shorter Tc gives better results (< 200 ms sufficient)
- rate-occ is not as good as the other two
  - Interesting point: from 14 ms to 100 ms, goodput increases in light overload and decreases in heavy overload
  - Possible result of rate adjustment parameters cutting the rate too much at light overload
[Figures: Goodput vs. Tc; Goodput vs. Tc at Load 1; Goodput vs. Tc at Load 8.4]
rate-occ has ρB set to 85%, which is seen to give the highest and most stable performance across different load conditions in the given scenario
19. Best Performance Comparison across Algorithms
- All except rate-occ reach unit goodput
  - No retransmission ever
  - Server always busy processing messages
  - Every single message is part of a successful session
- rate-occ does not operate at unit goodput
  - Not simply due to the artificial 85% CPU limit
  - Inherently, occupancy is not as direct a metric as needed
  - Extremely small Tc improves performance at heavy load but with many problems
    - Difficulty in implementation
    - Actual server occupancy departs greatly from the original intended setting
    - Poor performance under light overload -> may be linked to OCC increase and decrease heuristic parameters
Algorithm   DB (s)   Tc (s)   Tm (s)
Rate-abs    0.2      0.2      0.1
Rate-occ1   NA       0.2      0.1
Rate-occ2   NA       0.014    0.014
Win-disc    0.2      0.2      0.1
Win-cont    0.2      NA       0.1
Win-auto    NA       NA       NA
Rate-occ settings: ρB = 0.85, φ = 5, fmin = 0.02
20. Fairness for SIP Overload Control
- User-centric fairness
  - In its basic form, it ensures an equal success rate for each individual user
  - Implemented by assigning the capacity of the overloaded server to the upstream servers in proportion to their original load arrival
  - Applicability example: "third caller receives a free gift"
- Provider-centric fairness
  - Assuming each upstream server represents a provider, in its basic form it ensures each provider gets the same aggregate share of total capacity
  - Implemented by dividing the capacity equally among upstream servers
  - Applicability example: equal-share SLA
- Customized fairness
  - Any allocation as pre-specified by SLA etc.
  - Denial of service attacks: penalizing the specific sources
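The two basic fairness policies above differ only in how the RE splits its capacity among upstream servers. A minimal sketch, with illustrative function names and made-up loads (the 72 cps capacity is the value from the simulation setup):

```python
def user_centric_shares(capacity, offered_loads):
    """Split capacity in proportion to each upstream server's offered load."""
    total = sum(offered_loads.values())
    if total == 0:
        return {src: 0.0 for src in offered_loads}
    return {src: capacity * load / total for src, load in offered_loads.items()}


def provider_centric_shares(capacity, sources):
    """Split capacity equally among upstream servers."""
    sources = list(sources)
    if not sources:
        return {}
    return {src: capacity / len(sources) for src in sources}


if __name__ == "__main__":
    loads = {"se1": 100, "se2": 300}
    print(user_centric_shares(72, loads))
    print(provider_centric_shares(72, loads))
```

With proportional shares every caller sees the same success probability regardless of which upstream server carries the call; with equal shares every upstream server gets the same aggregate capacity regardless of how much it offers.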
21. Dynamic Load Performance w/ Provider-Centric Fairness
- Realistic server-to-server overload situations
  - More likely short periods of bulk loads
  - Possibly accompanied by new source arrivals or departures
- Example result using the rate-abs algorithm
  - Each upstream SE gets a close to equal share of RE capacity
  - Fast dynamic transition
22. Dynamic Load Performance w/ User-Centric Fairness
- Double-feed architecture
  - With load feedforward to assist receiver capacity allocation
- Example using the win-cont algorithm
  - Upstream SEs share RE capacity in proportion to their offered load
  - Fast dynamic transition
23. Dynamic Load Performance of win-auto Algorithm
- Source arrival transition time could be noticeably longer
- Capacity split not easy to predict
  - Hard to enforce explicit fairness
  - Basically no processing intervention
- Still achieves aggregate unit goodput
24. Conclusions and Future Work
- SIP overload problem is special because of the high rejection cost and drop-induced retransmission
- SIP overload control goal is to maximize the number of timely completed calls
- Approach is to have the SE send only the number of calls the RE can handle in a timely manner
- Presented and compared five algorithms under both steady and dynamic load
  - Win-disc / win-cont / win-auto / rate-abs / rate-occ
  - All but rate-occ are able to achieve unit goodput
  - Algorithms binding on queue metrics are preferred over the occupancy-based heuristic
  - All but win-auto adapt well to dynamic load and source departure/arrival
  - All but win-auto can achieve both user-centric and provider-centric fairness
  - Win-disc / win-cont / rate-abs require the double-feed architecture for user-centric fairness
  - win-auto is still extremely simple, with close to unit steady-state aggregate goodput
- Future work
  - More realistic network configuration including link delay and loss, node failure model
  - Feedback enforcement algorithms other than percentage throttle and window throttle