Title: Fault Isolation in Multicast Trees
Fault Isolation in Multicast Trees
- Anoop Reddy, Ramesh Govindan and Deborah Estrin
- Sigcomm 2000. (Jan. 2000)
- 2002. 10. 17
- Presented by Heonkyu Park
- hkpark_at_cosmos.kaist.ac.kr
Table of Contents
- Introduction
- Multicast Fault Isolation
- Evaluation of Fault Isolation Approaches
- Conclusions
1. Introduction
- Fault isolation in the context of large multicast distribution trees.
- Focus is on single-source trees.
- Problem: locating the on-tree router or link that is the origin of a fault.
- Fault: a link with significant packet loss, or the origin of a route change that alters the path from source to receiver.
Introduction (Contd)
- SNMP-based monitoring systems provide a capability that is complementary to fault isolation.
- They cannot be used for fault isolation.
- Once a fault has been located, they can be used to infer the causes of the fault.
- Monitoring individual routers may not be sufficient for correlating application-perceived behavior with router activity.
- Collecting information from inside the network may not be sufficient for fault isolation.
Introduction (Contd)
- In this approach
- Receivers at the edge of the network periodically probe the path to the source.
- They maintain some history of the results of these probes.
- Each receiver also coordinates with some set of other receivers.
- Once a fault is detected, an affected receiver can isolate the fault using the probe history.
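The probe history described above can be sketched as a small data structure. This is a minimal illustration only, assuming each probe yields the list of routers from the receiver towards the source. The class name `ProbeHistory`, its methods, and the rule "blame the router whose upstream neighbor changed" are hypothetical and not taken from the paper.

```python
from collections import deque


class ProbeHistory:
    """Keeps the most recent probe results (hypothetical helper)."""

    def __init__(self, max_entries=8):
        # Each entry is a list of router names, ordered receiver -> source.
        self.paths = deque(maxlen=max_entries)

    def record(self, path):
        self.paths.append(list(path))

    def route_change_point(self):
        """Return the router closest to the receiver whose upstream
        neighbor differs between the last two probes, or None."""
        if len(self.paths) < 2:
            return None
        before, after = self.paths[-2], self.paths[-1]
        for i in range(min(len(before), len(after)) - 1):
            if before[i] != after[i]:
                # The paths already differ at this hop; no shared router to name.
                return None
            if before[i + 1] != after[i + 1]:
                return before[i]
        return None


history = ProbeHistory()
history.record(["R4", "R2", "R1"])
history.record(["R4", "R3", "R1"])
print(history.route_change_point())  # R4
```

A bounded `deque` is used so that the history stays small; how much history a receiver really needs is not stated on the slide.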
Introduction (Contd)
- Probing primitive: multicast traceroute
- Initiated by the receiver host sending an mtrace message to its first-hop router.
- That router performs two actions:
- Appends its own identity and a count of the total number of packets.
- Forwards the request to the previous hop towards the source.
- This router and each successive router repeat these actions.
- Finally, the router attached to the source returns an mtrace response to the destination.
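The hop-by-hop behavior above can be sketched as a short simulation. This is not the real mtrace packet format: the function name `mtrace`, the `upstream` map, and the `packet_counts` map are assumptions made for illustration.

```python
def mtrace(first_hop, source_router, upstream, packet_counts):
    """Simulate an mtrace request travelling from first_hop to the source.

    upstream: dict mapping each router to its previous hop towards the source.
    packet_counts: dict mapping each router to its packet count for the group.
    Returns the response as a list of (router, count) records.
    """
    response = []
    router = first_hop
    while True:
        # Action 1: the router appends its identity and packet count.
        response.append((router, packet_counts[router]))
        if router == source_router:
            # The router attached to the source returns the response.
            return response
        # Action 2: forward the request to the previous hop.
        router = upstream[router]


upstream = {"R3": "R2", "R2": "R1"}
counts = {"R1": 1000, "R2": 990, "R3": 990}
print(mtrace("R3", "R1", upstream, counts))
# [('R3', 990), ('R2', 990), ('R1', 1000)]
```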
Motivation
- Multicast traceroute (mtrace)
- Lets a receiver determine both its path to the source and the number of losses on individual links in that path.
- But
- Is not sufficient for locating the origin of a routing change.
- Does not scale well to large trees.
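The per-link loss estimate mentioned above can be illustrated as follows. This is a minimal sketch, assuming each trace is a list of (router, packet count) records ordered from receiver to source, and that loss on a link is estimated by comparing how much the counters on either side of it grew between two traces. The function name and data layout are illustrative assumptions.

```python
def link_losses(earlier, later):
    """Estimate packets lost on each link between two traces.

    earlier, later: lists of (router, count), ordered receiver -> source,
    taken over the same path at two points in time.
    Returns {(upstream_router, downstream_router): lost_packets}.
    """
    if [r for r, _ in earlier] != [r for r, _ in later]:
        raise ValueError("path changed between traces; counts not comparable")
    # Packets each router handled during the interval.
    deltas = [(r, c2 - c1) for (r, c1), (_, c2) in zip(earlier, later)]
    losses = {}
    for (down, d_down), (up, d_up) in zip(deltas, deltas[1:]):
        losses[(up, down)] = d_up - d_down
    return losses


t0 = [("R3", 990), ("R2", 990), ("R1", 1000)]
t1 = [("R3", 1480), ("R2", 1490), ("R1", 1500)]
print(link_losses(t0, t1))
# {('R2', 'R3'): 10, ('R1', 'R2'): 0}
```

The sketch refuses to compare traces taken over different paths, which reflects the limitation noted above: counters alone do not locate the origin of a routing change.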
Multicast Traceroute
(a) A multicast traceroute query from Rec2 is forwarded hop by hop up to router R1. R1 constructs the response and sends it to the specified destination, Rec2.
(b) Naïve approach: redundant monitoring results when all receivers probe up to the source.
Multicast Traceroute (Contd)
(c) Rec1 needs path information before and after
the fault from Rec2 and Rec3 and itself in order
to isolate the fault at R2.
2. Multicast Fault Isolation
- Session Watcher (W)
- Denotes the software entity that initiates the mtrace, keeps the probe history, and coordinates with other session watchers.
- May be colocated with a receiver. One W per LAN.
- Naïve Technique
- Closer to the source, the overhead of mtrace requests can be significant.
- To isolate a routing change, W may need to query every other W in the group.
2.1 Subcast-Based Fault Isolation
(a) In this figure, Wb only monitors up to its common ancestor with Wa. Every link is monitored by only one session watcher.
(b) In this example, we illustrate subcast. Wb's query terminates at its turnaround router, R2. The response is subcast at R2 and is received by Wa and Wb.
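The property that every link is monitored by only one session watcher can be sketched as follows. This is an illustration under stated assumptions: the tree is given as a child-to-parent map, watchers start monitoring in a fixed order, and each watcher climbs until it meets a router that is already monitored (its turnaround router). The function name and the ordering rule are hypothetical; how watchers actually discover their turnaround router is a separate question.

```python
def assign_segments(parent, watchers):
    """Give each watcher the links from itself up to its turnaround router.

    parent: dict mapping every non-root node to its parent (towards the source).
    watchers: watcher nodes in the order they start monitoring (an assumption).
    Returns {watcher: (turnaround_router, [links monitored])}.
    """
    covered = set()  # nodes whose upstream link is already monitored
    result = {}
    for w in watchers:
        links, node = [], w
        # Climb until reaching the source or a node someone already monitors.
        while node in parent and node not in covered:
            covered.add(node)
            links.append((node, parent[node]))
            node = parent[node]
        covered.add(node)
        result[w] = (node, links)
    return result


# Source S - R1 - R2, with Wa and Wb both below R2.
parent = {"R1": "S", "R2": "R1", "Wa": "R2", "Wb": "R2"}
for watcher, (turnaround, links) in assign_segments(parent, ["Wa", "Wb"]).items():
    print(watcher, turnaround, links)
# Wa S [('Wa', 'R2'), ('R2', 'R1'), ('R1', 'S')]
# Wb R2 [('Wb', 'R2')]
```

In this toy tree Wa traces all the way to the source, while Wb stops at R2, matching the example in (b).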
2.1 Subcast-Based Fault Isolation (Contd)
- Hop limit h
- TTL
- Subcast (subtree multicast)
- At turnaround router
- Overview
- How does Wb determine that Wa is tracing its path all the way to the source?
- How does Wb determine the identity of the common ancestor?
- What happens if Wa fails?
Fault Isolation Scenario
Metrics
- Overhead of the mtrace path
- Maximum overhead, average overhead
- Count individual messages
- Ignore message size
- Model multicast tree links as point-to-point links.
- Accuracy of fault isolation
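The overhead metrics above can be illustrated for the naïve approach, in which every watcher probes all the way to the source. This is a minimal sketch, assuming only request messages are counted, one message per link crossed, and that the average is taken over links that carry at least one message. The function name and these counting choices are assumptions, not the paper's exact definitions.

```python
from collections import Counter


def naive_overhead(parent, watchers):
    """Count mtrace requests per link when every watcher probes to the source.

    parent: dict mapping every non-root node to its parent (towards the source).
    Returns (per_link_counts, maximum, average).
    """
    per_link = Counter()
    for w in watchers:
        node = w
        while node in parent:
            per_link[(node, parent[node])] += 1  # one message on this link
            node = parent[node]
    maximum = max(per_link.values())
    average = sum(per_link.values()) / len(per_link)
    return per_link, maximum, average


parent = {"R1": "S", "R2": "R1", "Wa": "R2", "Wb": "R2"}
_, maximum, average = naive_overhead(parent, ["Wa", "Wb"])
print(maximum, average)  # 2 1.5
```

The links closest to the source carry the most messages, which is why the maximum is reported alongside the average.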
Response Mechanisms
- Directed multicast based scheme
- Scoped multicast based scheme
- Limited multicast based scheme
2.2 Fault Isolation Using Directed Multicast
Directed Multicast
- Router-assist mechanism
- A packet is multicast down all branches of the subtree rooted at a specific node.
- Allows receivers to multicast the packet along a specific branch.
- Different characteristics from the subcast-based scheme:
- Significantly less overhead
- High fault isolation error
- Fault isolation error is asymmetric (same fault with different values)
2.3 Fault Isolation Using Scoped Multicast
Scoped Multicast
- Uses TTL-based scoping but does not require router support.
- Its turnaround router should multicast the response with a large enough response hop limit.
- How does a session watcher Wa compute its response hop limit?
- Has greater overhead than subcast.
- Its fault isolation error is identical to that of the subcast-based scheme.
- TTL-based scoping does not work with multicast routing protocols that construct unidirectional shared trees.
2.4 Fault Isolation Based on Limited Multicasts
Limited Multicasts
- Consider a solution in which only a relatively small number of session watchers multicast.
- Is attractive if its performance is acceptable: it does not require router support and does not depend on multicast routing protocol characteristics.
- Its fault isolation is asymmetric.
- Can result in redundant monitoring.
- How does Wa independently determine that it needs to multicast its response?
- How many session watchers should multicast their response? Which of them should multicast their response?
- Its fault isolation error is higher than that of the other schemes, and the overhead is also likely to be higher.
2.5 Discussion
- The load on a router
- Fault isolation
- Unfortunately, mtrace may be administratively disabled in some parts of the network.
- Single and multiple fault models
- The Internet multicast routing infrastructure continues to evolve.
- Route change isolation can only pinpoint an on-tree router responsible for the change.
3. Evaluation of Fault Isolation Approaches
- 3.1 Analytic Evaluation
- Given n-ary trees of depth d, with session watchers located at the leaves of the tree.
- Study the variation of these metrics as a function of the number of session watchers.
- Non-lossy situation / mtrace or response loss
Methodology and Assumptions
- All non-leaf nodes have the same fanout
- All leaves are at the same distance from the
source
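Under these two assumptions the naïve approach can be worked out in closed form. The sketch below additionally assumes a watcher at every leaf and counts request messages only; the function name is hypothetical and the numbers are an illustration, not results from the paper.

```python
def complete_tree_naive_overhead(n, d):
    """Naive per-link request counts in a complete n-ary tree of depth d.

    Assumes a watcher at every leaf, each probing all the way to the source
    at the root. A link at level k (1 = next to the source) has
    n**(d - k) leaves below it, so it carries that many requests.
    """
    links_per_level = {k: n ** k for k in range(1, d + 1)}
    load_per_link = {k: n ** (d - k) for k in range(1, d + 1)}
    total_links = sum(links_per_level.values())
    total_messages = sum(links_per_level[k] * load_per_link[k]
                         for k in range(1, d + 1))  # equals d * n**d
    maximum = load_per_link[1]  # links adjacent to the source
    average = total_messages / total_links
    return maximum, average


print(complete_tree_naive_overhead(3, 2))  # (3, 1.5)
```

With fanout 3 and depth 2 there are 12 links and 9 leaf watchers; each of the 3 links next to the source carries 3 requests, and the 18 requests spread over 12 links give an average of 1.5.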
Results
Results: Maximum/Average Overhead
Results: Fault Isolation Error
3.2 Impact of Tree Characteristics on Performance
- Explore the performance of these schemes on a variety of irregular trees.
- To study the impact of irregular fanout and non-uniform path lengths between session watchers and the source, the paper uses a random tree generator.
- Average the performance measures over ten randomly generated trees.
Maximum Overhead
Average Overhead
Average Fault Isolation Error
3.3 Summary of Performance Evaluation
- Deployable alternative: limited multicast for small groups.
- Deployable solution: scoped multicast performs well on average.
4. Conclusions
- Since mtrace will not be widely supported in routers, the proposed approach may not be usable directly.
- Subcast and directed multicast are not widely deployed either.
- If another approach could be used to get the same information as mtrace, then the approach in this paper is heuristic.