Title: Y.1711 FEC-CV Tutorial
1Y.1711 FEC-CV Tutorial
- Dave Allan, Sr. Advisor MPLS Standards
- March 2003
2Presentation Roadmap
- Introduction
- Problem Space
- Solution Overview
- Solution Specifics
- Standards Status
3Introduction
- Current verification techniques for FEC/FTN configuration for MPLS/LDP networks are very control processor intensive. An easily implemented solution that can be processed at line rate and has minimal non-LSP dependencies is desirable.
- Verification involves auditing forwarding policy/configuration between an egress and the set of ingresses for an LSP. Both ends need to agree on the raison d'etre for the LSP; otherwise there has been either a failure or an implementation problem.
- Verification can be performed by periodically comparing the set of FECs associated with the LSP by the ingress with the set of FECs advertised by the egress. The ingress set should be a subset of the egress set.
- Use of "knowledge filters" permits the set of FECs for an LSP to be encoded in a fixed length, unstructured bit vector. Simple comparison of the bit vectors generated by the ingress and egress identifies mismatch conditions.
- This allows FTN verification to be added to Y.1711 monitoring functions without impacting Y.1711 availability metrics. Further, FEC verification allows the egress to dynamically associate specific ingress points with FECs/LSP IDs etc. in order to perform availability measurement.
4Problem Space
5The Generic Problem Space
[Figure: an Originator and its Users, linked by Configuration and Forwarding flows]
- Originator passes configuration information to users
  - Some type of traffic mapping information: "what to do with this forwarding path"
- Users employ configuration information to map traffic to the forwarding path
- Set of configuration information offered by originator can incrementally change
  - route flaps, administrative additions or deletions etc.
  - changes have finite propagation time
- Set of configuration information employed by users may only be a subset of that offered by originator
  - may be pruned locally or by intermediate systems
- Originator may not have knowledge of the set of users
- Users don't necessarily know who the originator is
- How do you verify that users are correctly using the path between the user and the originator?
  - Simplest form: echo the configuration being used, "I'm using this path to reach this FEC"
6FTN Verification
- The FTN is the FEC-to-NHLFE table in an LSR.
  - FEC: forwarding equivalence class
  - NHLFE: next hop label forwarding entry
  - How packets get mapped to LSPs
- Currently the FTN may be configured, or populated via LDP protocol exchange.
  - In the LDP DU (downstream unsolicited) scenario, FTN info flows from egress to ingress
- FTN Verification adds comprehensive misbranching/misconfiguration detection to Y.1711
  - Can detect (Any-LSP) leaking into ER-LSP
  - Can detect (Any-LSP) leaking into LDP-LSP <--NEW to Y.1711
  - Can add verification of LSP FTN configuration. <--NEW to Y.1711
- FTN Verification permits plug and play availability for common MPLS applications
  - VPN, PW, BGP, any time the transport LSP FEC is the egress node I.D.
- FTN Verification detects problems that traditional CNLS network mechanisms cannot detect
7Sidebar Misbranching
- Misbranching/FTN misconfiguration can manifest itself in a number of forms
  - 1) An LSP can be cross connected onto a label value that is not in use. This will result in the traffic being discarded at the next LSR as there will be no incoming label map (ILM) for the packet.
  - 2) An LSP can be cross connected onto a label that is being used. This will result in traffic being delivered to the wrong egress point in the network.
  - 3) An intermediate LSR can propagate an incorrect FEC/label binding upstream. This will result in traffic being delivered to the wrong egress point in the network.
  - 4) An LSR can assign an incorrect FEC/FTN entry to a label at the ingress. This will result in traffic being delivered to the wrong egress point in the network.
  - 5) Misbranching can also occur vertically in the MPLS hierarchy due to mis-configuration of LDP, as an artifact of independent label distribution, or due to other stack handling problems.
- Unlike IP, where a mis-forwarded packet will be corrected by the next hop router, in MPLS a mis-forwarded packet can emerge just about anywhere in the MPLS network.
8Relevant LDP specifics
- LDP has probably the most demanding characteristics.
- Connectivity is defined in terms of FECs, not LSPs.
  - LSP ingress used for an FEC by an LSR can change
  - LSP that traffic arrives on at egress for an FEC for a given ingress can change
  - FEC is the only constant
- Multiple FEC elements may be associated with an LDP label advertisement
  - Each FEC element is a prefix (address range or host route)
  - FTN verification corresponds to MPLS scalability (unlike ICMP PING!)
- Set of advertised FECs with an LSP next hop may vary.
  - E.g. LSR 'A' advertises to 'B' FECs XYZ
  - LSR 'B' advertises to 'C' FECs XY
  - LSR 'C' advertises to 'D' FEC X etc.
  - What 'D' would want to test is a subset of what 'A' advertised.
- Multiple dimensions to label distribution/retention
  - Ordered/independent
9Solution Overview
10"Do simple things all the time, or complex
things some of the time"
- Monitoring and measurement is "All the Time"
- Simple audit of sets of FECs between ingress and egress is non-trivial.
  - Echoing back on the data plane what LDP sent out just replicates LDP.
  - Complex implementation; ultimately need to consider DoS attacks and impact on overall network operation.
  - Results degrade when the network is under stress
- Proactive monitoring/measurement requires a simpler verification test than a simple brute force audit
11The FEC-CV Solution
- Use "knowledge digest" techniques to encode FECs into a simple fixed length token at both ingress and egress.
  - Transport ingress token to egress in monitoring probes for verification.
  - Comparison of tokens is easily implemented
  - "Stateless" comparison, works for LDP/MP2P/ECMP etc.
  - Is whoever is using this LSP doing it for a valid FEC?
- "Knowledge digest" has the desirable property of permitting set 'A' to be verified as a subset of set 'B' in a single operation.
  - Key to handling all aspects of LDP FEC/label distribution.
- NOTHING IS FREE: the technique introduces a small statistical uncertainty into detecting FTN problems
  - Techniques exist to minimize statistical uncertainty
  - Uncertainty is a function of
    - Number of FECs encoded into the array.
    - Size of the array
    - Number of encoding schemes (hashes) employed simultaneously.
  - Can be engineered to 99.9999% reliability.
  - The good news is that the uncertainty is confined to not detecting errors, rather than "crying wolf".
12Sidebar Bloom Filter
- Concept is fairly simple
  - A bit array is used to represent a set of information tokens
  - Each information token in the set is fed through multiple hash functions; each output is a bit position to set in the array.
  - When testing set membership, finding ALL the required bits set in the filter indicates the token is probably present.
  - Any required bit not set is an authoritative indication that the token is NOT in the set.
- Accuracy depends on the array length, the number of tokens represented and the number of bits used per token
13Bloom Filter Example
- 16 bit filter, uses two bits per FEC for encoding (bit 0 is the rightmost bit).
- FILTER VALUES
  - "FEC A" hashes to bits 2 and 9:     0000001000000100
  - "FEC B" hashes to bits 11 and 15:   1000100000000000
  - "FEC C" hashes to bits 2 and 13:    0010000000000100
  - "Set ABC" = A OR B OR C:            1010101000000100
  - "FEC D" hashes to bits 11 and 14:   0100100000000000
- Is "FEC D" in "Set ABC"?
  - "FEC D":                            0100100000000000
  - AND NOT "Set ABC":                  0101010111111011
  - is non zero:                        0100000000000000
  - So "FEC D" is absolutely NOT in "Set ABC"
- "FEC E" hashes to bits 9 and 13:      0010001000000000
  - AND NOT "Set ABC":                  0101010111111011
  - is zero:                            0000000000000000
  - So "FEC E" generates a "false positive"
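The worked example above can be replayed with plain integer bit operations. This is a minimal sketch: the helper names (`encode`, `maybe_member`, `show`) are illustrative and not part of FEC-CV, and the bit positions are simply the ones quoted on the slide rather than the output of any real hash.

```python
WIDTH = 16  # filter length in bits, as in the slide's example


def encode(bit_positions):
    """Return a filter value with the given bit positions set (bit 0 = rightmost)."""
    value = 0
    for pos in bit_positions:
        value |= 1 << pos
    return value


def maybe_member(entry, filt):
    """True if every bit of `entry` is also set in `filt` (entry AND NOT filt == 0)."""
    mask = (1 << WIDTH) - 1
    return (entry & ~filt & mask) == 0


def show(value):
    """Render a filter value as a fixed-width binary string."""
    return format(value, f"0{WIDTH}b")


fec_a = encode([2, 9])
fec_b = encode([11, 15])
fec_c = encode([2, 13])
set_abc = fec_a | fec_b | fec_c

fec_d = encode([11, 14])  # not in the set, and detected as such
fec_e = encode([9, 13])   # not in the set, but all of its bits happen to be set

print(show(set_abc))                 # 1010101000000100
print(maybe_member(fec_d, set_abc))  # False
print(maybe_member(fec_e, set_abc))  # True (false positive)
```

The same test works unchanged for a whole ingress filter: if the ingress value has any bit that the egress value lacks, the ingress set cannot be a subset of the egress set.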
14Filter Engineering
- Probability of error (false positive)
  - $P_{err} = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{pq}$
- Where
  - m = filter size (bits)
  - k = number of hash functions used by egress
  - n = number of elements in egress set
  - p = number of hash functions used by ingress
  - q = number of elements in ingress set
- NOTE: Filter engineering equation assumes perfect randomization of the dataset by each of the hash algorithms.
- Probability of error converges on being a function of the number of bits per entry. Therefore, for a given desired probability of error, a filter size can be determined.
- Profiling of practical implementations of randomization algorithms shows we can expect about 20% deviation from the ideal.
We distinguish ingress and egress because optimal inter/intra application performance requires different coding rules per application
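To get a feel for the filter engineering equation, it can be evaluated directly. The sketch below assumes the reading P_err = (1 - (1 - 1/m)^(kn))^(pq) with the symbols defined above. The parameter values (128 bit filter, 3 hash functions at both ends, a single ingress entry) are chosen only for illustration and are not claimed to be the ones behind the figures on the following slide.

```python
def p_false_positive(m, k, n, p, q):
    """Ideal false positive probability for an ingress/egress filter comparison.

    m: filter size in bits
    k: hash functions used by the egress, n: elements in the egress set
    p: hash functions used by the ingress, q: elements in the ingress set
    """
    # Probability that any given bit of the egress filter is set
    fill = 1.0 - (1.0 - 1.0 / m) ** (k * n)
    # All p*q ingress bits must land on set egress bits to escape detection
    return fill ** (p * q)


for n in (1, 2, 4, 7, 12):
    perr = p_false_positive(m=128, k=3, n=n, p=3, q=1)
    print(f"{n:2d} egress FECs: P_err = {perr:.6f}")
```

As the egress set grows, more of the filter fills up and the chance of an undetected mismatch rises, which is why the number of FECs per LSP drives the achievable reliability.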
15Example: Filter size 128 bit, Common encoding rules
Note how past a certain point using too many hash algorithms actually degrades filter performance.
- 99.25% reliable at 12 FECs/LSP
- 99.92% reliable at 7 FECs/LSP
- 99.9936% reliable at 4 FECs/LSP
- 99.9998% reliable at 2 FECs/LSP
- 99.99999% for one.
16Typical Operation
1 - Egress LSR decides to advertise FEC. Configures I/F OAM modules with Bloom Filter encoding of FECs for the LSP. (Could be OAM extension to ILM.)
2 - FECs advertised to neighbors with LDP.
3 - FEC advertisement (or some subset) arrives at ingress LSR.
4 - LSR chooses to use FEC advertisement and configures I/F with Bloom Filter for FEC.
5 - I/F OAM module sends periodic FEC-CV with Bloom Filter.
6 - Egress LSR verifies FEC-CV filter is a subset of the filter programmed in step 1. If mismatch, notify CP. (Other information in the probe is sufficient to initiate tracking down the precise problem.)
17Solution Specifics
18FEC-CV PDU Format
Delta from basic Y.1711 CV

Field                                            Value    Size
Function Type                                    07Hex    1 octet
Padding                                          00Hex    3 octets
LSP Trail Termination Source Identifier (TTSI)            20 octets
  (16 octet v6 LSR ID, 4 octet access point ID)
Bloom filter                                              16 octets
Padding                                          00Hex    2 octets
BIP16                                                     2 octets

- Current proposed filter length is 128 bits
- Fields line up on 32 bit boundaries.
- OAM alert label (14) distinguishes OAM PDUs from ANY payload.
- PDU is padded to minimum Ethernet frame size when prefixed with transport and OAM alert labels
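The field sizes listed for the PDU add up to 44 octets, which can be sketched with Python's `struct` module. Two things here are assumptions rather than statements from the slides: the field order (function type, padding, TTSI, Bloom filter, padding, BIP16) and the BIP16 rule, taken to be even parity per bit position, that is, the XOR of all preceding 16 bit words. The function names are illustrative only.

```python
import struct

FUNCTION_TYPE_FEC_CV = 0x07


def bip16(data):
    """XOR of all 16-bit words in `data` (assumed BIP16 rule: even bit-interleaved parity)."""
    parity = 0
    for (word,) in struct.iter_unpack("!H", data):
        parity ^= word
    return parity


def build_fec_cv(lsr_id, access_point_id, bloom):
    """Pack a 44 octet FEC-CV payload from a 16 octet LSR ID, a 32 bit ID and a 16 octet filter."""
    if len(lsr_id) != 16 or len(bloom) != 16:
        raise ValueError("LSR ID and Bloom filter must both be 16 octets")
    ttsi = lsr_id + struct.pack("!I", access_point_id)  # 20 octets
    body = struct.pack(
        "!B3s20s16s2s",
        FUNCTION_TYPE_FEC_CV,  # 1 octet function type
        bytes(3),              # 3 octets of 00Hex padding
        ttsi,                  # 20 octet TTSI
        bloom,                 # 16 octet (128 bit) Bloom filter
        bytes(2),              # 2 octets of 00Hex padding
    )
    return body + struct.pack("!H", bip16(body))


pdu = build_fec_cv(bytes(range(16)), 7, bytes([0xAA] * 16))
print(len(pdu))  # 44
```

With this parity rule a receiver can validate the payload by checking that the XOR over all 22 words, BIP16 included, is zero.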
19FEC CV Filter Encoding/1
- Encoding is specific to the application
  - Filter entry: application specific number of filter offsets
  - Filter offset: bit position in the filter
- LDP uses 3 filter offsets to encode an FEC element as a filter entry
- RFC 2547 VPN uses 5 filter offsets to encode the BGP next hop, RD and RT as a filter entry
20FEC CV Filter Encoding/2
- CR-LDP uses 3 filter offsets to encode FEC
elements as a filter entry
21FEC CV Filter Encoding/3
- RSVP TE uses 4 filter offsets to encode a filter entry
- For make before break operations the egress will encode both the old and new LSP I.D.s in the filter so that the changeover does not result in any false alarms. Receiver can promiscuously accept FEC-CVs encoding either LSP ID during changeover.
- When PathTear for the old LSP-ID occurs, egress can update filter accordingly.
22Computing Filter Offsets
- FEC-CV uses a specific CRC32 polynomial, optimized to maximize Hamming distance for smaller data tokens
  - $G(x) = x^{32} + x^{30} + x^{28} + x^{21} + x^{19} + x^{15} + x^{12} + x^{9} + x^{8} + x^{4} + x^{3} + x^{2} + x + 1$
- Mask and fold the computed CRC to produce the requisite number of 7 bit filter offsets.
  [Figure: masking and folding of the computed CRC to generate 3 filter offsets]
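The exact mask and fold layout is not recoverable from this text, so the following is only a sketch of the general idea under stated assumptions. It uses `zlib.crc32` (the standard CRC-32 polynomial, not the FEC-CV polynomial) as a stand-in, splits the 32 bit result into equal fields, and XOR-folds each field down to 7 bits. The functions `filter_offsets` and `filter_entry` and the splitting scheme are hypothetical, and the sketch handles 1 to 4 offsets only.

```python
import zlib

OFFSET_BITS = 7  # 7 bit offsets address a 128 bit filter


def filter_offsets(token, count):
    """Derive `count` 7-bit filter offsets from one CRC of `token` (illustrative scheme)."""
    if not 1 <= count <= 4:
        raise ValueError("this sketch supports 1 to 4 offsets")
    crc = zlib.crc32(token) & 0xFFFFFFFF  # stand-in CRC, NOT the FEC-CV polynomial
    width = 32 // count                   # CRC bits available per offset (at least 8)
    offsets = []
    for i in range(count):
        # Mask: isolate the i-th field of the CRC
        field = (crc >> (i * width)) & ((1 << width) - 1)
        # Fold: XOR the field onto itself in 7 bit chunks
        folded = 0
        while field:
            folded ^= field & 0x7F
            field >>= OFFSET_BITS
        offsets.append(folded)
    return offsets


def filter_entry(token, count):
    """Return a 128 bit filter entry with one bit set per offset."""
    entry = 0
    for offset in filter_offsets(token, count):
        entry |= 1 << offset
    return entry


print(filter_offsets(b"10.1.0.0/16", 3))
```

A set of FECs is then encoded by OR-ing the entries of its members, exactly as in the 16 bit example earlier in the deck.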
23Filter processing
- Goal is to
  - Distinguish synchronization artifacts from genuine problems
    - Artifacts are generated by adding a FEC to, or removing a FEC from, the set of FECs associated with an LSP
    - Artifacts exist due to delays in control plane propagation across the network
  - Provide mechanisms to tolerate ingress or intermediate LSR filtering
    - The set of FECs that the ingress instantiates may only be a subset of that offered by the egress.
    - Ingress may apply filtering to the FECs
    - Intermediate node may only relay a subset of the FECs received (for example independent mode label distribution)
- Adding a FEC is not a problem. Egress filter will be a superset of ingress.
- Removing a FEC requires keeping a copy of the pre-FEC-removal filter around for some amount of time. The pre-FEC-removal filter is termed the cumulative filter (you may also get adds in conjunction with removals).
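One way to picture the cumulative filter is as a second bit vector that the egress consults for a limited time after a removal. The sketch below is an assumption-laden illustration, not the specified procedure: the class name, the `hold_seconds` parameter and the expiry rule are hypothetical, since the slides only say the old filter is kept "for some amount of time".

```python
import time


class EgressFilter:
    """Egress-side filter state that tolerates probes sent before a FEC removal propagated."""

    def __init__(self, hold_seconds=30.0, clock=time.monotonic):
        self.current = 0      # filter for the FECs advertised right now
        self.cumulative = 0   # union of recent pre-removal filters
        self.hold_seconds = hold_seconds
        self.clock = clock
        self._expires = 0.0

    def update(self, new_filter):
        """Install a new filter; remember the old one if any bit was removed."""
        now = self.clock()
        if now >= self._expires:
            self.cumulative = 0              # previous hold period is over
        if self.current & ~new_filter:       # at least one bit is being removed
            self.cumulative |= self.current  # keep the pre-removal filter around
            self._expires = now + self.hold_seconds
        self.current = new_filter

    def accepts(self, probe_filter):
        """True if the probe's filter is a subset of what is currently tolerated."""
        reference = self.current
        if self.clock() < self._expires:
            reference |= self.cumulative
        return (probe_filter & ~reference) == 0
```

Additions need no special handling here because a larger current filter is automatically a superset of anything the ingress was already sending.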
24Filter processing
- Application specific, either acceptable subset or exact match
  - Acceptable subset assumes either ingress LSR or intermediate LSRs have modified the set of information offered by the egress
  - Exact match assumes that neither the ingress LSR nor intermediate LSRs will modify FEC information.
25Additional processing
- When a dFECfilter_Mismatch or dFECfilter_Mismerge is detected
  - TTL matters (or at least can help minimize MTTR): the higher the TTL, the closer the probe source is to the defect.
26Probe Frequency
- Arrival rate of FEC-CV at the egress is a function of the number of ingresses.
  - 100s of PEs can hit on a single egress.
- Ability to detect problems and recover from problems without state machines flapping
  - In particular leakage between LSPs
- Optimal rate is a function of the number of ingresses and the network diameter.
- Not flapping is a probability game: what rate is the most frugal of resources and provides timely detection of real problems while minimizing false alarms?
- Current proposal is a configured network wide value in the range 1 sec. to 1 minute.
27Plug n Play LDP Availability
- FEC-CV probe contains sufficient information for probes to self register for connectivity tracking at the egress.
  - Egress needs to be unique for FEC, ingress needs to use liberal label retention
  - Augments link and node fault detection to provide authoritative coverage
- Procedure
  - Egress offers FEC/label bindings
  - Ingress receives binding, encodes FECs into FEC-CV engine.
  - Ingress starts inserting FEC-CV probes into LSPs
  - Egress receives FEC-CV probe,
    - verifies FEC filter is OK,
    - checks if TTSI is known
      - No: instantiate state for tracking
      - Yes: update availability state machine.
  - Egress periodically reviews who is sending FEC-CV
    - Flags gaps.
    - Recovers state in nodes deemed to have failed.
- Changes in egress connectivity can be correlated between interfaces
  - Arrival interface simply moved.
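The egress side of this procedure amounts to a table keyed by TTSI. The sketch below shows one possible shape under assumptions that are not in the slides: the class and method names are hypothetical, a source is "deemed to have failed" after `missed_limit` probe intervals without a valid probe, and the availability state machine is reduced to a last-seen timestamp.

```python
class AvailabilityTracker:
    """Self-registering per-ingress tracking at the egress, keyed by TTSI."""

    def __init__(self, egress_filter, probe_interval, missed_limit=3):
        self.egress_filter = egress_filter
        self.timeout = probe_interval * missed_limit  # assumed failure threshold
        self.last_seen = {}  # TTSI -> arrival time of the last valid probe

    def on_probe(self, ttsi, probe_filter, now):
        """Process one FEC-CV probe; return 'mismatch', 'registered' or 'updated'."""
        if probe_filter & ~self.egress_filter:
            return "mismatch"  # probe claims a FEC the egress never offered
        status = "updated" if ttsi in self.last_seen else "registered"
        self.last_seen[ttsi] = now
        return status

    def review(self, now):
        """Flag sources whose probes have stopped and reclaim their state."""
        failed = [t for t, seen in self.last_seen.items() if now - seen > self.timeout]
        for ttsi in failed:
            del self.last_seen[ttsi]
        return failed


tracker = AvailabilityTracker(egress_filter=0b1110, probe_interval=1.0)
print(tracker.on_probe("pe1", 0b0110, now=0.0))  # registered
print(tracker.on_probe("pe1", 0b0110, now=1.0))  # updated
```

Because state is created by the first valid probe, the egress needs no prior configuration of its ingress peers, which is the "plug and play" property the slide describes.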
28Plug n Play Issues
- Egress scalability
  - Implementations will have practical limits to the number of peers they can track
  - Issue is common to all proactive detection mechanisms
- Alarm suppression: alarm generation needs to be coordinated with the IGP. Otherwise any link or node failure will result in all peers declaring an alarm.
  - If the originating LSR fails, or an originating numbered link fails, and this is IGP flooded, then the LSR detecting the connectivity problem should not generate an alarm for any LSPs originating at the failed LSR or link.
  - LSR I.D. in TTSI provides sufficient information to do this.
  - If unnumbered links are used, the situation is more ambiguous: a link associated with an LSR I.D. has failed, and some LSPs that have the same LSR I.D. will be noted to have failed.
- Alarm suppression for inter-area is FFS.
  - Implies that implementations should only track availability for nodes in the same area.
29Next Steps
- Devise a de-registering mechanism to support availability measurement with conservative label retention and permit other administrative actions.
- Alarm suppression for inter-area LSPs.
30Conclusions/Summary
- FEC-CV generation is always on
  - Ingress implementation is trivial
- Egress determines degree of processing employed
  - Ignore messages / detect misbranching / measure availability
- FEC-CV works in two scenarios
  - Simple misbranching detection
    - Use of FEC-CV by ingresses provides misbranching/mismerging detection across the MPLS architecture.
    - Closes a gap in MPLS fault detection
  - Availability measurement
    - Provides simple audit that the network is functioning as a system: routing, signalling and forwarding all have to be working.
31Acronyms
- BGP - border gateway protocol
- CP - control processor
- CV - connectivity verification probe
- ER - explicitly routed
- FEC - forwarding equivalence class
- FMO - future mode of operation
- FTN - FEC to NHLFE table
- ILM - Incoming label map
- LDP - label distribution protocol
- LSP - label switched path
- LSR - label switch router
- MP2P - multi-point to point
- NHLFE - next hop label forwarding entry
- PMO - present mode of operation
- P2P - point to point
- TTSI - trail termination source identifier
32References
- Bloom, B., "Space/Time Trade-offs in Hash Coding with Allowable Errors", Communications of the ACM, Vol. 13, No. 7, July 1970
- Ripeanu, M., Iamnitchi, A., "Bloom Filters Short Tutorial", www.cs.uchicago.edu/matei/PAPERS/bf.doc
- Andersson et al., "LDP Specification", IETF RFC 3036, January 2001
- Awduche, D., et al., "RSVP-TE: Extensions to RSVP for LSP Tunnels", IETF RFC 3209, December 2001
- Rosen, E., et al., "BGP/MPLS VPNs", IETF RFC 2547, March 1999
- Jamoussi, B., et al., "Constraint-Based LSP Setup using LDP", IETF RFC 3212, January 2002
- Rekhter, Tappan, Sangli, "BGP Extended Communities Attribute", May 2002, draft-ietf-idr-bgp-ext-communities-05.txt