1
Virtual and Redundant Switches
  • IRAM Retreat Winter 2001
  • Sam Williams

2
Outline
  • Motivation
  • Existing Products
  • Arrayed Commodity Switches
  • Adding Redundancy
  • Optimizing
  • Generalization
  • Conclusions

3
Motivation
  • Switch cost grows very quickly with port count
  • O(Ports²) for crossbar-based designs
  • Additionally, address tables and buffers must
    grow
  • Industry-leading MTBF for a single switch is
    about 50K hours
  • and typical is perhaps only 25K
  • Modular switches provide redundancy for
    management and power, but not for the data
    transport fabric.
  • MTTR is typically over 1 hour
  • Can the money saved by cascading commodity
    switches be applied toward improved performance
    or redundancy?
  • The goals are to improve MTBF, improve
    performance, and simplify the work that must be
    done to replace a failed switch.

4
Existing Products
  • Existing modular aggregators can merge several
    smaller switches (modules) into a single large
    virtual switch.
  • In this case, each 36-port switch module has a
    pair of gigabit uplinks to the switching fabric,
    which has either 6 or 24 gigabit ports (full
    duplex).
  • Redundancy is also provided for management
    modules, fans, and power supplies —
  • however, not for switch modules or the
    switching fabric.
  • So if the switching fabric fails, the entire
    device fails; but if an individual switching
    module fails, then only that sub-network
    fails.
  • Management modules can assign priority to
    improve performance for critical activity.

3Com Switch 4007
5
Existing Products (Analysis)
  • The cost analysis here is based on either
    18 or 48 Gbps switching fabrics, 36-port
    switching modules, and either a 7- or 13-bay
    chassis.
  • Performance is the slowdown in the time to send
    from every node to every other node, compared
    to a true n×36-port switch.
  • MTBF is for a failure anywhere in the network.
  • MTTR was at least 1 hour.
  • Repair cost is about $4000/failure;
    modularization helps to keep this low, but
    yearly maintenance cost will grow with the
    number of ports.

6
Examples of failure
  • A switching module fails: each of the
    nodes/sub-networks attached to it is now
    disconnected from all other nodes.
  • This is the more likely case.
  • The switching fabric fails: each of the switches
    is now disconnected from the others, but nodes
    attached to the same switch can still
    communicate with each other.

7
Examples of failure (continued)
  • Redundancy allows for this failure, with reduced
    performance.
  • These are not commodity switches, and are
    considerably more expensive.
  • However, in this case, the failure does cause a
    network split.
  • This is the more likely case, so why not allow
    the extra switch to be used to cover any other
    switch's failure?
  • This could be extended to nodes, but then you
    pay double for NICs and ports.

8
Virtual switch from commodity switches
  • Although without the management functions and
    performance, cheaper virtual switches can be
    built by doing nothing more than cascading
    commodity switches.
  • This is based on 5-, 8-, 16-, and 24-port
    switches, each with the last port of MDI type,
    from 5 different companies.
  • Performance is poor since the uplinks are only
    100 Mbps.
  • Adding a second uplink port only moderately
    alleviates this deficiency.
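A rough bottleneck model (my own illustration, with assumptions not in the slides) shows why 100 Mbps uplinks dominate all-to-all performance: most of each node's traffic is destined off-switch and must squeeze through the shared uplinks.

```python
# Illustrative model, assumptions mine: `switches` leaf switches of
# `ports` nodes each, node ports at `port_bw`, and `uplinks` shared
# uplinks per leaf at `uplink_bw`. In all-to-all traffic, the fraction
# of a node's destinations that sit off-switch must cross the uplinks.

def all_to_all_slowdown(switches, ports, port_bw, uplinks, uplink_bw):
    total = switches * ports
    off_switch_frac = (total - ports) / (total - 1)  # off-switch share
    demand = ports * port_bw * off_switch_frac       # offered uplink load
    supply = uplinks * uplink_bw
    return max(1.0, demand / supply)                 # 1.0 = no slowdown

# 8 cascaded 24-port 100 Mbps switches, one 100 Mbps uplink each:
print(all_to_all_slowdown(8, 24, 100, 1, 100))   # roughly a 21x slowdown
# The same topology with one gigabit uplink per leaf:
print(all_to_all_slowdown(8, 24, 100, 1, 1000))  # close to 2x
```

Under this model a second 100 Mbps uplink only halves the bottleneck, matching the observation that it merely moderates the deficiency.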

9
Virtual switch from mid-range switches
  • By using switches more suited to this design
    (with higher-speed uplinks), we can improve
    performance.
  • These designs use an 8- or 24-port switch at the
    bottom, each with 1 or 2 gigabit uplink modules,
    and a 4-, 8-, or 12-port gigabit switch at the
    top.
  • The gigabit uplinks and gigabit switches drive
    cost to at least twice that of the commodity
    solution, but with 10x better performance.
  • Performance is near that of a monolithic switch
    if 2 uplinks are used.
  • Compared to the packaged solution, it is about
    half the cost, with slightly less performance,
    but no management functionality.

10
Port Virtualization for Redundancy
  • The re-mapping stage is much simpler than a full
    n×m-port switch. Essentially, each of the m
    n-bit busses is mapped to one of the k n-bit
    internal busses, which are connected directly to
    the switches.
  • In this example, each of the 4 groups of 8
    virtual ports is mapped to one of the 5 groups
    of physical ports. The uplinks of the
    first-stage switches are sent back and into one
    of the top-level switches.
  • An even simpler solution, for single redundancy,
    would be to map either directly or to the spare.
  • In this design, the single point of failure is
    the re-mapping block, since the first- and
    second-level switches have redundancy.
  • For the example below, MTBF improves from about
    208 days to 347 days.
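The remapping logic described above can be sketched in a few lines. This is an illustration under my own naming, not the actual design: virtual port groups start on an identity mapping, and a failed physical group is swapped for a spare.

```python
# Hedged sketch of the re-mapping stage: m virtual port groups steered
# onto k physical groups (k > m; the extras are spares). Class and
# method names are mine, for illustration only.

class PortRemapper:
    def __init__(self, virtual_groups, physical_groups):
        assert physical_groups >= virtual_groups
        self.spares = list(range(virtual_groups, physical_groups))
        # Identity mapping to start: virtual group i -> physical group i.
        self.mapping = {v: v for v in range(virtual_groups)}

    def on_switch_failure(self, physical_group):
        """Remap whatever virtual group used the failed switch to a spare."""
        for v, p in self.mapping.items():
            if p == physical_group:
                if not self.spares:
                    raise RuntimeError("no spare left: network splits")
                self.mapping[v] = self.spares.pop()
                return self.mapping[v]
        return None  # the failed switch was an unused spare

# 4 groups of 8 virtual ports over 5 physical groups (1 spare), as above:
r = PortRemapper(4, 5)
r.on_switch_failure(2)  # physical group 2 dies; its load moves to group 4
print(r.mapping)        # {0: 0, 1: 1, 2: 4, 3: 3}
```

Note that once the lone spare is consumed, any further failure splits the network, matching the operational slides that follow.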

11
Operation (Homogeneous Switches)
  • In this somewhat rigid example, there are 6
    bays: 4 map either directly or to the spare,
    one slot holds the switching fabric, and one
    slot holds the redundant switch, which can
    replace either of the other two classes.
  • In this case, the switching-fabric switch
    failed, and the uplink ports were remapped to
    the spare.
  • At this point the admin must replace the failed
    switch. If any other switch fails before then,
    the network will be partially split.

12
Operation - continued
  • In this case, one of the first-level switches
    failed. Instead of those nodes losing their
    connection to the rest of the network, they are
    remapped to the spare.
  • Once again, the admin must replace the failed
    switch. If any other switch fails before then,
    the network will be partially split.
  • If instead the spare itself had gone down, it
    would need to be replaced to restore redundancy.

13
Port Virtualization for Higher Performance
  • The previous performance analysis was based on
    1-to-all messaging.
  • However, it is likely that network access
    patterns can be broken into groups with high
    inter-node communication.
  • Thus, monitoring can be performed, and the
    network can be periodically partitioned into
    activity groups.
  • Create a graph weighted by the bandwidth used
    between nodes, then use something like
    Kernighan-Lin partitioning to separate it into
    a number of partitions equal to the number of
    first-stage switches (a power of 2).
  • The re-mapping stage is only slightly simpler
    than a full n×m-port switch (no buffers, never
    any contention, etc.).

Logical view: 3 switches reserved as spares; 1
failed, and the network was repartitioned.
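The bandwidth-based partitioning step can be illustrated with a toy stand-in. The slides suggest something like Kernighan-Lin; the greedy bisection below is a deliberately simplified substitute of my own, just to show the shape of the problem (traffic graph in, balanced node groups out).

```python
# Hedged illustration: split nodes into two balanced groups so that
# heavily communicating pairs land on the same switch. A real design
# would use Kernighan-Lin; this greedy pass is a simple stand-in.

def greedy_bisect(bandwidth):
    """bandwidth: {(a, b): observed_traffic}. Returns two balanced sets."""
    nodes = sorted({n for pair in bandwidth for n in pair})
    half = len(nodes) // 2
    left, right = set(), set()

    def affinity(n, side):
        # Traffic between node n and the nodes already placed on `side`.
        return sum(w for (a, b), w in bandwidth.items()
                   if (a == n and b in side) or (b == n and a in side))

    for n in nodes:
        if len(left) >= half:                  # keep the halves balanced
            right.add(n)
        elif len(right) >= len(nodes) - half:
            left.add(n)
        elif affinity(n, left) >= affinity(n, right):
            left.add(n)
        else:
            right.add(n)
    return left, right

# a<->b and c<->d talk heavily; cross traffic is light:
traffic = {("a", "b"): 900, ("c", "d"): 800, ("a", "c"): 10, ("b", "d"): 5}
left, right = greedy_bisect(traffic)
print(sorted(left), sorted(right))  # ['a', 'b'] ['c', 'd']
```

Recursing on each half would yield the power-of-2 partition count the slide calls for, one partition per first-stage switch.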
14
Performance / Availability
  • MTTR for aggregators was typically over an hour,
    on top of the time to detect the failure.
  • By automating recovery, the downtime can be
    significantly reduced.
  • This is dependent on timely detection of a
    failed switch, which could be handled via
    packet injection.
  • Once the failing switch is identified, a new
    mapping can quickly be determined.
  • In the performance-optimizing case, restoring
    connectivity is the top priority; a previously
    scheduled performance repartitioning can be
    done later.
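Detection by packet injection might look like the loop below. This is a sketch under my own assumptions: `probe` stands in for whatever mechanism actually injects a packet through a switch and observes whether it returns, and the miss threshold is arbitrary.

```python
# Hedged sketch of failure detection via packet injection: poll every
# switch each round and declare a failure after several consecutive
# missed probes. `probe` and `on_failure` are illustrative stand-ins.

def monitor(switches, probe, on_failure, rounds, misses_allowed=3):
    """Poll each switch `rounds` times; report a switch once it has
    missed `misses_allowed` probes in a row."""
    misses = {s: 0 for s in switches}
    for _ in range(rounds):
        for s in switches:
            if probe(s):               # True if the injected packet came back
                misses[s] = 0
            else:
                misses[s] += 1
                if misses[s] == misses_allowed:
                    on_failure(s)      # e.g. trigger a remap to the spare

# Simulate switch "B" going silent:
failures = []
monitor(["A", "B", "C"], lambda s: s != "B", failures.append, rounds=5)
print(failures)  # ['B']
```

The consecutive-miss threshold trades detection latency against false positives from transient probe loss; a real deployment would tune both it and the polling interval.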

15
Generalization
  • Use homogeneous switches, with a mapping layer
    that maps physical to virtual ports. This can
    range from a simple 1-to-2 mapping to a complex
    1-to-n mapping with performance monitoring and
    repartitioning. Performance can be gained by
    using some faster switches where needed.

16
Conclusion
  • It is possible to make a larger virtual switch
    out of smaller switches, and still get
    reasonable performance.
  • With little additional hardware and a monitoring
    agent, it is possible to make it fault tolerant,
    with several spare switches that can be
    automatically swapped in. The simple case costs
    O(Spares × Ports);
  • more complex designs make it O(Ports²).
  • With a very simple but large switch, it is also
    possible to optimize for performance by
    balancing network bandwidth among the switches
    in the pool. This is a much more costly
    solution.
  • A generalization would provide a pool of
    switches connected by the port mapper, with
    some (or none) reserved as spares.
  • Both of these concepts and their functionality
    could be integrated into a single ASIC, or even
    implemented with a network processor.

17
Future Work
  • How do switches fail? This determines the
    failure detection method.
  • Implementation of a type 1 or 2 switch would be
    possible given the relatively simple mapper.
  • Type 3, 4, or 5 would require a complex ASIC,
    which should be replaced with a network
    processor and software.