ACTIVE RELIABLE MULTICAST: HOW IT WORKS, HOW IT CAN BE USED ON COMPUTATIONAL GRIDS

1
ACTIVE RELIABLE MULTICAST: HOW IT WORKS, HOW IT
CAN BE USED ON COMPUTATIONAL GRIDS
  • Congduc PHAM
  • SUN's "Gourmandise Cérébrale"
  • SUN Labs Europe,
  • Thursday, February 14th, 2002

http://www.ens-lyon.fr/LIP/RESAM
2
Outline
  • Introduction
  • How it works
  • How it can be used on computational grids

3
Everybody's talking about multicast! Really
annoying! Why would I need multicast, by the
way?
[Figure: speech bubbles repeating "multicast!"]
4
Challenges for the Internet
Think about
  • high-speed www
  • video-conferencing
  • video-on-demand
  • interactive TV programs
  • remote archival systems
  • telemedicine, whiteboard
  • high-performance computing, grids
  • virtual reality, immersion systems
  • distributed interactive simulations/gaming

5
From unicast
  • Problem: sending the same data to many receivers
    via unicast is inefficient
  • Example: popular WWW sites become serious
    bottlenecks

[Figure: a sender transmitting a separate copy of the data to each receiver]
6
to multicast on the Internet.
  • Not n-unicast from the sender perspective
  • Efficient one-to-many data distribution
  • Towards low latency, high bandwidth

[Figure: multicast delivery of the data from one sender to three receivers]
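
As a rough illustration of the difference between the two slides above, the sketch below counts how many copies of one packet cross the links of a small distribution tree under n-unicast and under multicast. The topology and node names are hypothetical and are not taken from the slides.

```python
# Hypothetical distribution tree, stored as child -> parent. "S" is the sender.
PARENT = {"A": "S", "B": "A", "C": "A", "R1": "B", "R2": "B", "R3": "C"}
RECEIVERS = ["R1", "R2", "R3"]


def path_to_sender(node):
    """Return the links (child, parent) between a node and the sender."""
    links = []
    while node in PARENT:
        links.append((node, PARENT[node]))
        node = PARENT[node]
    return links


def unicast_cost(receivers):
    """n-unicast: one copy per receiver on every link of its path."""
    return sum(len(path_to_sender(r)) for r in receivers)


def multicast_cost(receivers):
    """Multicast: one copy per link of the tree, however many receivers lie below."""
    return len({link for r in receivers for link in path_to_sender(r)})


print("unicast copies on links:  ", unicast_cost(RECEIVERS))
print("multicast copies on links:", multicast_cost(RECEIVERS))
```

In this toy tree the link leaving the sender carries three copies with unicast and a single copy with multicast, which is the bottleneck effect the slides point at.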
7
User perspective of the Internet
from UREC, http://www.urec.fr
8
What it is in reality
from UREC, http://www.urec.fr
9
Links: the basic element in networks
  • Backbone links
      • optical fibers
      • 10 to 160 Gbits/s with DWDM techniques
  • End-user access
      • V.90 56 Kbits/s modem on twisted pair
      • 512 Kbits/s to 2 Mbits/s with xDSL modem
      • 1 Mbits/s to 10 Mbits/s cable modem
      • 64 Kbits/s to 1930 Kbits/s ISDN access
      • 9.6 Kbits/s (GSM) to 2 Mbits/s (UMTS)
      • 155 Mbits/s to 1 Gbits/s SDH

10
Routers: key elements of internetworking
  • Routers
      • run routing protocols and build routing tables,
      • receive data packets and perform relaying,
      • may have to consider Quality of Service
        constraints for scheduling packets,
      • are highly optimized for packet forwarding
        functions.

11
The Wild Wild Web
heterogeneity, link failures, congested
routers: packet loss, packet drop, bit errors
[Figure: what happens to important data?]
12
Multicast difficulties
  • At the routing level
      • management of the group address (IGMP)
      • dynamic nature of the group membership
      • construction of the multicast tree (DVMRP, PIM,
        CBT)
      • multicast packet forwarding
  • At the transport level
      • reliability, loss recovery strategies
      • flow control
      • congestion avoidance

13
Reliable multicast
  • What is the problem of loss recovery?
      • feedback (ACK or NACK) implosion
      • duplication of replies/repairs
      • difficult adaptation to dynamic membership
        changes
  • Design goals
      • reduce recovery latencies
      • reduce the feedback traffic
      • improve recovery isolation

14
Active Reliable Multicast
How does it work?
15
What is active networking?
  • Programmable nodes/routers
  • Customized computations on packets
  • Standardized execution environment and
    programming interface
  • No killer applications, only a different way to
    offer high-value services, in an elegant manner
  • However, adds extra processing cost

16
Motivations behind active networking
  • user applications can implement and deploy
    customized services and protocols
      • specific data filtering criteria (DIS, HLA)
      • fast collective and gather operations
  • better overall performance by reducing the
    amount of traffic
      • high throughput
      • low end-to-end latency

17
Active networks implementations
  • Discrete approach (operator's approach)
      • Adds dynamic deployment features in nodes/routers
      • New services can be downloaded into the router's
        kernel
  • Integrated approach
      • Adds executable code to data packets
      • Capsule = data + code
      • Granularity set to the packets

18
The discrete approach
  • Separates the injection of programs from the
    processing of packets

19
The integrated approach
  • User packets carry code to be applied on the data
    part of the packet
  • High flexibility to define new services

20
An active router
some layer for executing code. Let's call it
Active Layer
21
Solutions for Reliable Multicast
  • Traditional
      • end-to-end retransmission schemes
      • scoped retransmission with the TTL fields
      • receiver-based local NACK suppression
  • Active contributions
      • cache of data to allow local recoveries
      • feedback aggregation
      • subcast

22
A step toward active services: LBRM
23
Active local recovery
  • routers cache data packets
  • repair packets are sent by routers when the
    requested packet is available in their cache

[Figure: data packets data1 to data5 cached in routers along the multicast tree]
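
The local recovery idea on this slide can be sketched as a router that keeps the most recent packets and answers a NACK itself when it can. This is only a toy model: the class name, the fixed cache capacity and the oldest-first eviction policy are assumptions, not details given in the slides.

```python
from collections import OrderedDict


class CachingRouter:
    """Toy active router that keeps the most recent data packets (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # seq -> payload, oldest entry first

    def on_data(self, seq, payload):
        """Cache a packet on its way downstream, evicting the oldest if full."""
        self.cache[seq] = payload
        self.cache.move_to_end(seq)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def on_nack(self, seq):
        """Repair locally if the packet is still cached, else pass the NACK upstream."""
        if seq in self.cache:
            return ("repair", seq, self.cache[seq])
        return ("forward_nack", seq, None)


router = CachingRouter(capacity=3)
for seq in range(1, 6):
    router.on_data(seq, f"data{seq}")

print(router.on_nack(4))  # still cached: repaired by the router
print(router.on_nack(1))  # evicted: the NACK continues towards the source
```

The capacity parameter makes the trade-off visible: a larger cache repairs more losses locally but costs router memory.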
24
Global NACKs suppression
25
Local NACKs suppression
26
Active subcast features
  • Send repair packet only to the relevant set of
    receivers

27
Active Reliable Multicast
How can it be used?
  • Computational grids
  • The DyRAM framework
  • Some simulation results
  • Conclusions and perspectives

GRID?
28
What is a computational grid?
application user
from Dorian Arnold Netsolve Happenings
29
Some grid applications
  • Astrophysics: black holes, neutron stars,
    supernovae
  • Mechanics: fluid dynamics, CAD, simulation
  • Distributed interactive simulations: DIS, HLA,
    training
  • Chemistry & biology: molecular simulations,
    genomic simulations
30
Reliable multicast: a big win for grids
  • Data replications
  • Code & data transfers, interactive job submissions
  • Data communications for distributed applications
    (collective & gather operations, sync. barrier)
  • Databases, directory services

[Figure: grid sites joined to multicast address group 224.2.0.1: SDSC IBM SP, 1024 procs (5x12x17 = 1020); NCSA Origin Array, 256+128+128 procs (5x12x(4+2+2) = 480); CPlant cluster, 256 nodes]
31
From reliable multicast to Nobel prize!
OK! Resource Estimator says: need 5 TB, 2 TF. Where
can I do this?
Resource Broker: LANL is best match but down for
the moment
Resource Broker: 7 sites OK, but need to send
data fast
From: President@earth.org
Congratulations, you have done a great job, it's
the discovery of the century!! The phenomenon was
short but we managed to react quickly. This would
not have been possible without efficient multicast
facilities to enable quick reaction and fast
distribution of data. Nobel Prize is on the way :-)
32
Multicast communications on grids
  • Dynamic groups are very difficult to handle with
    the reliability constraint
  • Mixture of high-throughput (data replication) and
    low latencies (distributed applications) needs
  • The application under consideration can have a
    great impact on the protocol design (i.e. local
    recoveries)
  • A one-protocol-fits-all solution is difficult!

33
The DyRAM framework (M. Maimour)
  • Receiver-based use of NACKs.
  • No cache in routers: receivers perform local
    recoveries, which are based on a tree structure
    constructed on a per-packet basis.
  • Routers play an active role.
  • Low-overhead active services
  • Focus on low latency
  • Load balancing features

34
Where to put active components?
[Figure: a server and active routers placed at the edges of a Gbits-rate core network, with 1000 Base FX and 100 Base TX access links]
35
Related works on local recovery
  • SRM: any receiver in the neighborhood
  • RMTP, TMTP, LMS, PGM, TRAM: a designated receiver
  • LBRM: a logging server

36
Active services in DyRAM
  • Designed to provide low latencies
  • Session initialization
  • Early packet loss detection
  • NACK aggregation
  • Subcast of repair packets
  • Dynamic replier election

37
DyRAM and IP multicast
  • Relies on IP multicast but has few interactions
    with it
  • Runs its own simple session protocol to gather
    additional topological information at the DyRAM
    level, to compensate for the group anonymity
    imposed by IP multicast

38
DyRAM session initialization
[Figure: session tree with DyRAM routers D0 (interfaces 0, 1, 2) and D1 (interfaces 0, 1) and receivers R1 to R7]
39
How and where losses can occur
  • Packet losses occur mainly in edge routers
  • In this case, all downstream links would most
    likely be affected by a packet loss
  • On medium speed LAN, when a packet has been sent
    on the wire all computers will usually be able to
    receive it
  • On very high-speed LAN, computers can be the
    bottleneck

40
DyRAM early packet loss detection
  • The repair latency can be reduced if the lost
    packet is requested as soon as possible
  • DyRAM realizes this functionality by enabling
    some routers to detect losses and therefore to
    generate NACKs towards the source
  • This loss detection service should be located
    near the source, but not too near!

41
DyRAM replier election
  • A receiver is elected to be a replier for each
    lost packet
  • Several recovery trees at a given time
  • Load balancing can be taken into account, several
    optimizations possible
  • Uses the topological information gathered during
    the session initialization

42
DyRAM replier election
[Figure: replier election for "Repair 2" in the session tree with DyRAM routers D0 and D1 and receivers R1 to R7]
Table at D0: @R1, vif 1; @R2, vif 2; @R3, vif 2; @R4, vif 2; @D1, vif 0
Table at D1: @R5, vif 1; @R6, vif 1; @R7, vif 0
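
A minimal sketch of the election step, using the first address/interface table of the figure (taken here to be the one held by router D0). The selection rule is an assumption: the replier is chosen among nodes reached through a link that did not report the loss, the least-loaded candidate wins, and the function and variable names are hypothetical.

```python
from collections import Counter

# Node -> downstream interface (vif) at router D0, as read from the figure.
D0_TABLE = {"R1": 1, "R2": 2, "R3": 2, "R4": 2, "D1": 0}


def elect_replier(table, nack_vifs, load):
    """Pick a replier reachable through a link that did not report the loss.

    table     -- node -> vif, learned during session initialization
    nack_vifs -- set of vifs from which a NACK for this packet arrived
    load      -- Counter of repairs already requested from each node
    Returns the elected node, or None if the NACK must go upstream.
    """
    candidates = [n for n, vif in table.items() if vif not in nack_vifs]
    if not candidates:
        return None
    # Assumed load balancing: least-loaded candidate, ties go to table order.
    replier = min(candidates, key=lambda n: load[n])
    load[replier] += 1
    return replier


load = Counter()
print(elect_replier(D0_TABLE, {1}, load))        # loss reported on vif 1 only
print(elect_replier(D0_TABLE, {1}, load))        # next loss: another replier
print(elect_replier(D0_TABLE, {0, 1, 2}, load))  # every link lost it: None
```

Because a different node can be elected for each lost packet, several recovery trees exist at the same time, as the previous slide notes.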
43
DyRAM subcasting
  • Tries to solve the exposure problem
  • Using the NACK pattern to select relevant links
    cannot avoid exposure
  • Use of IP addresses is more costly but allows for
    an exact matching
  • Several optimizations possible, including a
    dynamic selection of the appropriate mechanism
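
The two subcast options can be compared on a small example. The table below is illustrative, and the model is deliberately simple: it assumes a repair sent on a link reaches every receiver behind that link, whereas an address-based repair reaches only the receivers that sent a NACK.

```python
# Receiver -> downstream link (vif) at one router; values are illustrative.
TABLE = {"R1": 1, "R2": 2, "R3": 2, "R4": 2}


def subcast_by_links(table, nackers):
    """The repair goes out on every link a NACK came from."""
    links = {table[r] for r in nackers}
    return {r for r, vif in table.items() if vif in links}


def subcast_by_addresses(table, nackers):
    """The repair goes only to the addresses that sent a NACK."""
    return set(nackers) & set(table)


def exposure(reached, nackers):
    """Receivers that get a repair they never asked for."""
    return reached - set(nackers)


nackers = {"R2"}
print(sorted(exposure(subcast_by_links(TABLE, nackers), nackers)))
print(sorted(exposure(subcast_by_addresses(TABLE, nackers), nackers)))
```

The address-based variant needs the router to store one entry per requesting receiver instead of one per link, which is the extra cost the slide mentions.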

44
Routers soft state
  • The NACK State (NS) structure, which maintains
    for each lost packet:
      • seq: the sequence number of the requested
        packet.
      • rank: the number of NACKs received.
      • subList: the list of links from which similar
        NACKs arrived (or of IP addresses).
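
The NS structure maps naturally onto a small record plus a table indexed by sequence number. The sketch below keeps the three fields of the slide; the aggregation rule (forward only the first NACK for a packet) and the method names are assumptions, and soft-state timeouts are left out.

```python
from dataclasses import dataclass, field


@dataclass
class NackState:
    seq: int                                     # requested sequence number
    rank: int = 0                                # number of NACKs received
    sub_list: set = field(default_factory=set)   # links the NACKs came from


class NackAggregator:
    """Per-router table of NACK states (hypothetical helper)."""

    def __init__(self):
        self.states = {}  # seq -> NackState

    def on_nack(self, seq, link):
        """Record a NACK; return True only if it should be forwarded."""
        ns = self.states.get(seq)
        first = ns is None
        if first:
            ns = self.states[seq] = NackState(seq)
        ns.rank += 1
        ns.sub_list.add(link)
        return first

    def on_repair(self, seq):
        """Return the links to subcast the repair on, then drop the state."""
        ns = self.states.pop(seq, None)
        return sorted(ns.sub_list) if ns else []


agg = NackAggregator()
print(agg.on_nack(7, link=1))  # first NACK for packet 7
print(agg.on_nack(7, link=2))  # aggregated with the first one
print(agg.on_repair(7))        # links that asked for packet 7
```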

45
Routers soft state (cont.)
  • The Track List (TL) structure, which maintains
    for each multicast session:
      • lastOrdered: the sequence number of the last
        packet received in order
      • lastReceived: the sequence number of the last
        received data packet
      • lostList: a bit vector that keeps track of
        received packets
  • Reduces the replier election delay.
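
One possible reading of the TL structure is sketched below. The three fields follow the slide; the update rules are assumptions: sequence numbers start at 1, bit i of the bit vector is set while packet i is missing, and a gap in the sequence numbers is reported as a loss as soon as it is seen.

```python
class TrackList:
    """Per-session tracking state (field names follow the slide)."""

    def __init__(self):
        self.last_ordered = 0   # last packet received in order
        self.last_received = 0  # highest sequence number seen so far
        self.lost_list = 0      # bit i set <=> packet i is missing (assumption)

    def on_data(self, seq):
        """Update the state; return sequence numbers newly detected as lost."""
        newly_lost = []
        if seq > self.last_received:
            # Every skipped sequence number is marked as missing.
            for missing in range(self.last_received + 1, seq):
                self.lost_list |= 1 << missing
                newly_lost.append(missing)
            self.last_received = seq
        else:
            # A repair or a late packet fills a hole.
            self.lost_list &= ~(1 << seq)
        # Advance last_ordered over the contiguous prefix now received.
        while (self.last_ordered < self.last_received
               and not (self.lost_list >> (self.last_ordered + 1)) & 1):
            self.last_ordered += 1
        return newly_lost


tl = TrackList()
for seq in (1, 2, 5, 3, 4):
    lost = tl.on_data(seq)
    print(seq, lost, tl.last_ordered, tl.last_received)
```

A router holding such a structure knows which packets passed through it, so it can detect a loss early without waiting for a receiver to report it.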

46
DyRAM overview
  • One benefit of active networking is to unload the
    source from heavy retransmission overheads.
  • The backbone is fast, very fast (DWDM, 10 Gbits/s
    is not uncommon), so it performs nothing other
    than fast forwarding functions.
  • The active router associated with the source can
    perform early processing on packets. For instance,
    our DyRAM protocol uses subcast and loss
    detection facilities in order to reduce the
    end-to-end latency.
  • Any receiver can be designated as a replier for a
    lost packet. The election is performed by the
    associated upstream active router on a per-packet
    basis. Therefore several loss recovery trees can
    co-exist in parallel at a given time.
  • A hierarchy of active routers can be used for
    processing specific functions at different layers
    of the hierarchy. For instance, an active router
    at the nearest location from the
    source/destination could perform very efficient
    NACK packet suppression.
  • DyRAM can increase performance by associating a
    dedicated active router with a pool of computing
    resources.

[Figure: the source and active routers around a Gbits-rate core network, with 1000 Base FX and 100 Base TX access links]
47
Some simulation results
  • Network model and used metrics
  • Local recovery from the receivers
  • DyRAM vs. ARM
  • DyRAM combined with cache at routers

48
Network model
10 MBytes file transfer
49
Metrics
  • Load at the source: the number of
    retransmissions from the source.
  • Load at the network: the consumed bandwidth.
  • Completion time per packet (latency).

50
Local recovery from the receivers (1)
4 receivers/group
  • Local recoveries reduce the load at the source
    (especially for high loss rates and a large
    number of receivers).

p = 0.25
grp: 6 to 24
51
Local recovery from the receivers (2)
  • As the group size increases, doing the
    recoveries from the receivers greatly reduces the
    bandwidth consumption

48 receivers distributed in g groups, grp: 2 to 24
52
Local recovery from the receivers (3)
4 receivers/group
  • Local recoveries reduce the end-to-end delay
    (per packet)

grp: 6 to 24
p = 0.25
53
DyRAM vs ARM
  • ARM performs better than DyRAM only for very low
    loss rates and with considerable caching
    requirements

54
DyRAM with cache at the routers (1)
  • When DyRAM benefits from the cache at the routers
    in addition to the recovery from the receivers,
    it always performs better than ARM.

ARM without cache
p = 0.25
55
DyRAM with cache at the routers (2)
  • When DyRAM benefits from the cache at the routers
    in addition to the recovery from the receivers,
    it always performs better than ARM.

ARM without cache
p = 0.25
56
DyRAM early loss detection
4 receivers/group, grp: 6 to 24
p = 0.25
p = 0.5
57
Conclusions
  • Reliability in large-scale multicast sessions is
    difficult. Active services can provide efficient
    solutions for avoiding implosion and exposure.
  • The main design goal of DyRAM is to reduce the
    end-to-end delays (recovery for instance) to
    enable large distributed applications on
    computational grids.

58
References
  • D. L. Tennenhouse, J. M. Smith, W. D. Sincoskie,
    D. J. Wetherall, and G. J. Minden. A survey of
    active network research. IEEE Communications
    Magazine, pages 80-86, January 1997.
  • L. H. Lehman, S. J. Garland, and D. L.
    Tennenhouse. Active reliable multicast. IEEE
    INFOCOM'98, March 1998.
  • M. Maimour, C. Pham. A Throughput Analysis of
    Reliable Multicast Protocols in an Active
    Networking Environment. IEEE ISCC'2001, Hammamet,
    Tunisia.