Title: Long Fat Pipe Congestion Control for Multi-Stream Data Transfer


1
Long Fat Pipe Congestion Control for Multi-Stream
Data Transfer
  • M. Nakamura1, H. Kamezawa1, J. Tamatsukuri1, M.
    Inaba1, K. Hiraki1, K. Mizuguchi2, K. Torii2, S.
    Nakano2, S. Yoshita2, R. Kurusu2, M. Sakamoto2, Y.
    Furukawa2, T. Yanagisawa2, Y. Ikuta2, J.
    Shitami3, A. Zinzaki3
  • 1)Univ. of Tokyo, 2)Fujitsu Computer Techs,
    3)Fujitsu Labs

2
Outline of this talk
  • Background of Data Reservoir project
  • Observations at BWC2002
  • Modifications to TCP congestion control
  • Transmission Rate Controlled TCP for DR
  • IPG tuning
  • Cooperative parallel streams for DR
  • Dulling Edges of streams
  • Results at BWC2003

3
Objectives of Data Reservoir
  • Sharing scientific data between distant research
    institutes
  • Physics, astronomy, earth science, simulation
    data
  • Very high-speed single file transfer on Long Fat
    pipe Network (LFN)
  • High utilization of available bandwidth
  • OS and filesystem transparency
  • Storage level data sharing
  • High speed iSCSI protocol on TCP

4
Features of Data Reservoir
  • Data sharing in low-level protocol
  • Use of iSCSI protocol
  • Efficient disk to disk data transfer
  • Multi-level striping for performance scalability
  • Local file accesses through LAN
  • Global disk transfer through WAN
  • Unified by iSCSI protocol

5
File accesses on Data Reservoir
6
Global disk transfer on Data Reservoir
7
Observations at BWC2002
8
Results of SC2002 BWC
  • 550 Mbps, 91% utilization
  • Bottleneck OC-12, RTT 200 ms
  • Parallel normal TCP streams
  • 24 nodes x 2 streams
  • Most Efficient Use of Available Bandwidth award

9
Observations of SC2002 BWC
  • But:
  • Poor performance per stream
  • Packet loss hits a stream too early, during slow start
  • TCP congestion control recovers the window too slowly
  • Imbalance among parallel streams
  • Packet loss occurs asynchronously and unfairly
  • Slow streams can't catch up with fast streams

10
Transmission rate affects performance
  • Transmission rate is important
  • Fast Ethernet > GbE
  • Fast Ethernet is ultra-stable
  • GbE is too unstable and poor on average
  • Measured with Iperf, 30-second runs
  • Bottleneck: 596 Mbps
  • Background traffic: 10%
  • 200 ms RTT

11
Analyzing microscopic behavior
  • DR Giga Analyzer
  • 1 scatter node, 8 storage machines
  • Scatter: Comet i-NIC
  • 100 ns resolution, GPS timestamp
  • GbE, bi-directional
  • Full-packet, full-data logging in 2 hours

12
Slow start makes burst
  • Slow start
  • Doubles the window of data every RTT
  • Sends the whole window as a burst at the beginning of every RTT
  • Packet loss occurs even though there is a huge idle period
  • Packets are sent in 20 ms; nothing happens in the remaining 180 ms (see the sketch below)

Packet loss occurred
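A minimal sketch of this effect, assuming a GbE sender, a 1500B MTU and a 200 ms RTT (illustrative values, not measurements from the experiment): the window doubles each RTT, the whole window leaves the host at line rate, and the burst occupies only a small fraction of the RTT.

/* Sketch: slow start doubles cwnd every RTT and emits the whole window
 * at line rate; the line rate, MTU and RTT below are assumed values. */
#include <stdio.h>

int main(void)
{
    const double line_rate_bps = 1e9;   /* GbE sender (assumption) */
    const double rtt_s = 0.200;         /* 200 ms RTT */
    const int mtu_bytes = 1500;
    double cwnd_pkts = 1.0;

    for (int rtt = 1; rtt <= 13; rtt++) {
        cwnd_pkts *= 2.0;               /* slow start: double the window per RTT */
        double burst_ms = cwnd_pkts * mtu_bytes * 8.0 / line_rate_bps * 1e3;
        printf("RTT %2d: cwnd %6.0f pkts, burst %7.2f ms, idle %7.2f ms\n",
               rtt, cwnd_pkts, burst_ms, rtt_s * 1e3 - burst_ms);
    }
    return 0;
}

Around the eleventh RTT the burst lasts roughly 25 ms of a 200 ms RTT, which matches the "packets in 20 ms, nothing for 180 ms" pattern seen on the analyzer.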
13
What resolves these problems?
  • Modifications to TCP congestion control
  • Packet spacing
  • Reducing the aggressiveness of TCP's bursty behavior
  • Transmission Rate Controlled TCP (TRC-TCP)
  • Cooperative parallel streams
  • Sharing the congestion window among streams
  • Balancing the throughputs of the streams
  • Dulling Edges of Cooperative Parallel Streams

14
Transmission Rate Controlled TCP
  • The ideal story
  • Transmit one packet every RTT/cwnd
  • A 24 us interval for 500 Mbps with MTU 1500B (see the sketch below)
  • Too high a load for a software-only implementation
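The 24 us figure follows directly from the MTU and the target rate; a small sketch of the arithmetic (the helper name is illustrative, not part of the implementation):

/* Sketch: pacing one MTU-sized packet every RTT/cwnd is equivalent to one
 * packet every mtu_bits / target_rate; helper name is illustrative. */
#include <stdio.h>

static double pacing_interval_us(double target_rate_bps, int mtu_bytes)
{
    return mtu_bytes * 8.0 / target_rate_bps * 1e6;
}

int main(void)
{
    /* 1500B * 8 / 500 Mbps = 24 us, the interval quoted above */
    printf("%.0f us\n", pacing_interval_us(500e6, 1500));
    return 0;
}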

15
IPG tuning
  • The Inter Packet Gap (IPG) of the Ethernet MAC layer
  • A time gap between packets
  • 8-1023B, in 1B (8 ns) steps in the case of the Intel e1000
  • TCP stream: 941 Mbps -> 567 Mbps
  • Fine-grained, low jitter, low overhead (sketched below)

[Figure: extending the Inter Packet Gap (IPG) between Ethernet frames]
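A sketch of how the extended IPG maps to throughput on GbE. The frame accounting below (preamble 8B, Ethernet header 14B, FCS 4B, and 52B of IP/TCP headers with timestamps inside the 1500B MTU) is an assumption chosen to reproduce the 941/567 Mbps figures, not taken from the slides; the helper name is illustrative.

/* Sketch: TCP goodput on GbE as a function of the Inter Packet Gap.
 * Frame overheads are assumptions (preamble 8B, header 14B, FCS 4B,
 * IP/TCP+timestamps 52B inside a 1500B MTU). */
#include <stdio.h>

static double gbe_goodput_mbps(int ipg_bytes)
{
    const int payload_bytes = 1500 - 52;                  /* TCP payload  */
    const int wire_bytes = 8 + 14 + 1500 + 4 + ipg_bytes; /* frame + gap  */
    return 1000.0 * payload_bytes / wire_bytes;
}

int main(void)
{
    printf("IPG   12B: %.0f Mbps\n", gbe_goodput_mbps(12));   /* ~941 Mbps */
    printf("IPG 1023B: %.0f Mbps\n", gbe_goodput_mbps(1023)); /* ~568 Mbps, close to the 567 Mbps above */
    return 0;
}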
16
IPG tuning on GbE
  • Bottleneck is 596 Mbps
  • 200 ms RTT
  • 10% background traffic
  • Improvement in the Max/Avg case with IPG = 1023B
  • Transmission rate < bottleneck bandwidth
  • Improvement in the Max case with IPG ≤ 512B
  • No effect in the Min case

17
Dulling Edges of Cooperative Parallel Streams
  • External scheduler in userland
  • Balances the throughputs of the streams
  • Client/server architecture across multiple hosts
  • Clients send their own cwnd to the server
  • Recorded in a scoreboard
  • The server decides and notifies the goal cwnd
  • Decelerating the faster streams
  • Modification to TCP
  • A setsockopt() option that bounds the effective cwnd when transmitting packets

18
Bounding effective cwnd
  • setsockopt(umax, umin)
  • eff_cwnd = umax if cwnd ≥ umax
  • eff_cwnd = cwnd if umin < cwnd < umax
  • eff_cwnd = umin if cwnd ≤ umin
  • When cwnd < umin, cwnd was enlarged too rapidly
  • Bursty emission of packets occurred
  • So only the upper limit of cwnd is bounded (see the sketch below)
  • umax = the goal cwnd from the external scheduler
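A sketch of the clamping rule above as it could sit where the sender computes how much it may transmit. The struct, names and call site are hypothetical; only the rule itself (umax as an upper bound, the lower bound left disabled) comes from the slides.

/* Sketch of the effective-cwnd clamp; names and the call site are
 * hypothetical, only the clamping rule itself is from the slides. */
#include <stdint.h>
#include <stdio.h>

struct cwnd_bound {
    uint32_t umin;   /* lower bound; 0 = disabled (it caused bursts)        */
    uint32_t umax;   /* upper bound = goal cwnd from the external scheduler */
};

/* Applied where the sender decides how many segments may be in flight. */
static uint32_t effective_cwnd(uint32_t cwnd, const struct cwnd_bound *b)
{
    if (b->umax && cwnd >= b->umax)
        return b->umax;              /* decelerate a stream that ran ahead  */
    if (b->umin && cwnd <= b->umin)
        return b->umin;              /* original lower clamp, now unused    */
    return cwnd;
}

int main(void)
{
    struct cwnd_bound b = { .umin = 0, .umax = 200 };
    printf("cwnd 350 -> eff %u\n", effective_cwnd(350, &b)); /* clamped to 200 */
    printf("cwnd 120 -> eff %u\n", effective_cwnd(120, &b)); /* unchanged      */
    return 0;
}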

19
External scheduler of DECP
  • Clients send their cwnd periodically
  • e.g. every second
  • The server updates a scoreboard S (a vector in N^m)
  • S_{n+1} = α·S_n + Σ_i d(cwnd_i / Q)  (sketched below)
  • α: decay parameter applied to the previous value
  • d(x): a function peaked around x, e.g. (...,0,0,1,2,3,2,1,0,0,...)
  • Q: quantization parameter
  • goal cwnd = (s + a)·Q
  • s: index of the most popular bin of S
  • a: slack parameter
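A sketch of the server-side update under the formulas above. The number of bins, α = 0.5, Q = 16 and a = 2 are illustrative parameters, not values reported in the talk.

/* Sketch of the scoreboard update S_{n+1} = alpha*S_n + sum_i d(cwnd_i/Q)
 * and goal cwnd = (s + a)*Q; all parameter values here are assumptions. */
#include <stdio.h>

#define BINS 64

static double S[BINS];                 /* scoreboard, one slot per cwnd/Q bin */
static const double alpha = 0.5;       /* decay applied to the previous board */
static const int    Q     = 16;        /* quantization: packets per bin       */
static const int    a     = 2;         /* slack added to the winning bin      */

/* d(x): a bump around the reported bin, e.g. (...,0,1,2,3,2,1,0,...) */
static double d(int dist)
{
    int v = 3 - (dist < 0 ? -dist : dist);
    return v > 0 ? v : 0;
}

/* One reporting round: decay the scoreboard, then add every client's vote. */
static void update_scoreboard(const int *cwnd, int nclients)
{
    for (int i = 0; i < BINS; i++)
        S[i] *= alpha;
    for (int c = 0; c < nclients; c++)
        for (int i = 0; i < BINS; i++)
            S[i] += d(i - cwnd[c] / Q);
}

/* goal cwnd = (s + a) * Q, where s indexes the most popular bin. */
static int goal_cwnd(void)
{
    int s = 0;
    for (int i = 1; i < BINS; i++)
        if (S[i] > S[s])
            s = i;
    return (s + a) * Q;
}

int main(void)
{
    int cwnd[] = { 180, 210, 190, 520 };   /* one stream has run far ahead */
    update_scoreboard(cwnd, 4);
    printf("goal cwnd = %d packets\n", goal_cwnd());
    return 0;
}

The fast stream's vote lands in a bin of its own, so the goal cwnd follows the cluster of slower streams and the fast stream is decelerated toward them.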

20
Results at BWC2003
21
[Map: 24,000 km (15,000 miles) path, with segments of 15,680 km (9,800 miles) and 8,320 km (5,200 miles); OC-48 x 3 + GbE x 1 and OC-192 links; Juniper T320]
22
Network configuration (During SC2003)
RTT 292 ms / 326 ms / 335 ms; aggregated BW 8.2 Gbps
[Network diagram: UoTokyo (GSR), APAN (M20), MANLAN, Abilene LOSA/KSCY (T640), SCinet (T320, BI8k, E1200, T640); links: OC-192, OC-48, GbE, 10GbE, loopback]
  • IBM x345
  • w/ Intel GbE NIC
  • Linux
23
SC2003 BWC
Dulling Edges of Cooperative Parallel Streams
IPG-tuned Parallel Streams
http://scinet.supercomp.org/2003/bwc/results/
24
IPG-tuned Parallel Streams
4 nodes (IPG = 1023B) share 2.4 Gbps
25
Dulling Edges of Cooperative Parallel Streams
10 nodes share 2.4 Gbps
26
Results of SC2003 BWC
  • SC2003 BWC
  • Bottlenecks: 2 x OC-48, OC-48 (3 x GbE), GbE
  • RTT 335 ms, 326 ms, 292 ms
  • IPG-tuned Parallel streams
  • 16 nodes x 4 streams
  • Maximum throughput 5.42 Gbps
  • Dulling Edges of Cooperative Parallel Streams
  • 32 nodes x 4 streams
  • Maximum throughput 7.01 Gbps
  • Distance x Bandwidth Product and Network
    Technology award

27
Conclusion
  • Transmission Rate Controlled TCP
  • Stabilizes and improves the performance of a single stream
  • IPG tuning
  • Static, low overhead, easy to use
  • Dulling Edges of Cooperative Parallel Streams
  • External scheduler across multiple hosts
  • Decelerates the faster streams
  • Coarse-grained, lazy adjustment of stream throughputs