1
Transmission Rate Controlled TCP in Data Reservoir
- Software control approach -
Mary Inaba
  • University of Tokyo
  • Fujitsu Laboratories
  • Fujitsu Computer Technologies

2
Data-intensive scientific computation through global networks

[Diagram: data sources (X-ray astronomy satellite ASUKA, Nobeyama Radio Observatory VLBI, Belle nuclear experiments, Digital Sky Survey, SUBARU Telescope, Grape6) each feed a Data Reservoir; the Data Reservoirs share files over a very high-speed network; local accesses serve data analysis at the University of Tokyo]
3
Research Projects with Data Reservoir
4
Dream Computing System for real Scientists
  • Fast CPU, huge memory and disks, good graphics
  • Cluster technology, DSM technology, graphics processors
  • Grid technology
  • Very fast remote file accesses
  • Global file systems, data-parallel file systems, replication facilities
  • Transparency to local computation
  • No complex middleware; no or only small modifications to existing software
  • Real scientists are not computer scientists
  • Computer scientists are not a workforce for real scientists

5
Objectives of Data Reservoir
  • Sharing scientific data between distant research institutes
  • Physics, astronomy, earth science, simulation data
  • Very high-speed single-file transfer on a Long Fat-pipe Network (LFN)
  • > 10 Gbps, > 20,000 km (12,500 miles), > 400 ms RTT (see the sketch below)
  • High utilization of available bandwidth
  • Transferred file data rate > 90% of available bandwidth
  • Including header overheads and initial negotiation overheads
  • OS and file system transparency
  • Storage-level data sharing (high-speed iSCSI protocol on stock TCP)
  • Fast single-file transfer
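
These targets imply an enormous bandwidth-delay product, which is what makes an LFN hard for stock TCP. A minimal C sketch of the arithmetic (the rate, RTT, and MTU are the figures above; the code itself is illustrative, not part of the system):

```c
#include <stdio.h>

int main(void) {
    double rate_bps = 10e9;   /* > 10 Gbps target */
    double rtt_s    = 0.4;    /* > 400 ms RTT */
    double mtu_B    = 1500.0; /* typical Ethernet MTU */

    /* bandwidth-delay product: data that must be in flight
       to keep the pipe full */
    double bdp_B = rate_bps * rtt_s / 8.0;
    printf("BDP = %.0f MB (%.0f MTU-sized packets)\n",
           bdp_B / 1e6, bdp_B / mtu_B);
    /* => ~500 MB and ~333,000 packets in flight: far beyond
       default TCP windows, so a single loss is very costly. */
    return 0;
}
```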

6
Basic Architecture
[Diagram: two Data Reservoirs at distant sites connected by a high-latency, very-high-bandwidth network; disk-block-level parallel, multi-stream transfer between cache disks; each site serves local file accesses; data is distributed and shared (DSM-like architecture)]
7
Data Reservoir Features
  • Data sharing at a low protocol level
  • Use of the iSCSI protocol
  • Efficient data transfer (optimization of disk head movements)
  • File system transparency
  • Single file image
  • Multi-level striping for performance scalability (sketched below)
  • Local file accesses through the LAN
  • Global disk transfer through the WAN

Unified by the iSCSI protocol
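
The slides do not spell out the striping layout; the following is a hypothetical C sketch of how a two-level stripe might map a logical block to a file server and a disk server. The server counts, names, and round-robin policy are assumptions for illustration, not the authors' implementation:

```c
#include <stdio.h>

/* Hypothetical two-level striping: a logical block is first
   striped across file servers, then each file server stripes
   across its disk servers (reached via iSCSI). */
#define N_FILE_SERVERS 4
#define N_DISK_SERVERS 4   /* per file server */

typedef struct { int file_server, disk_server; long offset; } Location;

Location locate(long block) {
    Location loc;
    loc.file_server = block % N_FILE_SERVERS;   /* 1st-level stripe */
    long b1 = block / N_FILE_SERVERS;
    loc.disk_server = b1 % N_DISK_SERVERS;      /* 2nd-level stripe */
    loc.offset      = b1 / N_DISK_SERVERS;      /* block offset on disk */
    return loc;
}

int main(void) {
    for (long b = 0; b < 6; b++) {
        Location l = locate(b);
        printf("block %ld -> FS%d / DS%d @ %ld\n",
               b, l.file_server, l.disk_server, l.offset);
    }
    return 0;
}
```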
8
File accesses on Data Reservoir
[Diagram: scientific detectors and user programs access four file servers (1st-level striping); the file servers reach four disk servers via iSCSI through IP switches (2nd-level striping); servers are IBM x345 (2 x 2.6 GHz)]
9
Global Data Transfer
10
BW behavior
[Plots: bandwidth (Mbps) vs. time (sec) for a Data Reservoir transfer and for a transfer through a file system]
11
Problems of BWC2002 experiments
  • Low TCP bandwidth due to packet losses
  • TCP congestion window size control
  • Very slow recovery from the fast-recovery phase (> 20 min)
  • Imbalance among parallel iSCSI streams
  • Packet scheduling by switches and routers
  • Users and other network users care only about the total behavior of the parallel TCP streams

12
Fast Ethernet vs. GbE
  • iperf, 30-second runs
  • Min/Avg bandwidth: Fast Ethernet > GbE; the slower link naturally paces its packets and avoids the burst losses seen on GbE

[Plots: iperf bandwidth over time, Fast Ethernet (FE) vs. GbE]
13
Packet Transmission Rate
  • Bursty behavior
  • Transmission packed into 20 ms of a 200 ms RTT (see the sketch below)
  • Idle for the remaining 180 ms

[Plot: packet transmission rate over time; packet loss occurred at the bursts]
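
To see why such bursts lose packets, compare the instantaneous rate inside the burst with the average rate. A minimal C sketch; the 20 ms and 200 ms figures are from this slide, while the 500 Mbps average is taken from the next slide:

```c
#include <stdio.h>

int main(void) {
    double rtt_s   = 0.2;    /* 200 ms RTT */
    double burst_s = 0.02;   /* transmission packed into 20 ms */
    double avg_bps = 500e6;  /* 500 Mbps average (next slide) */

    double window_bytes = avg_bps * rtt_s / 8.0;        /* one RTT of data */
    double burst_bps    = window_bytes * 8.0 / burst_s; /* rate inside burst */

    printf("window %.1f MB, burst rate %.1f Gbps (%.0fx average)\n",
           window_bytes / 1e6, burst_bps / 1e9, burst_bps / avg_bps);
    /* => 12.5 MB leaving in 20 ms is an instantaneous 5 Gbps,
       10x the average; buffers sized for the average rate
       overflow, which is where the packet losses occur. */
    return 0;
}
```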
14
Packet Spacing
  • Ideal story: transmit one packet every RTT/cwnd
  • 24 µs interval for 500 Mbps (MTU 1500 B); the arithmetic is sketched below
  • High load for a software-only implementation
  • Low overhead in practice, because pacing is limited to the slow-start phase

[Diagram: one RTT with packets spaced at RTT/cwnd intervals]
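
The 24 µs figure follows directly from the packet size and the target rate, since rate = cwnd x MTU / RTT implies RTT/cwnd = MTU/rate. A minimal C sketch using only the numbers on this slide:

```c
#include <stdio.h>

int main(void) {
    double mtu_bits = 1500 * 8.0;  /* MTU 1500 B, from the slide */
    double rate_bps = 500e6;       /* 500 Mbps, from the slide */

    double interval = mtu_bits / rate_bps;  /* = RTT/cwnd */
    printf("inter-packet interval: %.0f us\n", interval * 1e6);
    /* => 24 us: far too fine for ordinary OS timers, hence the
       high software load; tolerable only because pacing is
       applied just during the slow-start phase. */
    return 0;
}
```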
15
Example Case of 8 IPG
  • Success on fast retransmit
  • Smooth transition to congestion avoidance
  • Congestion avoidance takes 28 minutes to recover to 550 Mbps (see the sketch below)
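
The long recovery is a direct consequence of standard congestion avoidance, which grows cwnd by one segment per RTT. A back-of-the-envelope C sketch, assuming a 200 ms RTT and 1500 B segments; only the 550 Mbps target and the 28-minute observation come from the slide:

```c
#include <stdio.h>

int main(void) {
    double rate_bps = 550e6;      /* 550 Mbps target, from the slide */
    double rtt_s    = 0.2;        /* assumed 200 ms RTT (BWC2002) */
    double seg_bits = 1500 * 8.0; /* assumed 1500 B segments */

    /* window needed to sustain the target rate */
    double cwnd = rate_bps * rtt_s / seg_bits;
    /* after a loss, cwnd is halved; congestion avoidance then
       adds one segment per RTT */
    double recovery_s = (cwnd / 2.0) * rtt_s;
    printf("cwnd = %.0f segments, recovery = %.0f s (~%.0f min)\n",
           cwnd, recovery_s, recovery_s / 60.0);
    /* => roughly 9200 segments and 15 minutes just to climb back
       from half window: the same order as the observed 28 min. */
    return 0;
}
```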

16
Best Case of 1023B IPG
  • Behaves like the Fast Ethernet case
  • Proper transmission rate
  • Spurious retransmits due to packet reordering

17
Performance Divergence on LFN
  • Parallel streams
  • Differences among streams grow over time
  • The slowest stream determines total performance

18
Imbalance among parallel TCP streams
  • Imbalance among parallel iSCSI streams
  • Packet scheduling by switches and routers
  • Meaningless unfairness among the parallel streams
  • Users and other network users care only about the total behavior of the parallel TCP streams
  • Our approach (sketched below):
  • Keep Σ cwnd_i constant, for fair TCP network usage toward other users
  • Balance the individual cwnd_i by communicating between the parallel TCP streams

[Plots: per-stream bandwidth over time, unbalanced (left) vs. balanced (right)]
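
The slides describe the balancing idea only at this level; below is a hypothetical C sketch of one rebalancing step that moves each stream's cwnd toward the mean while leaving Σ cwnd_i unchanged. The 50% step size and the shared-array view of the streams' state are assumptions for illustration, not the authors' implementation:

```c
#include <stdio.h>

#define N_STREAMS 8

/* Hypothetical rebalancing step: move each stream's congestion
   window toward the mean. The sum of the windows, and hence the
   aggregate pressure on the network, is left unchanged, so the
   parallel bundle stays fair to other users. */
void balance(double cwnd[], int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) sum += cwnd[i];
    double mean = sum / n;
    for (int i = 0; i < n; i++)
        cwnd[i] += 0.5 * (mean - cwnd[i]);  /* 50% step toward the mean */
}

int main(void) {
    /* illustrative per-stream windows (in segments) after divergence */
    double cwnd[N_STREAMS] = {9000, 300, 4000, 7500, 100, 5000, 2000, 8000};
    balance(cwnd, N_STREAMS);
    double sum = 0.0;
    for (int i = 0; i < N_STREAMS; i++) {
        printf("stream %d: cwnd = %.0f\n", i, cwnd[i]);
        sum += cwnd[i];
    }
    printf("total: %.0f (unchanged by balancing)\n", sum);
    return 0;
}
```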
19
BWC2003 US-Japan experiments
  • 24,000 km (15,000 miles) distance (400 ms RTT)
  • Phoenix → Tokyo → Portland → Tokyo
  • OC-48 x 3, OC-192, OC-192
  • GbE x 1
  • Transfer of a 1 TB file
  • 32 servers, 128 iSCSI disks

[Network map: Data Reservoirs at Tokyo and Phoenix; route via Seattle, Chicago, Portland, L.A., and N.Y.; links include 10G Ethernet x 2, 10G Ethernet, GbE x 4, OC-48 x 2, OC-48, OC-192 x 2, and GbE, carried over Abilene, IEEAF/WIDE, and NTT Com / APAN / SUPER-SINET]
20
[Route map: 24,000 km (15,000 miles) end to end; a 15,680 km (9,800 miles) segment over OC-48 x 3 + GbE x 4 and an 8,320 km (5,200 miles) segment over OC-192, joined by a Juniper T320]
21
SC2002
  • BWC2002
  • 560 Mbps (200 ms RTT)
  • 95% utilization of available bandwidth
  • U. of Tokyo → SCinet (Maryland, USA)
  • ⇒ Data Reservoir can saturate a 10 Gbps network once one becomes available for the US-Japan connection

22
Results
  • BWC2002
  • Tokyo → Baltimore, 10,800 km (6,700 miles)
  • Peak bandwidth (on network): 600 Mbps
  • Average file transfer bandwidth: 560 Mbps
  • Bandwidth-distance product: 6,048 terabit-meters/second
  • BWC2003 results (pre-test)
  • Phoenix → Tokyo → Portland → Tokyo, 24,000 km (15,000 miles)
  • Peak bandwidth (on network): > 8 Gbps
  • Average file transfer bandwidth: > 7 Gbps
  • Bandwidth-distance product: > 168 petabit-meters/second
  • More than 25 times improvement over BWC2002 (in bandwidth-distance product; verified in the sketch below)
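
The bandwidth-distance products are just average rate times distance; computing them in bit-meters per second reproduces the figures above. A minimal check in C, using only the numbers on this slide:

```c
#include <stdio.h>

int main(void) {
    /* BWC2002: 560 Mbps average over 10,800 km */
    double p2002 = 560e6 * 10800e3;  /* bit-meters per second */
    /* BWC2003 pre-test: 7 Gbps average over 24,000 km */
    double p2003 = 7e9 * 24000e3;

    printf("BWC2002: %.0f terabit-m/s\n", p2002 / 1e12); /* 6048 */
    printf("BWC2003: %.0f petabit-m/s\n", p2003 / 1e15); /* 168 */
    printf("improvement: %.1fx\n", p2003 / p2002);       /* ~27.8 */
    return 0;
}
```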

23
Bad News
  • Network cut on 11/8
  • The US-Japan north-route connection has been completely out of order
  • 23 weeks are necessary to repair the undersea fibers
  • Planned BW: 11.2 Gbps (OC-48 x 3 + GbE x 4)
  • Actual maximum BW: ≈ 8.2 Gbps (OC-48 x 3 + GbE x 1)

24
How your science benefits from high-performance, high-bandwidth networking
  • Easy and transparent access to remote scientific data
  • Without special programming (normal NFS-style accesses)
  • Purely software approach with IA servers
  • Utilization of the high-BW network for the scientist's data
  • 17 minutes for a 1 TB file transfer from the opposite side of the earth (see the sketch after this list)
  • High utilization factor (> 90%)
  • Good for both scientists and network agencies
  • Scientists can concentrate on their research topics
  • Good for both scientists and computer scientists
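
The 17-minute figure is consistent with the measured rates. A minimal check in C; the 1 TB size is from this slide, and the ~8 Gbps effective rate is an assumption taken from the results slide:

```c
#include <stdio.h>

int main(void) {
    double file_bits = 1e12 * 8.0; /* 1 TB file */
    double rate_bps  = 8e9;        /* ~8 Gbps effective rate (assumed) */

    double t = file_bits / rate_bps;
    printf("transfer time: %.0f s (~%.1f min)\n", t, t / 60.0);
    /* => ~1000 s, about 17 minutes, matching the slide. */
    return 0;
}
```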

25
Summary
  • The most distant data transfer at BWC2003 (24,000 km)
  • Software techniques for improving efficiency and stability:
  • Transfer-rate control on TCP
  • cwnd balancing across parallel TCP streams
  • Based on the stock TCP algorithm
  • Possibly the highest bandwidth-distance product for file transfer between two points
  • Still high utilization of available bandwidth

26
The BWC2003 experiment was supported by NTT / VERIO