1
Belle/Gfarm Grid Experiment at SC04
APAN Workshop Jan 27, 2005 Bangkok
  • Osamu Tatebe
  • Grid Technology Research Center, AIST

2
Goal and features of Grid Datafarm
  • Goal
  • Dependable data sharing among multiple
    organizations
  • High-speed data access, high-performance data
    computing
  • Grid Datafarm
  • Gfarm File System: a global, dependable virtual
    file system
  • Federates scratch disks in PCs
  • Parallel and distributed data computing
  • Associates the Computational Grid with the Data Grid
  • Features
  • Security based on the Grid Security Infrastructure
  • Scales with data size and usage scenarios
  • Location-transparent data access
  • Automatic and transparent replica selection for
    fault tolerance
  • High-performance data access and computing by
    accessing multiple dispersed storages in parallel
    (file affinity scheduling)

3
Grid Datafarm (1): Gfarm file system - a world-wide
virtual file system [CCGrid 2002]
  • Transparent access to dispersed file data via a
    global namespace
  • Files can be stored anywhere in a Grid
  • Applications can access the Gfarm file system without
    any modification, as if it were mounted at /gfarm
  • Automatic and transparent replica selection for
    fault tolerance and to avoid access concentration

[Figure: global namespace mapped onto the Gfarm File System; file replicas are created on dispersed nodes.]
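As a hedged illustration of file replica creation, a Gfarm v1 command sketch; the file name is hypothetical and the exact flags should be checked against the Gfarm manual:

    # Create replicas of a Gfarm file on two file system nodes
    # (file name hypothetical; -N flag per Gfarm v1 conventions)
    gfrep -N 2 gfarm:demo/run0001.dat
    # Show where the replicas were placed
    gfwhere gfarm:demo/run0001.dat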
4
Grid Datafarm (2): high-performance data access
and computing support [CCGrid 2002]
[Figure: user's view vs. physical execution view. Job A, submitted by user A and accessing File A, is allocated on a node that holds File A; Job B likewise runs on a node holding File B. The same networked nodes form both the Computational Grid (CPUs) and the Gfarm File System.]
  • Compute and file system nodes provide a shared
    network file system
  • Do not separate storage and CPU
  • Scalable file I/O by exploiting local I/O
5
Gfarm™ Data Grid middleware
  • Open source development
  • Gfarm™ version 1.0.4-4 released on Jan 11th,
    2005 (http://datafarm.apgrid.org/)
  • Read-write mode support, more support for
    existing binary applications
  • A shared file system in a cluster or a grid
  • Accessible from legacy applications without
    any modification
  • Standard protocol support via scp, a GridFTP server,
    a Samba server, . . .

[Figure: a metadata server running gfmd and slapd, and compute/file system nodes each running gfsd; applications access them through the Gfarm client library.]
Existing applications can access the Gfarm file
system without any modification using LD_PRELOAD.
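A minimal sketch of that LD_PRELOAD mechanism, assuming the Gfarm v1 system-call hook library; the library path and file names are assumptions:

    # Preload the Gfarm hook library so unmodified binaries
    # see the Gfarm file system under /gfarm (path assumed)
    export LD_PRELOAD=/usr/lib/libgfs_hook.so
    ls /gfarm
    grep pattern /gfarm/demo/data.txt   # hypothetical file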
6
Demonstration
  • File manipulation
  • cd, ls, cp, mv, cat, . . .
  • grep
  • Gfarm commands
  • File replica creation, node & process information
  • Remote (parallel) program execution (sketched
    below)
  • gfrun prog args . . .
  • gfrun -N procs prog args . . .
  • gfrun -G filename prog args . . .
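A hedged sketch of these demonstrations; the file names are hypothetical, and the gfrun forms follow the list above:

    # File manipulation through the mounted view
    ls /gfarm
    cp /gfarm/demo/input.dat /tmp/
    grep pattern /gfarm/demo/input.dat
    # Node information for the file system nodes
    gfhost
    # Remote (parallel) program execution
    gfrun prog args                          # run on some node
    gfrun -N 4 prog args                     # 4 parallel processes
    gfrun -G gfarm:demo/input.dat prog args  # file-affinity scheduling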

7
Belle/Gfarm Grid experiment at SC2004
  • 1. Online KEKB/Belle distributed data analysis
  • 2. KEKB/Belle large-scale data analysis
    (terabyte-scale US-Japan file replication)

8
1. Online KEKB/Belle distributed data analysis (1)
  • Online distributed and parallel data analysis of
    raw data using the AIST and SDSC clusters
  • Realtime histogram and event data display at
    the SC2004 conference hall

[Figure: raw data from KEK streamed at 10 MB/sec into the Gfarm file system spanning the SC2004 conference hall, SDSC, and AIST (192 nodes / 53.75 TB, 128 nodes / 3.75 TB, and 64 nodes / 50 TB); realtime histogram and event data display at SC2004; on-demand data replication and distributed parallel data analysis.]

9
1. Online KEKB/Belle distributed data analysis (2)
  • Construct a shared network file system between
    Japan and the US
  • Store KEKB/Belle raw data in the Gfarm file
    system
  • Physically, each file is divided into N fragments
    stored on N different nodes
  • Every compute node can access it as if it were
    mounted at /gfarm

[Figure: raw data from KEK (10 MB/sec) stored in the Gfarm File System across SC2004, SDSC, and AIST; realtime histogram and event data display at SC2004.]
10
1. Online KEKB/Belle distributed data analysis (3)
  • Basf is installed at /gfarm//belle
  • Install once, run everywhere (a hypothetical
    submission sketch follows)
  • The raw data is analyzed at AIST or SDSC
    just after it is stored
  • Analyzed data can be viewed at SC2004 in realtime
  • A histogram snapshot is generated every 5 minutes
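A hypothetical sketch of submitting such an analysis with file-affinity scheduling; the raw-data path and basf arguments are assumptions, not taken from the slides:

    # Run basf on the node holding the raw-data fragment
    # (path and arguments hypothetical)
    gfrun -G gfarm:belle/raw/run0001.dat basf analysis.conf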

[Figure: same configuration as above; the Computational Grid at SC2004, SDSC, and AIST analyzes the raw data (10 MB/sec from KEK) stored in the Gfarm File System.]
11
2. KEKB/Belle large-scale data analysis in a Grid
  • Gfarm file system using the SC conference hall and
    the AIST F cluster
  • Assume data is stored at the SC conference hall
  • Terabyte-scale mock data
  • Submit data analysis jobs at the AIST F cluster
  • Required data is automatically transferred from
    SC to AIST on demand
  • Users just see a shared file system
  • Network transfer rate is measured
  • Conf 1: 8 parallel processes (2 GB × 8 data)
  • Conf 2: 16 parallel processes (8 GB × 16 data)

12
2. Network & machine configuration
[Figure: the AIST F cluster connects via JGN2 Tsukuba WAN and the JGN2 Japan-US link (Tokyo to Chicago) to Pittsburgh and the SC2004 StorCloud; each segment runs at 10 Gbps.]
13
SC → AIST (Iperf × 8)
7,925,607,155 bps (Wed Nov 10 17:13:22 JST 2004)
(5-sec average bandwidth; 8 TCP flows at 991 Mbps each)
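A hedged reconstruction of the measurement command: iperf with 8 parallel TCP flows. The host name, socket buffer size, and duration are assumptions:

    # 8 parallel TCP streams with 5-second interval reports
    # (host, window size, and duration assumed)
    iperf -c node00.aist.example -P 8 -w 8M -t 60 -i 5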
14
Iperf measurement
  • Standard TCP (Linux 2.4)
  • Tuned only the socket buffer size and txqueuelen
    (sketched below)
  • No kernel patch, no TCP modification
  • No traffic shaping
  • No bottleneck, no problem
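A minimal sketch of that tuning; the exact values are assumptions, not taken from the slides:

    # Enlarge TCP socket buffers (Linux 2.4 sysctls; values assumed)
    sysctl -w net.core.rmem_max=8388608
    sysctl -w net.core.wmem_max=8388608
    sysctl -w net.ipv4.tcp_rmem="4096 87380 8388608"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 8388608"
    # Lengthen the interface transmit queue (device and value assumed)
    ifconfig eth0 txqueuelen 10000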

15
Conf 1: 8 processes (2 GB × 8)
2,084,209,307 bps (Fri Nov 12 03:41:54 JST 2004)
(5-sec average; 261 Mbps per TCP flow, limited by the
disk performance of the F cluster)
16
Conf 2: 16 processes (8 GB × 16)
738,920,649 bps (Fri Nov 12 05:30:35 JST 2004)
(5-sec average; only 46 Mbps per TCP flow. Why so low?)
17
Conf 2: network traffic on the JGN2 international link
Heavy traffic when the application started
Heavy packet loss → ssthresh decreases
18
Summary
  • Belle/Gfarm Grid experiment at SC2004
  • 1. Online KEKB/Belle distributed data analysis
  • 2. KEKB/Belle large-scale data analysis
  • We succeeded in distributed parallel data analysis
    of KEKB/Belle data with realtime display at the SC
    conference hall

19
Development Status and Future Plan
  • Gfarm Grid file system
  • Global virtual file system
  • A dependable network shared file system in a
    cluster or a grid
  • High-performance data computing support
  • Associates the Computational Grid with the Data Grid
  • Gfarm v1: Data Grid middleware
  • Version 1.0.4-4 released on Jan 11, 2005
    (http://datafarm.apgrid.org/)
  • Existing programs can access the Gfarm file system
    as if it were mounted at /gfarm
  • Gfarm v2: towards a true global virtual file
    system
  • POSIX compliant - supports read-write mode,
    advisory file locking, . . .
  • Improved performance and robustness, enhanced
    security
  • Can substitute for NFS, AFS, . . .
  • Application areas
  • Scientific applications (high energy physics,
    astronomical data analysis, bioinformatics,
    computational chemistry, computational physics, .
    . .)
  • Business applications (dependable data computing
    in eGovernment and eCommerce, . . .)
  • Other applications that need dependable file
    sharing among several organizations
  • Standardization effort within the GGF Grid File
    System WG (GFS-WG)

https://datafarm.apgrid.org/