
1
PetaByte Storage Facility at RHIC
  • Razvan Popescu - Brookhaven National Laboratory

2
Who are we?
  • Relativistic Heavy-Ion Collider @ BNL
  • Four experiments: PHENIX, STAR, PHOBOS, BRAHMS.
  • 1.5PB per year.
  • 500MB/sec.
  • >20,000 SpecInt95.
  • Startup in May 2000 at 50% capacity and ramp up
    to nominal parameters in 1 year.

3
Overview
  • Data Types
  • Raw: very large volume (1.2PB/yr.), average
    bandwidth (50MB/s).
  • DST: average volume (500TB), large bandwidth
    (200MB/s).
  • mDST: low volume (<100TB), large bandwidth
    (400MB/s). (Rates are cross-checked below.)
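
As a rough cross-check of the volumes and bandwidths above (an
illustrative calculation, not from the slides), averaging each yearly
volume over a full year of running gives the sustained rate each
stream implies:

    # Rough cross-check: average the quoted yearly volumes over one
    # year of continuous running (illustrative only; assumes 365-day
    # operation and decimal units, 1TB = 1e6 MB).
    SECONDS_PER_YEAR = 365 * 24 * 3600

    volumes_tb_per_year = {
        "raw": 1200,   # 1.2PB/yr
        "DST": 500,    # 500TB/yr
        "mDST": 100,   # <100TB/yr (upper bound)
    }

    for name, tb in volumes_tb_per_year.items():
        mb_per_s = tb * 1e6 / SECONDS_PER_YEAR
        print(f"{name:4s}: {tb:4d}TB/yr -> ~{mb_per_s:.0f}MB/s sustained")

    # raw : 1200TB/yr -> ~38MB/s sustained
    # DST :  500TB/yr -> ~16MB/s sustained
    # mDST:  100TB/yr -> ~3MB/s sustained

The quoted 50/200/400MB/s figures are peak transfer bandwidths, well
above these sustained averages.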

4
Data Flow (generic)
[Diagram: generic data flow among RHIC, the Reconstruction Farm
(Linux), the Archive, the File Servers (DST/mDST), and the Analysis
Farm (Linux); raw data at 35MB/s and 50MB/s, DST at 10MB/s and
200MB/s, mDST at 400MB/s and 10MB/s.]
5
The Data Store
  • HPSS (ver. 4.1.1 patch level 2)
  • Deployed in 1998.
  • After overcoming some growing pains, we consider
    the present implementation successful.
  • One major/total reconfiguration to adapt to new
    hardware (and improved system understanding).
  • Flexible enough for our needs. One shortcoming:
    the lack of a preemptable priority schema.
  • Very high performance.

6
The HPSS Archive
  • Constraints: large capacity and high bandwidth.
  • Two types of tape technology: SD-3 (best $/GB),
    9840 (best $/MB/s).
  • Two-tape-layer hierarchies. Easy management of
    the migration.
  • Reliable and fast disk storage:
  • FC-attached RAID disk.
  • Platform compatible with HPSS:
  • IBM, SUN, SGI.

7
Present Resources
  • Tape Storage
  • (1) STK Powderhorn silo (6000 cart.)
  • (11) SD-3 (Redwood) drives.
  • (10) 9840 (Eagle) drives.
  • Disk Storage
  • 8TB of RAID disk.
  • 1TB for HPSS cache.
  • 7TB Unix workspace.
  • Servers
  • (5) RS/6000 H50/70 for HPSS.
  • (6) E450/E4000 for file serving and data mining.

8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
HPSS Structure
  • (1) Core Server
  • RS/6000 Model H50
  • 4x CPU
  • 2GB RAM
  • Fast Ethernet (control)
  • OS-mirrored storage for metadata (6 pv.)

12
HPSS Structure
  • (3) Movers
  • RS/6000 Model H70
  • 4x CPU
  • 1GB RAM
  • Fast Ethernet (control)
  • Gigabit Ethernet (data) (1500/9000 MTU)
  • 2x FC attached RAID - 300GB - disk cache
  • (3-4) SD-3 Redwood tape transports
  • (3-4) 9840 Eagle tape transports

13
HPSS Structure
  • Guarantee availability of resources for a
    specific user group → separate resources →
    separate PVRs and movers.
  • One mover per user group → total exposure to
    single-machine failure.
  • Guarantee availability of resources for the Data
    Acquisition stream → separate hierarchies.
  • Result: 2 PVRs + 2 COS + 1 mover per group
    (sketched below).
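
A minimal sketch of that allocation as a plain data structure (the
identifiers are hypothetical and for illustration only; this is not
actual HPSS configuration syntax):

    # Per-group allocation described above: 2 PVRs (plausibly one per
    # tape type), 2 classes of service (plausibly one for the DAQ
    # stream), and 1 dedicated mover per experiment.
    EXPERIMENTS = ["PHENIX", "STAR", "PHOBOS", "BRAHMS"]

    def allocate(group):
        """Nominal HPSS resources dedicated to one user group."""
        return {
            "pvrs": [f"{group}-SD3", f"{group}-9840"],
            "classes_of_service": [f"{group}-DAQ", f"{group}-general"],
            "movers": [f"mover-{group.lower()}"],  # single point of failure
        }

    for exp in EXPERIMENTS:
        print(exp, allocate(exp))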

14
HPSS Structure
15
HPSS Topology
16
HPSS Performance
  • 80 MB/sec for the disk subsystem.
  • 1 CPU per 40MB/sec of TCP/IP Gbit traffic @
    1500 MTU, or 90MB/sec @ 9000 MTU (see the sizing
    sketch below).
  • >9MB/sec per SD-3 transport.
  • 10MB/sec per 9840 transport.
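
The CPU-per-bandwidth rule above translates into a simple sizing
estimate; a small sketch using the quoted per-CPU rates (the 80MB/s
target below is just an example, matching the disk-subsystem rate):

    import math

    # Quoted per-CPU TCP/IP throughput on Gigabit Ethernet (MB/s per CPU).
    RATE_PER_CPU = {1500: 40, 9000: 90}

    def cpus_needed(target_mb_s, mtu):
        """CPUs needed to drive a given data rate at the chosen MTU."""
        return math.ceil(target_mb_s / RATE_PER_CPU[mtu])

    print(cpus_needed(80, 1500))  # 2 CPUs with standard frames
    print(cpus_needed(80, 9000))  # 1 CPU with jumbo frames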

17
I/O Intensive Systems
  • Mining and Analysis systems.
  • High I/O, moderate CPU usage.
  • To avoid large network traffic, merge file servers
    with HPSS movers:
  • Major problem with HPSS support on non-AIX
    platforms.
  • Several (Sun) SMP machines or a large (SGI)
    modular system.

18
Problems
  • Short lifecycle of the SD-3 heads:
  • 500 hours, i.e. less than 2 months @ average
    usage (6 of 10 drives failed in 10 months).
  • Built a monitoring tool to try to predict
    transport failure, based on soft-error
    frequency (a sketch of the idea follows below).
  • Low-throughput interface (F/W) for the SD-3;
    high slot consumption.
  • SD-3 production discontinued?!
  • 9840 ???
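
The slides do not show the monitoring tool itself; the sketch below
only illustrates the idea, assuming a hypothetical per-drive
soft-error log format and an arbitrary alert threshold:

    # Hypothetical sketch of soft-error-rate monitoring for tape
    # transports. The log format, drive names, and threshold are
    # assumptions, not the actual BNL tool.
    from collections import defaultdict
    from datetime import datetime, timedelta

    WINDOW = timedelta(days=7)
    SOFT_ERRORS_PER_DAY_LIMIT = 5.0  # arbitrary alert threshold

    def parse_log(lines):
        """Yield (timestamp, drive) for each soft-error record.
        Assumed format: '1999-11-03T14:22:10 SD3-07 soft_error'."""
        for line in lines:
            ts, drive, event = line.split()
            if event == "soft_error":
                yield datetime.fromisoformat(ts), drive

    def flag_suspect_drives(lines, now):
        """Return drives whose recent soft-error rate exceeds the limit."""
        counts = defaultdict(int)
        for ts, drive in parse_log(lines):
            if now - ts <= WINDOW:
                counts[drive] += 1
        return [d for d, n in counts.items()
                if n / WINDOW.days > SOFT_ERRORS_PER_DAY_LIMIT]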

19
Issues
  • Tested the two-tape-layer hierarchies:
  • Cartridge-based migration.
  • Manually scheduled reclaim.
  • Work with large files: preferably 1GB; >200MB is
    tolerable (a small bundling sketch follows below).
  • Is this true with 9840 tape transports?
  • Don't consider NFS. Wait for DFS/GPFS?
  • We use pftp exclusively.
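
One common way to honor the large-file preference (a sketch under
assumed paths and sizes, not the procedure used at RHIC) is to pack
small files into tar bundles of roughly the preferred size before
they are sent to the archive with pftp:

    # Illustrative sketch: pack small files into ~1GB tar bundles so
    # the archive only ever sees large files. The target size and
    # naming scheme are assumptions.
    import os
    import tarfile

    TARGET_BUNDLE_BYTES = 1024**3  # aim for roughly 1GB per bundle

    def bundle(files, out_prefix):
        """Pack `files` into sequentially numbered ~1GB tar archives."""
        index, current_size, tar = 0, 0, None
        for path in files:
            size = os.path.getsize(path)
            if tar is None or current_size + size > TARGET_BUNDLE_BYTES:
                if tar is not None:
                    tar.close()
                index += 1
                tar = tarfile.open(f"{out_prefix}-{index:04d}.tar", "w")
                current_size = 0
            tar.add(path)
            current_size += size
        if tar is not None:
            tar.close()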

20
Issues
  • Guarantee availability of resources for specific
    user groups:
  • Separate PVRs and movers.
  • Total exposure to single-machine failure!
  • Reliability:
  • Distribute resources across movers → share movers
    (acceptable?).
  • Inter-mover traffic:
  • 1 CPU per 40MB/sec of TCP/IP per adapter.
    Expensive!!!

21
Inter-Mover Traffic - Solutions
  • Affinity.
  • Limited applicability.
  • Diskless hierarchies (not for DFS/GPFS).
  • Not for SD-3. Not enough tests on 9840.
  • High-performance networking: SP switch. (This is
    your friend.)
  • IBM only.
  • Lighter protocol: HIPPI.
  • Expensive hardware.
  • Multiply-attached storage (SAN). Most promising!
    See STK's talk. Requires HPSS modifications.

22
Summary
  • HPSS works for us.
  • Buy an SP2 and the SP switch.
  • Simplified admin. Fast interconnect. Ready for
    GPFS.
  • Keep an eye on STK's SAN/RAIT.
  • Avoid SD-3. (Not a risk anymore.)
  • Avoid small-file access, at least for the moment.

23
Thank you!
  • Razvan Popescu (popescu@bnl.gov)