A Study of Caching in Parallel File Systems (Presentation Transcript)
1
A Study of Caching in Parallel File Systems
  • Dissertation Proposal
  • Brad Settlemyer

2
Trends in Scientific Research
  • Scientific inquiry is now information intensive
  • Astronomy, Biology, Chemistry, Climatology,
    Particle Physics all utilize massive data sets
  • Data sets under study are often very large
  • Genomics Databases (50 TB and growing)
  • Large Hadron Collider (15 PB/yr)
  • Time spent manipulating data often exceeds time
    spent performing calculations
  • Checkpointing I/O demands are particularly
    problematic

3
Typical Scientific Workflow
  • Acquire data
  • Observational Data (sensor-based, telescope,
    etc.)
  • Information Data (gene sequences, protein
    folding)
  • Stage/Reorganize data to fast file system
  • Archive retrieval
  • Filtering extraneous data
  • Process data (e.g. Feature Extraction)
  • Output results data
  • Reorganize data for visualization
  • Visualize Data

4
Trends in Supercomputing
  • CPU performance is increasing faster than disk
    performance
  • Multicore CPUs and increased intra-node
    parallelism
  • Main memories are large
  • 4 GB costs < $100
  • Networks are fast and wide
  • >10 Gb/s networks and buses available
  • Number of application processes is increasing rapidly
  • RoadRunner: >128K concurrent processes achieving
    >1 petaflop
  • BlueGene/P: >250K concurrent processes achieving
    >1 petaflop

5
I/O Bottleneck
  • Application processes are able to construct I/O
    requests faster than the storage system can
    provide service
  • Applications are unable to fully utilize the
    massive amounts of available computing power

6
Parallel File Systems
  • Addresses the I/O bottleneck by providing
    simultaneous access to a large number of disks

7
PFS Data Distribution
  (Figure: Logical File Data divided into Strips A-F and striped across the
   Physical Data Locations on PFS Servers 0-3; e.g., one server holds
   Strips A and E, another holds Strips B and F)
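As a rough illustration of this kind of striping, the following C sketch maps a logical file offset to a server and a server-local offset, assuming round-robin placement and a fixed strip size (both parameters are hypothetical, not taken from the slides):

  #include <stdint.h>
  #include <stdio.h>

  /* Hypothetical striping parameters (not from the slides). */
  #define STRIP_SIZE  (64 * 1024)  /* bytes per strip              */
  #define NUM_SERVERS 4            /* PFS Server 0 .. PFS Server 3 */

  /* Map a logical file offset to (server, server-local offset). */
  static void map_offset(uint64_t logical, int *server, uint64_t *local)
  {
      uint64_t strip = logical / STRIP_SIZE;        /* strip index: A=0, B=1, ... */
      *server = (int)(strip % NUM_SERVERS);         /* round-robin server choice  */
      *local  = (strip / NUM_SERVERS) * STRIP_SIZE  /* earlier strips on server   */
              + (logical % STRIP_SIZE);             /* offset within this strip   */
  }

  int main(void)
  {
      int server; uint64_t local;
      map_offset(5 * STRIP_SIZE + 100, &server, &local);  /* inside Strip F */
      printf("server %d, local offset %llu\n", server, (unsigned long long)local);
      return 0;
  }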
8
Parallel File Systems (cont.)
  • Aggregate file system bandwidth requirements
    largely met
  • Large, aligned data requests can be rapidly
    transferred
  • Scalable to hundreds of client processes and
    improving
  • Areas of inadequate performance
  • Metadata Operations (Create, Remove, Stat)
  • Small Files
  • Unaligned Accesses
  • Structured I/O

9
Scientific Workflow Performance
  • Acquire or Simulate Data
  • Primarily limited by physical bandwidth
    characteristics
  • Move or Reorganize Data for Processing
  • Often metadata intensive
  • Data Analysis or Reconstruction
  • Small, unaligned accesses perform poorly
  • Move/Reorganize Data for visualization
  • May perform poorly (small, unaligned accesses)
  • Visualize Data
  • Benefits from reorganization

10
Alleviating the I/O bottleneck
  • Avoid data reorganization costs
  • Additional work that does not modify results
  • Limits use of high level libraries
  • Increase contiguity/granularity
  • Interconnects and parallel file systems are well
    tuned for large contiguous file accesses
  • Limits use of low latency messaging available
    between cores
  • Improve locality
  • Avoid device accesses entirely
  • Difficult to achieve in user applications

11
Benefits of Middleware Caching
  • Improves locality
  • PVFS Acache and Ncache
  • Improve write-read and read-read accesses
  • Small accesses
  • Can bundle small accesses into a compound
    operation (see the sketch below)
  • Alignment
  • Can compress accesses by performing aligned
    requests
  • Transparent to application programmer
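As a sketch of the bundling idea above, the fragment below merges a sorted list of small (offset, length) requests into fewer, larger extents that could then be issued as a single compound (list-style) operation; the struct and function names are hypothetical:

  #include <stddef.h>
  #include <stdint.h>

  /* One small application request. */
  struct io_req { uint64_t offset; uint64_t length; };

  /* Merge adjacent or overlapping extents in a request list that is
   * already sorted by offset, so the file system sees one compound
   * request with fewer, larger extents.  Returns the new extent count. */
  static size_t coalesce(struct io_req *reqs, size_t n)
  {
      if (n == 0)
          return 0;
      size_t out = 0;
      for (size_t i = 1; i < n; i++) {
          uint64_t end = reqs[out].offset + reqs[out].length;
          if (reqs[i].offset <= end) {
              /* Touching or overlapping: grow the current extent. */
              uint64_t new_end = reqs[i].offset + reqs[i].length;
              if (new_end > end)
                  reqs[out].length = new_end - reqs[out].offset;
          } else {
              /* Gap: start a new extent. */
              reqs[++out] = reqs[i];
          }
      }
      return out + 1;
  }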

12
Proposed Caching Techniques
  • In order to improve the performance of small and
    unaligned file accesses, we propose middleware
    designed to enhance parallel file systems with
    the following:
  • Shared, Concurrent Access Caching
  • Progressive Page Granularity Caching
  • MPI File View Caching

13
Shared Caching
  • Single data cache per node
  • Leverages trend toward large numbers of cores
  • Improves contiguity of alternating request
    patterns
  • Concurrent access
  • Single Reader/Writer
  • Page locking system
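A minimal sketch of the single reader/writer, per-page locking described above, using POSIX read-write locks; the direct-mapped table and all names here are hypothetical simplifications:

  #include <pthread.h>
  #include <stdint.h>
  #include <string.h>

  #define CACHE_PAGES 1024   /* hypothetical node-wide page table size */
  #define PAGE_BYTES  4096

  /* One shared cache page, guarded by a read-write lock so that many
   * readers or a single writer may use it at a time. */
  struct cache_page {
      uint64_t         page_id;  /* logical page number within the file */
      int              valid;
      int              dirty;    /* needs to be flushed to the PFS      */
      pthread_rwlock_t lock;     /* initialized at cache startup        */
      char             data[PAGE_BYTES];
  };

  static struct cache_page cache[CACHE_PAGES];

  /* Direct-mapped lookup; a real node-wide cache would place this table
   * in memory shared by all processes on the node. */
  static struct cache_page *lookup(uint64_t page_id)
  {
      return &cache[page_id % CACHE_PAGES];
  }

  /* Copy data into a cached page; the write lock enforces the
   * single-writer rule, while readers take pthread_rwlock_rdlock(). */
  static void cache_write(uint64_t page_id, size_t off,
                          const void *buf, size_t len)
  {
      struct cache_page *p = lookup(page_id);
      pthread_rwlock_wrlock(&p->lock);
      p->page_id = page_id;
      p->valid   = 1;
      p->dirty   = 1;
      memcpy(p->data + off, buf, len);
      pthread_rwlock_unlock(&p->lock);
  }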

14-18
File Write Example
  (Animation: Process 0 I/O Requests and Process 1 I/O Requests are applied
   step by step to the Logical File)
19-23
File Write w/ Cache
  (Animation: the same Process 0 and Process 1 I/O requests are absorbed by
   Cache Pages 0-2 before reaching the Logical File)
24
Progressive Page Caching
  • Benefits of paged caching
  • Efficient for the file system
  • Reduces cache metadata overhead
  • Issues with paged caching
  • Aligned pages may retrieve more data than
    otherwise required
  • Unaligned writes do not cache easily
  • Read the remaining page fragment
  • Do not update cache with small writes
  • Progressive paged caching addresses these issues
    while minimizing performance and metadata
    overhead (see the sketch below)
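As a sketch of the alignment bookkeeping involved, the fragment below splits an unaligned write into the pages it touches plus the uncovered head and tail fragments of the first and last page; a plain paged cache must fill those fragments with a read (or skip caching the write), while a progressive page cache can record them directly as (offset, length) fragments. The page size is hypothetical:

  #include <stdint.h>

  #define PAGE_SIZE 4096u   /* hypothetical cache page size */

  /* Page-alignment summary for one unaligned write (assumes length > 0). */
  struct aligned_span {
      uint64_t first_page, last_page; /* pages touched by the write             */
      uint64_t head_frag;             /* uncovered bytes at start of first page */
      uint64_t tail_frag;             /* uncovered bytes at end of last page    */
  };

  static struct aligned_span align_write(uint64_t offset, uint64_t length)
  {
      struct aligned_span s;
      uint64_t end = offset + length;
      s.first_page = offset / PAGE_SIZE;
      s.last_page  = (end - 1) / PAGE_SIZE;
      s.head_frag  = offset % PAGE_SIZE;
      s.tail_frag  = (end % PAGE_SIZE) ? PAGE_SIZE - (end % PAGE_SIZE) : 0;
      return s;
  }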

25
Unaligned Access Caches
  • Accesses are independent and not on page
    boundaries
  • Requires increased cache overhead
  • How to organize unaligned data (two candidate
    structures, sketched below)
  • List I/O Tree
  • Binary Space Partition Tree
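The two candidate organizations can be pictured with record layouts like the following; every name and field here is a hypothetical sketch, not the proposal's actual design:

  #include <stdint.h>

  /* List I/O tree: cached data kept as (offset, length) extents in a
   * search tree ordered by file offset. */
  struct extent_node {
      uint64_t            offset;        /* file offset of the cached extent */
      uint64_t            length;        /* bytes cached starting at offset  */
      void               *data;          /* the cached bytes                 */
      struct extent_node *left, *right;  /* children ordered by offset       */
  };

  /* Binary space partition tree: each node splits the file's offset space
   * at a pivot, and a lookup descends toward the extent (if any) covering
   * the requested offset. */
  struct bsp_node {
      uint64_t            split;         /* pivot offset dividing the space  */
      struct extent_node *extent;        /* extent stored at this node       */
      struct bsp_node    *below, *above; /* offsets < split / >= split       */
  };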

26
Paged Cache Organization
  (Figure: Logical File regions mapped onto aligned cache pages)
27
BSP Tree Cache Organization
  (Figure: Logical File extents indexed by a binary space partition tree
   with nodes at offsets 0, 1, 2, 4, 5, 8, 11, and 12)
28
List I/O Tree Cache Organization
  (Figure: Logical File extents stored as (offset, length) pairs:
   (0,1), (2,2), (5,3), (10,2))
29
Progressive Page Organization
  (Figure: Logical File pages augmented with (offset, length) fragments
   (0,1), (1,3), (2,2), (2,2))
30
View Cache
  • MPI provides a richer facility for describing
    file I/O
  • Collective I/O
  • MPI provides file views for describing file
    subregions
  • Use file views as a mechanism for coalescing
    reads and writes during collective I/O (see the
    example below)
  • Open question: how to take the union of multiple
    views
  • Use a heuristic approach to detect structured I/O
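As a concrete illustration of file views and collective I/O, here is a minimal MPI-IO sketch in which each process sets a strided view and issues one collective read; the block sizes and file name are hypothetical:

  #include <mpi.h>
  #include <stdlib.h>

  #define BLOCK  64    /* ints per block owned by one process (hypothetical) */
  #define BLOCKS 128   /* blocks read by each process (hypothetical)         */

  int main(int argc, char **argv)
  {
      MPI_File     fh;
      MPI_Datatype filetype;
      MPI_Status   status;
      int          rank, nprocs;
      int         *buf;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
      buf = malloc(BLOCK * BLOCKS * sizeof(int));

      /* Strided file view: this process sees every nprocs-th block. */
      MPI_Type_vector(BLOCKS, BLOCK, BLOCK * nprocs, MPI_INT, &filetype);
      MPI_Type_commit(&filetype);

      MPI_File_open(MPI_COMM_WORLD, "data.out", MPI_MODE_RDONLY,
                    MPI_INFO_NULL, &fh);
      MPI_File_set_view(fh, (MPI_Offset)rank * BLOCK * sizeof(int),
                        MPI_INT, filetype, "native", MPI_INFO_NULL);

      /* Collective read: the MPI-IO layer (or a view cache) can coalesce
       * the processes' strided views into large contiguous requests. */
      MPI_File_read_all(fh, buf, BLOCK * BLOCKS, MPI_INT, &status);

      MPI_File_close(&fh);
      MPI_Type_free(&filetype);
      free(buf);
      MPI_Finalize();
      return 0;
  }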

31-33
Collective Read Example
  (Animation: Process 0 I/O Requests and Process 1 I/O Requests read from
   the Logical File without caching)
34-38
Collective Read w/ Cache
  (Animation: the same collective read is serviced through Cache Blocks 0-2)
39-43
Collective Read w/ ViewCache
  (Animation: the collective read is coalesced using the processes' file
   views and serviced through Cache Blocks 0-2)
44
Study Methodology
  • Simulation-based study
  • HECIOS
  • Closely modelled on PVFS2 and Linux
  • Approximately 40,000 source lines of code
  • Leverages the OMNeT++ and INET frameworks
  • Cache Organizations
  • Core sharing
  • Aligned page access
  • Unaligned page access

45
HECIOS Overview
  • HECIOS System Architecture

46
HECIOS Overview (cont.)
  • HECIOS Main Window

47
HECIOS Overview (cont.)
  • HECIOS Simulation Top View

48
HECIOS Overview (cont.)
  • HECIOS Simulation Detailed View

49
Contributions
  1. HECIOS, the High End Computing I/O Simulator,
    developed and made available under an open
    source license
  2. Flash I/O and BT-IO traced at large scale, with
    the traces now publicly available
  3. Rigorous study of caching factors in parallel
    file systems
  4. Novel cache designs for unaligned file access and
    MPI view coalescing

50
The End
  • Thank You For Your Time!
  • Questions?
  • Brad Settlemyer
  • bradles@clemson.edu

51
Dissertation Schedule
  • August: Complete trace parser enhancements.
    Shared cache implementation. Complete trace
    collection.
  • September: Aligned cache sharing study.
  • October: Unaligned cache sharing study.
  • November: SIGMETRICS deadline. View coalescing
    cache.
  • December: Finalize experiments. Finish writing
    thesis. Defend thesis.

52
PVFS Scalability
  • Read and Write Bandwidth Curves for PVFS

53
Shared Caching (cont.)
  (Figure: Process 0 and Process 1 I/O Requests staged through Cache Pages
   0-2 before reaching the Logical File)
54
Bandwidth Effects
Write Bandwidth on Adenine (MB/sec)
Num Clients   PVFS w/ 8 I/O Nodes   PVFS w/ Replication, 16 I/O Nodes   Percent Performance
 1                 10.3                        9.8                            95.1
 4                 28.2                       28.7                           101.8
 8                 43.4                       39.8                            91.5
16                 43.4                       40.3                            92.9
32                 50.1                       38.2                            76.2
55
Experimental Data Distribution
  (Figure: Logical File Data divided into Strips A-F and placed across PFS
   Servers 0-3, with each strip stored on two servers)
56
Discussion (cont.)
  (Figure: Logical File Data divided into Strips A-F and placed across PFS
   Servers 0-3, with a second copy of each strip held on a different server)
57
  (Figure: Processes 0-3 on CPU nodes connected through a switched network
   to I/O nodes running PFS Servers 0-3)