Finding Common Ground: The Story of a DB and an OS


1
Finding Common Ground: The Story of a DB and an OS
  • Remzi H. Arpaci-Dusseau
  • www.cs.wisc.edu/remzi

2
Databases and OSs
  • Databases and OSs: Long-time enemies
  • Why? [Stonebraker81]
  • Isn't it time to heal the wounds?
  • DeWitt: No
  • Naughton: Probably not
  • Well, at least we both care about storage!
  • So what's happening in the world of storage?
  • Networks!

3
Network-Attached Storage
  • Why bother?
  • Scalable Bandwidth
  • Highly Available
  • Simple/Reliable
  • Expandable
  • Specializable
  • What form will it take?
  • Disk vendors: Disks add CPU, network
  • Machine vendors: Specialized PCs
  • [Figure labels: CPU, Network]

4
New World, New Problems
  • Goal: Build network storage system that...
  • Easy to manage
  • Easy to scale
  • Performs well
  • Implications
  • Plug-and-play: Add new disk, utilize to full
    capacity, with no human intervention
  • BUT, new disk might be different than old ones
    (classic RAID algorithms don't like that)

5
Disks: Complexity reigns
  • Additional problem: Complex disk drives
  • Multiple zones: Outer tracks > inner
  • Failure masking: Stop using bad blocks
  • Not fail-fast: Sputter, then stop
  • Worse yet: Add complex networking!
  • Conclusion: In large collections of disks, can no
    longer expect predictable behavior
  • What's a network storage system to do?
  • Be Adaptive!

6
Solution? WiND
  • Wisconsin Network Disks
  • Distributed systems technology meets storage
  • with Professor A. Arpaci-Dusseau, B. Forney, S.
    Muthukrishnan, F. Popovici, and J. Bent
  • Key Software Components
  • SToRM: On-line adaptive layout
  • Clouds: Cost-aware caching
  • GALE: Off-line selective replication
  • Core technology: Information architecture
  • Enables scalable adaptation across all layers

7
Outline
  • Motivation
  • WiND Overview
  • Adaptive layout
  • Caching
  • Long-term reorganization
  • Information architecture
  • Conclusions

8
Adaptive Layout
  • The Problem: Classic RAIDs don't work!
  • Time(StripeWrite) = Max(T(D0), T(D1))
  • Parallel performance dependency
  • Always runs at rate of slow disk

[Figure: RAID-0 (Striping)]
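
The stripe-write formula on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the disk rates are made up for the example, and the model assumes each disk receives an equal share of the stripe and that nothing but raw disk rate matters.

```python
def stripe_write_time(stripe_bytes, disk_rates):
    """Time to write one stripe split evenly across disks (classic RAID-0).

    disk_rates holds a bytes-per-second figure for each disk (hypothetical
    numbers). The stripe is done only when the slowest disk finishes, so
    Time(StripeWrite) = max over disks of T(Di).
    """
    share = stripe_bytes / len(disk_rates)
    return max(share / rate for rate in disk_rates)


def effective_bandwidth(stripe_bytes, disk_rates):
    """Bytes per second actually delivered by the whole array."""
    return stripe_bytes / stripe_write_time(stripe_bytes, disk_rates)
```

With one disk at 10 units/s and another at 5 units/s, the array delivers 10 units/s, which is twice the slow disk's rate rather than the 15 units/s the two disks could supply together.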
9
SToRM: Adaptive Layout
  • Example: Adaptive RAID-0
  • Approach: Adjust layout per disk according to
    perceived rate of operation
  • Key: How to obtain information about performance
    of remote disks?
  • SToRM
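
One way to picture "adjust layout per disk according to perceived rate" is to hand each disk a number of blocks proportional to its observed rate. The sketch below is an assumption about what such a policy could look like, not SToRM's actual algorithm; `adaptive_shares` and `write_time` are hypothetical names, and the rates are taken as given rather than measured.

```python
def adaptive_shares(num_blocks, observed_rates):
    """Split num_blocks across disks in proportion to their observed rates."""
    total = sum(observed_rates)
    shares = [int(num_blocks * r / total) for r in observed_rates]
    # Blocks lost to rounding down are handed out one each, fastest disks first.
    leftover = num_blocks - sum(shares)
    by_speed = sorted(range(len(observed_rates)),
                      key=lambda i: -observed_rates[i])
    for i in by_speed[:leftover]:
        shares[i] += 1
    return shares


def write_time(shares, rates):
    """Completion time when each disk writes its share in parallel."""
    return max(s / r for s, r in zip(shares, rates))
```

For 30 blocks on disks with rates 10 and 5, the proportional split is 20 and 10 blocks and both disks finish together, whereas an even 15/15 split is held back by the slow disk.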

10
SToRM Status
  • Basic prototype infrastructure in place
  • Linux kernel module on client
  • Simple storage server (replacement NeST)
  • Runs on PC cluster with Gigabit Ethernet
  • Easy stuff works
  • Classic RAID-0 is in place
  • Coming soon: Adaptive RAID-0 and more
  • Challenges: Meta-data minimization, proper
    server-side interface, support all RAID levels

11
Adaptive Caching w/ Clouds
  • Problem: Classic caching algorithms assume
    uniform replacement cost
  • Related work in theory, web, databases
  • How to apply to storage system?
  • Key: How to get cost information?
  • Which block to replace?

12
Solution: Clouds
  • Clouds: flexible caching infrastructure
  • Client-side
  • Server-side
  • Cooperative
  • Two lines of investigation
  • Not just LRU caches anymore
  • Take cost into account for replacement
  • Streaming I/O support
  • Can caches mask the performance of a slow disk?
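
To make "take cost into account for replacement" concrete, here is a minimal sketch of GreedyDual, a well-known cost-aware replacement policy from the caching literature. It is used here only as a stand-in: the slides do not say which algorithm Clouds uses, and the class name, the per-access `cost` argument, and the linear scan for the victim are all simplifications chosen for the example.

```python
class GreedyDualCache:
    """Cost-aware cache sketch using the GreedyDual policy (capacity >= 1)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inflation = 0.0   # rises to the priority of each evicted block
        self.priority = {}     # block id -> current priority

    def access(self, block, cost):
        """Touch a block; return True on a hit, False on a miss.

        cost is the (assumed known) price of refetching the block,
        for example the latency of the disk it lives on.
        """
        hit = block in self.priority
        if not hit and len(self.priority) >= self.capacity:
            # Evict the block with the lowest priority and remember that value.
            victim = min(self.priority, key=self.priority.get)
            self.inflation = self.priority.pop(victim)
        # New and re-referenced blocks get the current inflation plus their cost.
        self.priority[block] = self.inflation + cost
        return hit
```

In a two-entry cache, touching an expensive block from a slow disk and then two cheap blocks evicts a cheap block, whereas plain LRU would evict the expensive one.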

13
Clouds Status
  • Approach to problem: Simulation
  • Infrastructure up and running
  • Simple models used to confirm correctness
  • What's next?
  • Compare basic caching algorithms with cost-aware
    counterparts
  • Add support for streaming workloads
  • Implement in cluster prototype

14
Off-line Adaptation
  • Problem: Adaptation can be short-sighted
  • Example
  • SToRM lays data out according to current rate
  • System characteristics (climate) change
  • Read performance suffers
  • Solution: Re-arrange data in background
  • Move data into new layout to match climate
  • Replicate data to add flexibility and reliability

15
Solution: GALE
  • Long-term optimization engine
  • Use simple rule-based system to enact off-line
    optimizations
  • Example
  • If layout(X) does not match climate(Disks), and
    access(X) is frequent, and load(Disks) is low,
    replicate(X)
  • Key: Gathering climate and load information
  • Status: What you see on this slide...
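
The example rule on this slide could be written down roughly as follows. Since the slide says GALE existed only as a design at this point, everything here is hypothetical: the `DataStats` record, the thresholds for "frequent" and "low", and the idea of returning a list of names to replicate are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataStats:
    name: str
    layout_matches_climate: bool   # does layout(X) match climate(Disks)?
    accesses_per_hour: float       # how often X is accessed


def plan_replications(items, disk_load, frequent=100.0, low_load=0.2):
    """Apply the rule: mismatched layout, frequent access, low load -> replicate.

    disk_load is a 0..1 utilization figure; both thresholds are made up.
    Returns the names of the items the rule would replicate.
    """
    if disk_load >= low_load:
        return []   # disks are busy, so defer off-line work
    return [x.name for x in items
            if not x.layout_matches_climate
            and x.accesses_per_hour >= frequent]
```

Checking the load first keeps the background work from competing with foreground requests, which is the point of doing the optimization off-line.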

16
Information Architecture
  • Problem: Key to all of WiND is efficient access
    to remote state information
  • Examples
  • How fast is that disk?
  • How much will it cost to get a remote block?
  • What's needed?
  • Infrastructure for efficient collection of remote
    state information

17
Information Architecture
  • Taxonomy
  • Null: Don't use information (e.g., RAID-0)
  • Parasitic: Sneak info into existing messages
  • Explicit: Add queries for remote state
  • Implicit: Infer via observation
  • Key: Hide details of information-gathering
    techniques from algorithms
  • IPIs (Information Programming Interfaces)
  • Choose dynamically among best options
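
The idea of hiding the information-gathering technique behind an interface might look like the sketch below. It is not the WiND IPI design: the `DiskRateSource` interface and the class names are invented for the example, and the parasitic variant is left out for brevity (it would update the same kind of table from data piggybacked on replies).

```python
from abc import ABC, abstractmethod


class DiskRateSource(ABC):
    """What an algorithm sees: a rate per disk, however it was obtained."""

    @abstractmethod
    def rate(self, disk):
        ...


class NullSource(DiskRateSource):
    """Null: use no information and treat every disk as equal."""

    def rate(self, disk):
        return 1.0


class ExplicitSource(DiskRateSource):
    """Explicit: ask for remote state through a caller-supplied query function."""

    def __init__(self, query_fn):
        self.query_fn = query_fn

    def rate(self, disk):
        return self.query_fn(disk)


class ImplicitSource(DiskRateSource):
    """Implicit: infer the rate from the timings of past requests."""

    def __init__(self):
        self.samples = {}

    def observe(self, disk, nbytes, seconds):
        self.samples.setdefault(disk, []).append(nbytes / seconds)

    def rate(self, disk):
        seen = self.samples.get(disk)
        return sum(seen) / len(seen) if seen else 1.0


def fastest_disk(disks, source):
    """An algorithm written against the interface, not the technique."""
    return max(disks, key=source.rate)
```

Because `fastest_disk` depends only on `rate`, the source can be swapped at run time, which is one reading of "choose dynamically among best options".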

18
Future Directions
  • Specialized File Systems
  • SA-NFS (Striped, Adaptive NFS)
  • RiverFS (Parallel FS for database query
    processing primitives)
  • Applications
  • Database query-processing primitives
  • Web cache and proxies
  • Standard NFS workloads

19
Conclusions
  • Storage systems -> Distributed systems
  • Need to treat them differently
  • Unpredictability will be the norm
  • Complex drives and networks -> Complex behaviors
  • Solution: WiND
  • Adaptation via information
  • http://www.cs.wisc.edu/wind