Disconnected Operation in the Coda File System

1
Disconnected Operation in the Coda File System
  • J. Kistler and M. Satyanarayanan
  • Presented by Yun Mao (maoy_at_cis.upenn.edu)

2
Background
  • We are back to 1990s.
  • Network is slow and not stable
  • Terminal → powerful client
  • 33MHz CPU, 16MB RAM, 100MB hard drive
  • Mobile Users appeared
  • 1st IBM Thinkpad in 1992
  • We can do something at the client without the network

3
Motivation
4
Disconnected Operation
  • Continue critical work when the shared data repository is
    inaccessible.
  • Key idea: caching data.
  • Performance
  • Availability
  • Server Replication

5
An Example
6
An Example
7
An Example
8
An Example
9
An Example
10
An Example
11
Design Rationale
  • Scalability
  • Callback cache coherence (inherited from AFS)
  • Whole file caching
  • Fat clients. (security, integrity)
  • Avoid system-wide rapid change
  • Portable workstations
  • User assistance in cache management

12
Design Rationale - Replication
  • Server replication (why?)
  • + Persistent, physically secure
  • - Expensive
  • Client replication
  • - Relatively low quality
  • + Cheap

13
Design Rationale - Replica Control
  • Pessimistic
  • Disable all partitioned writes
  • - Requires a client to acquire control of a cached
    object prior to disconnection
  • Optimistic
  • Assume no one else is touching the file
  • Sophisticated conflict detection
  • Fact: low write-sharing in Unix
  • High availability: access anything in range

14
Implementation - architecture
15
Venus - states
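
The three Venus states covered on the following slides (hoarding, emulation, reintegration) can be sketched as a small state machine. This is only an illustration: the event labels are informal names chosen here, not identifiers from the Coda source.

```python
from enum import Enum


class VenusState(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"


# Transition table. Event names are illustrative labels (assumptions).
TRANSITIONS = {
    (VenusState.HOARDING, "disconnection"): VenusState.EMULATION,
    (VenusState.EMULATION, "physical_reconnection"): VenusState.REINTEGRATION,
    (VenusState.REINTEGRATION, "logical_reconnection"): VenusState.HOARDING,
}


def next_state(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = VenusState.HOARDING
for event in ("disconnection", "physical_reconnection", "logical_reconnection"):
    state = next_state(state, event)
```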
16
Hoarding
  • Hoard useful data for disconnection
  • Balance the needs of connected and disconnected
    operation.
  • Cache size is restricted
  • Unpredictable disconnections
  • Prioritized algorithm: cache management
  • Hoard walking: reevaluate objects

17
Prioritized algorithm
  • User-defined hoard priority p: how interesting is
    it?
  • Recent usage q
  • Object priority: f(p, q)
  • Kick out the one with the lowest priority
  • + Fully tunable
  • Everything can be customized
  • - Not tunable (?)
  • No idea how to customize
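
A minimal sketch of the idea on this slide, assuming f(p, q) is a weighted sum. The slide does not give the actual form of f, so the weight `ALPHA`, the field names, and the "higher q means more recently used" convention are all assumptions made for illustration.

```python
from dataclasses import dataclass

ALPHA = 0.75  # assumed weight given to the user-defined hoard priority


@dataclass
class CachedObject:
    name: str
    hoard_priority: float  # p: set by the user
    recent_usage: float    # q: assumed scale, higher = used more recently


def priority(obj, alpha=ALPHA):
    """Assumed form of f(p, q): a weighted sum of p and q."""
    return alpha * obj.hoard_priority + (1 - alpha) * obj.recent_usage


def evict_lowest(cache):
    """Remove and return the object with the lowest current priority."""
    victim = min(cache, key=priority)
    cache.remove(victim)
    return victim


cache = [
    CachedObject("thesis.tex", hoard_priority=100, recent_usage=10),
    CachedObject("scratch.txt", hoard_priority=0, recent_usage=90),
    CachedObject("old-notes.txt", hoard_priority=10, recent_usage=20),
]
victim = evict_lowest(cache)
```

Changing `ALPHA` shifts the balance between what the user asked to hoard and what was used recently, which is the tuning knob the slide is skeptical about.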

18
Hoard Walking
  • Equilibrium: priority of uncached obj < priority of cached obj
  • Why might it be broken? Cache size is limited.
  • Walking: restore equilibrium
  • Reloading HDB (changed by others)
  • Reevaluate priorities in HDB and cache
  • Enhanced callback
  • Increase scalability, and availability
  • Decrease consistency
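
The equilibrium condition can be illustrated with a short sketch. It assumes priorities have already been reevaluated and that every cached object has an entry in `priorities`; it also measures capacity as an object count, whereas a real cache is limited by bytes. The function and parameter names are hypothetical.

```python
def hoard_walk(priorities, cached, capacity):
    """Return (to_fetch, to_evict) so that no uncached object outranks a cached one.

    priorities: dict mapping object name -> current priority
    cached:     set of names currently in the cache
    capacity:   maximum number of cached objects (simplifying assumption)
    """
    # Rank every known object, highest priority first (name breaks ties).
    ranked = sorted(priorities, key=lambda name: (-priorities[name], name))
    target = set(ranked[:capacity])
    return target - cached, cached - target


priorities = {"a": 90, "b": 10, "c": 50, "d": 70}
cached = {"a", "b"}
to_fetch, to_evict = hoard_walk(priorities, cached, capacity=2)
cached = (cached - to_evict) | to_fetch
```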

19
Emulation
  • Act like a server
  • Record modified objects
  • Preparation for replaying update activity
  • Log-based, per volume
  • Persistence
  • Meta-data → RVM
  • Exhaustion
  • Compress?
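
A rough sketch of a per-volume replay log with one possible compression rule: a later store of a file supersedes an earlier logged store of the same file. The class, the record layout, and the rule as coded are assumptions for illustration, and persistence (RVM) is not modeled.

```python
from collections import defaultdict


class ReplayLog:
    """In-memory stand-in for a per-volume replay log (not persistent)."""

    def __init__(self):
        self._by_volume = defaultdict(list)

    def append(self, volume, op, path, payload=None):
        records = self._by_volume[volume]
        if op == "store":
            # Assumed compression rule: a newer store of a file makes any
            # earlier logged store of the same file redundant.
            records[:] = [r for r in records
                          if not (r[0] == "store" and r[1] == path)]
        records.append((op, path, payload))

    def records(self, volume):
        return list(self._by_volume.get(volume, []))


log = ReplayLog()
log.append("vol1", "store", "paper.tex", b"draft 1")
log.append("vol1", "mkdir", "figs")
log.append("vol1", "store", "paper.tex", b"draft 2")
```

Without the compression step the log grows with every save, which is how the exhaustion problem on this slide arises.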

20
Reintegration
  • Replay algorithm
  • Executed in parallel at all AVSG members
  • Transaction-based
  • Succeed?
  • Yes: free logs, reset priority
  • No: save logs to a tar file, ask for help
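
The all-or-nothing behavior can be sketched as follows. This is a single-server simplification (no AVSG parallelism), and the record layout, the version check, and all names are hypothetical, chosen only to show the transactional shape of the replay.

```python
class ReplayError(Exception):
    """Raised when a logged update cannot be applied at the server."""


def apply_store(files, record):
    # Hypothetical record layout: (path, version the client cached, new data).
    path, cached_version, data = record
    server_version = files.get(path, (0, b""))[0]
    if server_version != cached_version:
        raise ReplayError(path)
    files[path] = (server_version + 1, data)


def reintegrate(server_files, log):
    """Replay `log` atomically. Returns (succeeded, records_to_save)."""
    staged = dict(server_files)  # scratch copy: a failure leaves the server untouched
    try:
        for record in log:
            apply_store(staged, record)
    except ReplayError:
        return False, list(log)  # caller would save these for manual repair
    server_files.clear()
    server_files.update(staged)
    return True, []  # the log can be freed


server = {"a": (1, b"old")}
ok, saved = reintegrate(server, [("a", 1, b"new"), ("b", 0, b"x")])
```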

21
Conflict Handling
  • Only care about write/write conflicts
  • File vs. directory
  • File: halt entire reintegration process
  • Directory: investigate further
  • Manual repair
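
To make the file/directory distinction concrete, here is a simplified sketch. A file counts as conflicting whenever the server version differs from the one the client cached, while a directory is merged entry by entry and only a name created on both sides is flagged. Real directory handling considers more cases (for example deletions against updates); the functions and their arguments are assumptions.

```python
def file_conflict(cached_version, server_version):
    """Write/write conflict on a file: someone else updated it in the meantime."""
    return cached_version != server_version


def merge_directory(base, client, server):
    """Merge directory entry names; return (merged, conflicts).

    base:   names present when the client disconnected
    client: names at the client after disconnected updates
    server: names at the server at reintegration time
    """
    client_added, server_added = client - base, server - base
    removed = (base - client) | (base - server)
    conflicts = client_added & server_added  # same new name created on both sides
    merged = (base - removed) | client_added | server_added
    return merged, conflicts


merged, conflicts = merge_directory(
    base={"a", "b"}, client={"a", "b", "c"}, server={"a", "d"}
)
```

Independent additions and removals merge cleanly here, so only the flagged names would need manual repair.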

22
Coda Evaluation
  • Hardware
  • 386 laptop, DECstation 3100s
  • 350MB disk
  • How ?
  • How long does reintegration take?
  • How large a local disk does one need?
  • How likely are conflicts?

23
Answers
  • Duration of Reintegration
  • A few hours of disconnection -> 1 min
  • Cache size
  • 100MB at client is enough for a typical workday
  • Conflicts
  • No conflicts at all! Why?
  • Over 99% of modifications are by the same person
  • Two users modify the same obj within a day:
    < 0.75%

24
Conclusion
  • Disconnected operation is a simple idea
  • Hard to implement in each stage
  • Why?
  • An extended version of write-back cache?
  • A write-back cache with critical data prefetched
  • Feasible, efficient and usable.

25
Remember this slide?
  • We are back to 1990s.
  • Network is slow and not stable
  • Terminal → powerful client
  • 33MHz CPU, 16MB RAM, 100MB hard drive
  • Mobile Users appear
  • 1st IBM Thinkpad in 1992

26
What's now?
  • We are in 2000s now.
  • Network is fast and reliable in LAN
  • Powerful client → very powerful client
  • 2.4GHz CPU, 1GB RAM, 120GB hard drive
  • Mobile Users everywhere
  • IBM Thinkpad 10 yrs anniversary
  • Do we still need disconnection?
  • How many people are using coda?
  • a script of rsync, Unison (Penn)

27
Do we still need disconnection?
  • WAN and wireless are not very reliable, and are
    slow
  • PDA is not very powerful
  • 200MHz StrongARM, 128MB CF card
  • Electric power constrained
  • LBFS (MIT) on WAN, Coda and Odyssey (CMU) for
    mobile users
  • Adaptability also matters.

28
What is the future?
  • We are in 2010s now
  • High bandwidth, reliable wireless everywhere
  • Even PDA is powerful
  • 2GHz, 1GB RAM/Flash
  • Unlimited kinetic or solar energy (?)
  • What will be the research topic in FS?
  • P2P?