ECE 6160: Advanced Computer Networks SAN (PowerPoint transcript, 36 slides)
1
ECE 6160 Advanced Computer Networks: SAN
  • Instructor: Dr. Xubin (Ben) He
  • Email: hexb@tntech.edu
  • Tel: 931-372-3462
  • Course web: http://www.ece.tntech.edu/hexb/616f05

2
Previously
  • Networked storage
  • NAS

3
Storage Architectures
4
Storage Area Networks
5
SAN connections
  • FC: FC-SAN
  • LAN (Ethernet): IP-SAN, iSCSI
  • Other networks: Petal (ATM)

6
Typical SAN
  • Backup solutions (tape sharing)
  • Disaster tolerance solutions (distance to remote
    location)
  • Reliable, maintainable, scalable infrastructure

7
A real SAN.
8
NAS and SAN shortcomings
  • SAN shortcomings: data to desktop; sharing
    between NT and UNIX; lack of standards for file
    access and locking
  • NAS shortcomings: shared tape resources; number
    of drives; distance to tapes/disks
  • NAS focuses on applications, users, and the
    files and data that they share
  • SAN focuses on disks, tapes, and a scalable,
    reliable infrastructure to connect them
  • NAS plus SAN: the complete solution, from
    desktop to data center to storage device

9
NAS plus SAN.
  • NAS Plus SAN--The complete solution, from
    desktop to data center to storage device

10
Petal/Frangipani
[Diagram: NFS over NAS vs. Frangipani over SAN, with Petal providing the SAN storage layer]
11
Petal/Frangipani
  • Frangipani: FS semantics; sharing/coordination
  • Petal: untrusted, OS-agnostic disk aggregation
    ("bricks"); filesystem-agnostic; recovery and
    reconfiguration; load balancing; chained
    declustering; snapshots; does not control sharing
  • Each "cloud" may resize or reconfigure
    independently. What indirection is required to
    make this happen, and where is it?
12
Remaining Slides
  • The following slides have been borrowed from the
    Petal and Frangipani presentations, which were
    available on the Web until Compaq SRC dissolved.
    This material is owned by Ed Lee, Chandu
    Thekkath, and the other authors of the work. The
    Frangipani material is still available through
    Chandu Thekkath's site at www.thekkath.org.
  • For ECE6160, several issues are important:
  • Understand the role of each layer in the
    previous slides, and the strengths and
    limitations of each layer as a basis for
    innovating behind its interface (NAS/SAN).
  • Understand the concepts of virtual disks and a
    cluster file system embodied in Petal and
    Frangipani.
  • Understand how the features of Petal simplify the
    design of a scalable cluster file system
    (Frangipani) above it.

13
Petal Distributed Virtual Disks
  • Systems Research Center
  • Digital Equipment Corporation
  • Edward K. Lee
  • Chandramohan A. Thekkath

14
Logical System View
[Diagram: client file systems (AdvFS, NT FS, PC FS, UFS) accessing Petal over a scalable network]
15
Physical System View
[Diagram: parallel database or cluster file system nodes sharing /dev/shared1 over a scalable network]
16
Virtual Disks
  • Each disk provides a 2^64-byte address space.
  • Created and destroyed on demand.
  • Allocates disk storage on demand.
  • Snapshots via copy-on-write.
  • Online incremental reconfiguration.
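The sparse, copy-on-write behavior in the bullets above can be sketched in a few lines. This is a hypothetical illustration, not Petal's actual interface: the class, method names, and block size are assumptions, and real Petal tracks blocks per-server with epoch numbers.

```python
# Illustrative sketch of a sparse virtual disk with copy-on-write
# snapshots, in the spirit of (but not identical to) Petal's design.

BLOCK_SIZE = 64 * 1024  # illustrative block size

class VirtualDisk:
    def __init__(self):
        # Sparse map: physical storage is allocated only on first write.
        self.blocks = {}          # block number -> bytes
        self.snapshots = []       # list of frozen block maps

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def read(self, block_no):
        # Fall back to the most recent snapshot holding the block.
        if block_no in self.blocks:
            return self.blocks[block_no]
        for snap in reversed(self.snapshots):
            if block_no in snap:
                return snap[block_no]
        return b"\x00" * BLOCK_SIZE   # never-written block reads as zeros

    def snapshot(self):
        # Freeze the current map; later writes go to a fresh map, so the
        # snapshot shares (rather than copies) unmodified blocks.
        self.snapshots.append(self.blocks)
        self.blocks = {}

vd = VirtualDisk()
vd.write(0, b"v1")
vd.snapshot()
vd.write(0, b"v2")
assert vd.read(0) == b"v2"            # current view sees the new write
assert vd.snapshots[-1][0] == b"v1"   # snapshot still sees old data
```

The point of the sketch is that creating a snapshot costs O(1): no data is copied until a block is rewritten.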

17
Virtual to Physical Translation
  • Translation path: (vdiskID, offset) → Virtual
    Disk Directory → GMap → per-server PMap0..PMap3
    → (disk, diskOffset)
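The translation path on this slide can be traced with toy data structures. The names (vdir, GMap, PMap) follow the slide, but the hashing, layout, and block size here are simplified assumptions, not Petal's actual algorithm.

```python
# Toy walk-through of Petal's virtual-to-physical translation:
# (vdiskID, offset) -> directory -> GMap -> PMap -> (disk, diskOffset).

vdir = {7: "gmap0"}                       # vdiskID -> global map id
gmaps = {"gmap0": ["server0", "server1", "server2", "server3"]}
pmaps = {                                 # per-server physical maps
    "server3": {("gmap0", 3): ("disk1", 4096)},
}

def translate(vdisk_id, offset, block=64 * 1024):
    gmap_id = vdir[vdisk_id]              # step 1: virtual disk directory
    servers = gmaps[gmap_id]
    block_no = offset // block
    server = servers[block_no % len(servers)]  # step 2: GMap picks a server
    return server, pmaps[server][(gmap_id, block_no)]  # step 3: that server's PMap

server, (disk, disk_off) = translate(7, 3 * 64 * 1024)
assert (server, disk, disk_off) == ("server3", "disk1", 4096)
```

Only the directory and GMap are global; each PMap is private to one server, which is what lets servers relocate their own blocks without global coordination.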
18
Global State Management
  • Based on Leslie Lamport's Paxos algorithm.
  • Global state is replicated across all servers.
  • Consistent in the face of server network
    failures.
  • A majority is needed to update global state.
  • Any server can be added/removed in the presence
    of failed servers.
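The majority rule in the bullets above can be shown with a toy quorum check. This is a deliberately minimal sketch: real Paxos also needs proposal numbering, two phases, and recovery, all omitted here.

```python
# Minimal sketch of the majority rule Petal's global state relies on:
# an update commits only if a strict majority of servers accept it.

class Server:
    def __init__(self, up=True):
        self.up = up
        self.state = None

    def accept(self, update):
        if self.up:
            self.state = update
            return True
        return False

def majority_commit(servers, update):
    acks = sum(1 for s in servers if s.accept(update))
    return acks > len(servers) // 2   # strict majority of ALL servers

cluster = [Server(), Server(), Server(up=False), Server(), Server(up=False)]
assert majority_commit(cluster, "add-server")       # 3 of 5 accept
cluster[3].up = False
assert not majority_commit(cluster, "remove-server")  # only 2 of 5 accept
```

Counting against the full membership (not just live servers) is what makes two partitioned minorities unable to both commit conflicting updates.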

19
Fault-Tolerant Global Operations
  • Create/Delete virtual disks.
  • Snapshot virtual disks.
  • Add/Remove servers.
  • Reconfigure virtual disks.

20
Data Placement Redundancy
  • Supports non-redundant and chained-declustered
    virtual disks.
  • Parity can be supported if desired.
  • Chained-declustering tolerates any single
    component failure.
  • Tolerates many common multiple failures.
  • Throughput scales linearly with additional
    servers.
  • Throughput degrades gracefully with failures.

21
Chained Declustering
Server0   Server1   Server2   Server3
D0        D1        D2        D3
D3        D0        D1        D2
D4        D5        D6        D7
D7        D4        D5        D6
(each block's second copy lives on the next server in the chain)
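The placement pattern above follows a simple rule: block i's primary copy lives on server i mod n and its secondary on server (i + 1) mod n. A sketch of that rule (the function name is illustrative):

```python
# Chained-declustering placement: each block's two copies sit on
# neighboring servers, so a failed server's load is split between
# its two neighbors rather than dumped on a single mirror.

def placement(block, n_servers):
    primary = block % n_servers
    secondary = (primary + 1) % n_servers
    return primary, secondary

# Reproduce the 4-server layout on the slide:
for b in range(8):
    p, s = placement(b, 4)
    print(f"D{b}: primary server{p}, secondary server{s}")

assert placement(0, 4) == (0, 1)   # D0: copy on server1
assert placement(3, 4) == (3, 0)   # D3 wraps around to server0
```

Contrast this with plain mirroring, where server k's full load falls on its single mirror after a failure; chaining is why throughput degrades gracefully.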
22
Chained Declustering
Server0   Server1   Server2   Server3
D0        D2        D3        D1
D3        D1        D2        D0
D4        D6        D7        D5
D7        D5        D6        D4
23
The Prototype
  • Digital ATM network.
  • 155 Mbit/s per link.
  • 8 AlphaStation Model 600.
  • 333 MHz Alpha running Digital Unix.
  • 72 RZ29 disks.
  • 4.3 GB, 3.5 inch, fast SCSI (10MB/s).
  • 9 ms avg. seek, 6 MB/s sustained transfer rate.
  • Unix kernel device driver.
  • User-level Petal servers.

24
The Prototype
[Diagram: clients src-ss1, src-ss2, …, src-ss8, each mounting the same /dev/vdisk1, connected to Petal servers petal1, petal2, …, petal8]
25
Throughput Scaling
26
Virtual Disk Reconfiguration
[Figure: throughput before and after reconfiguring a virtual disk with 1 GB of allocated storage between 8 and 6 servers; 8 KB reads and writes]
27
Frangipani A Scalable Distributed File System
  • C. A. Thekkath, T. Mann, and E. K. Lee
  • Systems Research Center
  • Digital Equipment Corporation

28
Why Not An Old File System on Petal?
  • Traditional file systems (e.g., UFS, AdvFS)
    cannot share a block device
  • The machine that runs the file system can become
    a bottleneck

29
Frangipani
  • Behaves like a local file system
  • multiple machines cooperatively manage a Petal
    disk
  • users on any machine see a consistent view of
    data
  • Exhibits good performance, scaling, and load
    balancing
  • Easy to administer

30
Ease of Administration
  • Frangipani machines are modular
  • can be added and deleted transparently
  • Common free space pool
  • users don't have to be moved
  • Automatically recovers from crashes
  • Consistent backup without halting the system

31
Components of Frangipani
  • File system core
  • implements the Digital Unix vnode interface
  • uses the Digital Unix Unified Buffer Cache
  • exploits Petal's large virtual space
  • Locks with leases
  • Write-ahead redo log

32
Locks
  • Multiple reader/single writer
  • Locks are moderately coarse-grained
  • protects entire file or directory
  • Dirty data is written to disk before lock is
    given to another machine
  • Each machine aggressively caches locks
  • uses lease timeouts for lock recovery
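The lease idea in the last bullet can be sketched as follows. This is a hypothetical single-node illustration, not Frangipani's distributed lock service: the class name, 30-second lease length, and API are assumptions, and real leases must also fence off a holder whose lease expired while it still thinks it owns the lock.

```python
# Sketch of lease-based lock recovery: a lock grant is only valid while
# the holder's lease is fresh, so a crashed holder's locks can be
# reclaimed after the timeout instead of blocking forever.

import time

LEASE_SECONDS = 30.0   # illustrative; the slide gives no value

class LeaseLock:
    def __init__(self):
        self.holder = None
        self.expires = 0.0

    def acquire(self, machine, now=None):
        now = time.monotonic() if now is None else now
        if self.holder is None or now >= self.expires:
            self.holder, self.expires = machine, now + LEASE_SECONDS
            return True
        return False            # held under a live lease

    def renew(self, machine, now=None):
        # A live holder periodically renews; a crashed one stops renewing.
        now = time.monotonic() if now is None else now
        if self.holder == machine and now < self.expires:
            self.expires = now + LEASE_SECONDS
            return True
        return False

lock = LeaseLock()
assert lock.acquire("machineA", now=0.0)
assert not lock.acquire("machineB", now=10.0)   # lease still live
assert lock.acquire("machineB", now=31.0)       # machineA's lease expired
```

The `now` parameter exists only to make the timeout behavior testable without sleeping.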

33
Logging
  • Frangipani uses a write-ahead redo log for
    metadata
  • log records are kept on Petal
  • Data is written to Petal
  • on sync, fsync, or every 30 seconds
  • on lock revocation or when the log wraps
  • Each machine has a separate log
  • reduces contention
  • independent recovery
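The write-ahead discipline above reduces to one invariant: the redo record is made durable before the in-place metadata update. A toy sketch under that assumption (class and record format are illustrative, and the in-memory list stands in for a log region on Petal):

```python
# Minimal sketch of per-machine write-ahead redo logging for metadata.
# Crash recovery replays the log, which must therefore be idempotent.

class RedoLog:
    def __init__(self):
        self.records = []   # stand-in for this machine's log region on Petal

    def append(self, record):
        self.records.append(record)   # must be durable BEFORE the update

def apply(metadata, record):
    key, value = record
    metadata[key] = value             # idempotent: replaying is safe

log, metadata = RedoLog(), {}

# Normal operation: log first, then update in place.
rec = ("dir:/a", {"inode": 17})
log.append(rec)
apply(metadata, rec)

# Recovery on ANY machine: the log lives on Petal, so a surviving
# machine can replay a crashed machine's log from scratch.
recovered = {}
for rec in log.records:
    apply(recovered, rec)
assert recovered == metadata
```

Keeping one log per machine, as the slide notes, means recovery of one machine's log never contends with other machines' ongoing logging.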

34
Recovery
  • Recovery is initiated by the lock service
  • Recovery can be carried out on any machine
  • log is distributed and available via Petal

35
References
  • E. Lee and C. Thekkath, "Petal: Distributed
    Virtual Disks," Proceedings of the International
    Conference on Architectural Support for
    Programming Languages and Operating Systems
    (ASPLOS), 1996.
  • P. Sarkar, S. Uttamchandani, and K. Voruganti,
    "Storage over IP: When Does Hardware Support
    Help?" Proceedings of the 2nd USENIX Conference
    on File and Storage Technologies (FAST), 2003.
  • C. Thekkath, T. Mann, and E. Lee, "Frangipani: A
    Scalable Distributed File System," Proceedings of
    the 16th ACM Symposium on Operating Systems
    Principles (SOSP), pp. 224-237, October 1997.