Title: Surfing Technology Curves Steve Kleiman CTO Network Appliance Inc.
1 Surfing Technology Curves
Steve Kleiman, CTO, Network Appliance Inc.
2 Book Plug
- The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail, by Clayton M. Christensen
3 About NetApp
- Two product lines
- Network Attached File Servers (a.k.a. filers)
- Web proxy caches (NetCache)
- Founded in 1992
- >$1B revenue run rate
- >70% CAGR since founding
- >120% last year
4 Filers: Fast, Simple, Reliable and Multi-protocol
Systems compared: Sun E 3500/4500, HP-9000 N4000, NetApp 840
5 Filers: Fast, Simple, Reliable and Multi-protocol
- Disk management
- Filer finds disks and organizes into RAID groups and spares automatically
- Simple addition of storage
- Automatic RAID reconstruction
- Data management
- Snapshots
- SnapRestore
- SnapMirror
- Simple upgrade
- Small command set
6 Filers: Fast, Simple, Reliable and Multi-protocol
- Built-in RAID
- Easy hardware maintenance
- Hot plug disk, power, fans
- Low MTTR
- Cluster Failover
- Autosupport
- >99.995% measured field availability
7 Filers: Fast, Simple, Reliable and Multi-protocol
- NFS
- CIFS
- CIFS and NFS attributes
- HTTP
- FTP
- DAFS
- Internet Cache
- FTP
- Streaming media
8 Wave 1: Networks, Appliances and Software
9 Network and Storage Bandwidth
Year   Storage      Network    Penalty
1992   10 MB        0.1 MB     100-to-1
1994   20 MB        1 MB       20-to-1
1996   40 MB        10 MB      4-to-1
1998   100 MB       100 MB     1-to-1
2001   200-400 MB   1000 MB    0.2-to-1
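The Penalty column appears to be the ratio of storage bandwidth to network bandwidth. A minimal sketch that recomputes it from the table values follows; the row data is copied from the slide, while the names `rows`, `penalty` and `penalties` are illustrative. The 2001 storage figure is a range, so both ends are computed (the slide quotes the low end).

```python
# Rows from the slide: (year, (storage low, storage high) in MB, network in MB).
rows = [
    (1992, (10, 10), 0.1),
    (1994, (20, 20), 1),
    (1996, (40, 40), 10),
    (1998, (100, 100), 100),
    (2001, (200, 400), 1000),
]


def penalty(storage_mb, network_mb):
    """Storage-to-network bandwidth ratio (assumed meaning of 'Penalty')."""
    return storage_mb / network_mb


# year -> (penalty at low storage figure, penalty at high storage figure)
penalties = {
    year: (penalty(lo, net), penalty(hi, net)) for year, (lo, hi), net in rows
}

for year, (lo, hi) in penalties.items():
    label = f"{lo:g}-to-1" if lo == hi else f"{lo:g}-to-1 up to {hi:g}-to-1"
    print(year, label)
```

Under this reading the ratio falls below 1 in 2001: the network is no longer the slow side of the path to storage.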
10 The Appliance Revolution
1980s (General Purpose)
1990s (Appliance Based)
11 Appliance philosophy
- Appliance philosophy breeds focus
- External simplicity -> internal simplicity
- RISC argument
- Don't have to be all things to all people
- Limited compatibility constraints
- Interfaces are bits on wire
- Think different!
- Can innovate with both software and hardware
12 Filer Architecture
- Commercial off-the-shelf chips
- Any appropriate architecture
- i486 -> Pentium -> Alpha 064 -> Alpha 164 -> PIII
- Board level integration
- 1 or more CPUs (≤4)
- 1 or more PCI busses (≤4)
- High bandwidth switches
- Multiple memory banks
- Integrated I/O
- NVRAM
13 Roads Not Taken
- No unobtainium
- Minimalist infrastructure
- No special purpose busses
- No big MPs
- Motherboards only; no cache-coherent backplanes
- No functionally distributed computers
- No special purpose networks (e.g. HIPPI)
- No block access protocols
14 DataOnTap Architecture
Diagram layers: Daemons, Shells, Commands; Java Virtual Machine; Lib; WAFL; RAID; Disk; FCAL; TCP/IP; SCSI; VI NIC; VIPL; DAFS; SK
VI supported on FC (future: GbE, Infiniband)
15 DataOnTap
- Simple Kernel
- Message passing
- Non-preemptive
- Sample optimizations
- Checksum caching
- Suspend/Resume
- Cache hit pass through
16 WAFL: Write Anywhere File Layout
- Log-like write throughput
- No segment cleaning (LFS)
- Write data allocated to optimize RAID performance
- Delayed write allocation
- Active data is never overwritten (shadow paging)
- On-disk data is always consistent
- File system state is changed atomically
- Every 10 sec, by default
- Client modification requests are logged to NVRAM
- NVRAM log is replayed only on reboot
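The bullets above can be illustrated with a toy model, written here as a sketch under loud assumptions: the class `ToyWriteAnywhereFS` and all of its method names are hypothetical, blocks are Python dict entries, and old blocks are never reclaimed. It shows only the three ideas on the slide: new data always goes to fresh blocks, the on-disk state changes in one atomic step, and logged requests are replayed only after a reboot.

```python
class ToyWriteAnywhereFS:
    """Toy model of shadow paging plus an NVRAM-style request log."""

    def __init__(self):
        self.disk = {}        # block number -> data; blocks are never rewritten
        self.next_block = 0   # new data always lands in a fresh block
        self.root = {}        # committed on-disk tree: file name -> block number
        self.pending = {}     # in-memory changes since the last commit
        self.nvram_log = []   # client requests not yet reflected on disk

    def write(self, name, data):
        # Log the request, then buffer it; the on-disk image is untouched.
        self.nvram_log.append((name, data))
        self.pending[name] = data

    def read(self, name):
        if name in self.pending:
            return self.pending[name]
        return self.disk[self.root[name]]

    def commit(self):
        # Runs periodically (the slide says every 10 sec by default).
        new_root = dict(self.root)
        for name, data in self.pending.items():
            self.disk[self.next_block] = data   # write to a new block
            new_root[name] = self.next_block
            self.next_block += 1
        self.root = new_root       # the single atomic state change
        self.pending.clear()
        self.nvram_log.clear()     # logged requests are now on disk

    def crash_and_reboot(self):
        # Memory is lost; disk and NVRAM survive. Replay the log.
        log, self.nvram_log, self.pending = self.nvram_log, [], {}
        for name, data in log:
            self.write(name, data)


fs = ToyWriteAnywhereFS()
fs.write("a", "v1")
fs.commit()
fs.write("a", "v2")    # logged in NVRAM, not yet on disk
fs.crash_and_reboot()  # on-disk image still holds "v1"; replay restores "v2"
print(fs.read("a"))
```

In normal operation the log is simply discarded at each commit, which is why replay happens only on reboot.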
17 Wave 2: Memory-to-Memory Interconnects
18 Problem
- Remove single points of failure
- Without doubling hardware
- Minimizing performance overhead
- Without decreasing reliability
19 Clustered Failover Architecture
Network
Filer 2
Filer 1
ServerNet
Fibre Channel
Fibre Channel
20 Memory-to-Memory Interconnects
- Efficient transfer model
- Allows minimal overhead on receiver
- Scaleable Bandwidth
- High speed ASIC based switching
- Gigabit technology
- Open architecture
- PCI, not coherent bus interface
- Incorporate multiple technologies
- Relatively inexpensive
21 Mirroring NVRAM
- NVRAM is split into local and partner regions
- Data is assembled in NVRAM
- Data is DMAed from NVRAM to equivalent offset in remote node
- Client reply is sent when log entry DMA completes
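A rough sketch of the mirroring steps, under stated assumptions: `ToyFiler`, `log_request` and `partner_log` are hypothetical names, the "DMA" is a plain memory copy, and "equivalent offset" is read literally as the same byte offset in the partner's NVRAM (each node's local region is the other node's partner region).

```python
class ToyFiler:
    """Toy node whose NVRAM is split into a local and a partner region."""

    def __init__(self, name, local_base, nvram_size=64):
        self.name = name
        self.nvram = bytearray(nvram_size)
        self.region_size = nvram_size // 2
        self.local_base = local_base   # start of this node's own log region
        self.used = 0                  # bytes of local log in use
        self.partner = None

    def log_request(self, entry):
        if self.used + len(entry) > self.region_size:
            raise RuntimeError("local NVRAM region is full")
        start = self.local_base + self.used
        end = start + len(entry)
        self.nvram[start:end] = entry           # 1. assemble entry locally
        self.partner.nvram[start:end] = entry   # 2. "DMA" to the same offset
        self.used += len(entry)
        return "reply sent"                     # 3. reply only after the copy

    def partner_log(self):
        # What this node could replay on takeover. A real log would carry
        # its own length markers; the toy peeks at the partner's counter.
        p = self.partner
        return bytes(self.nvram[p.local_base:p.local_base + p.used])


filer1 = ToyFiler("filer1", local_base=0)
filer2 = ToyFiler("filer2", local_base=32)
filer1.partner, filer2.partner = filer2, filer1

filer1.log_request(b"write A")
filer2.log_request(b"write B")
print(filer2.partner_log())   # filer2 holds a copy of filer1's log
```

Because the copy lands at a fixed offset, the receiver does no per-entry work, which matches the "minimal overhead on receiver" point made for memory-to-memory interconnects.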
22 Leveraged Components
- Memory-to-Memory interconnects
- Low overhead, high-bandwidth, cheap
- WAFL
- Always consistent file system
- Built-in NVRAM logging/replay
- Fibre Channel disks
- Two independent ports
- Single function appliance software
- Simple, low-overhead failover
23 Wave 3: The Internet
24 The Consequences of Higher-speed Internet Access
- 200K-400K home cable head-end
- Requires 1.5-3 Gbps access capability
- 30% subscription rate, 20% online
- Minimum 128 Kbps BW
- Enterprise
- Remote sites still connected by slow links
- Require high-quality access to content
- Overloaded web servers
- ISP
- Require distribution and caching of large media files
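The 1.5-3 Gbps head-end figure can be reproduced with simple arithmetic, reading the slide's numbers as a 30% subscription rate, 20% of subscribers online at once, and 128 Kbps per online user. That reading, and the function name `head_end_demand_bps`, are assumptions made for this sketch.

```python
def head_end_demand_bps(homes, subscription_rate=0.30, online_fraction=0.20,
                        min_bw_bps=128_000):
    """Aggregate bandwidth if every online subscriber gets min_bw_bps."""
    online_users = homes * subscription_rate * online_fraction
    return online_users * min_bw_bps


for homes in (200_000, 400_000):
    gbps = head_end_demand_bps(homes) / 1e9
    print(f"{homes:,} homes -> {gbps:.2f} Gbps")
```

For 200K and 400K homes this gives roughly 1.5 and 3.1 Gbps, in line with the range on the slide.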
25 Yet Another Appliance
Cisco
NetApp
26 NetCache
- HTTP/FTP proxy cache appliance
- Highly deployable
- Forward and reverse proxy
- Transparency
- Filtering
- iCAP
- Enables value added services
- Virus scanning, transcoding, ad insertion,
- Stream splitting
- Stream caching
- Content distribution
27 Cacheable Content
Chart: cacheable content (y-axis) vs. time (x-axis)
28 Wave 4: The Death of Tapes
29 Using Tapes for Disaster Recovery
30 SnapMirror
- Remote asynchronous mirroring
- Continuous incremental update
- Only allocated blocks are transmitted
- Automatic resynchronization after disconnect
- Destination is always a consistent snapshot of source
Diagram label: WAN
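One way to picture the incremental update is as a difference between two snapshots. The sketch below is an illustration under assumptions, not the product's transfer protocol: snapshots are modeled as dicts of block number to data, and `incremental_update` and `apply_update` are hypothetical helpers. It relies on the property that active data is never overwritten, so a block number present in both snapshots holds the same data.

```python
def incremental_update(prev_snapshot, new_snapshot):
    """Return (blocks to send, block numbers no longer referenced)."""
    to_send = {b: d for b, d in new_snapshot.items() if b not in prev_snapshot}
    to_free = {b for b in prev_snapshot if b not in new_snapshot}
    return to_send, to_free


def apply_update(dest, to_send, to_free):
    """Build the new image on the side, so dest itself stays a consistent
    snapshot until the caller switches over to the returned image."""
    new_image = {b: d for b, d in dest.items() if b not in to_free}
    new_image.update(to_send)
    return new_image


# Source snapshots: file "b" was modified, so it moved to new block 3
# and a new root was written to block 4.
prev = {0: "root0", 1: "a1", 2: "b1"}
new = {1: "a1", 3: "b2", 4: "root1"}

to_send, to_free = incremental_update(prev, new)
destination = apply_update(dict(prev), to_send, to_free)
print(sorted(to_send), destination == new)
```

Only blocks 3 and 4 cross the WAN; block 1 is unchanged and is never retransmitted.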
31 Creating a Snapshot
Disk Blocks
32 WAFL Block Map File
- Multiple bits per 4KB block
- Column for allocated block in the active file system
- Columns for allocated blocks in snapshots
- Taking a Snapshot
- Copy root inode
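A small sketch of the block map idea: one bit column for the active file system and one per snapshot, with a block free only when every column is clear. The bit layout, the class `ToyBlockMap` and its methods are assumptions for illustration. In particular, the slide says a snapshot is taken by copying the root inode; the toy instead copies the active column eagerly so the effect on the map is visible.

```python
ACTIVE = 0  # bit 0 = active file system; bits 1 and up = snapshots (assumed)


class ToyBlockMap:
    def __init__(self, n_blocks):
        self.bits = [0] * n_blocks   # one small bit field per 4 KB block

    def allocate(self, block):
        self.bits[block] |= 1 << ACTIVE

    def release_from_active(self, block):
        self.bits[block] &= ~(1 << ACTIVE)

    def take_snapshot(self, snap_id):
        # snap_id must be >= 1. Every block in the active file system
        # now also belongs to this snapshot.
        for i, b in enumerate(self.bits):
            if b & (1 << ACTIVE):
                self.bits[i] |= 1 << snap_id

    def delete_snapshot(self, snap_id):
        for i in range(len(self.bits)):
            self.bits[i] &= ~(1 << snap_id)

    def is_free(self, block):
        # Reusable only when neither the active file system
        # nor any snapshot still references the block.
        return self.bits[block] == 0


bm = ToyBlockMap(4)
bm.allocate(0)
bm.allocate(1)
bm.take_snapshot(1)
bm.release_from_active(1)   # file deleted or rewritten elsewhere
print(bm.is_free(1))        # still held by snapshot 1
bm.delete_snapshot(1)
print(bm.is_free(1))
```

The snapshot costs no data copying: it only pins blocks that the active file system would otherwise release.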
33 Consistent Image Propagation
- Fast Network or Slow Modification Rate
- Slow Network or High Modification Rate
34 Wave 5: Local File Sharing and Virtual Interface Architecture
35 ISPs: Scalable Services
- Scalability
- Scale compute power and storage independently
- Resiliency
- Cost
- Commodity hardware and Open Systems standards
Load Balancing Switch
Application Servers
File Servers
Gigabit Switch
36 Database
- Better Manageability
- Offline backup with snapshots
- Replication
- Recovery from snapshots
- Easy storage management
- Equal or better performance
- Less retuning
37 Local File Sharing
- Geographically constrained
- 1 or 2 machine rooms
- Mostly homogeneous clients
- Can be large or small
- 1 - 100 machines
- Single administrative control
- High performance applications
- Web service, Cache
- Email, News
- Database, GIS
38 Local File Sharing Architecture Characteristics
- Applications tend to avoid OS
- e.g. No virtual memory
- Applications tend to have OS adaptation layer
- Different access protocol requirements
- e.g. high-performance locking, recovery, streaming
39 What is VI?
- Virtual Interface (VI) Architecture
- VI architecture organization
- Promoted by Intel, Compaq and Microsoft
- VI Developers Forum
- Standard capabilities
- Send/receive message, remote DMA read/write
- Multiple channels with send/completion queues
- Data transfer bypasses kernel
- Memory pre-registration
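The standard capabilities listed above can be mimicked with a toy model. This is only a sketch: `ToyVI` and its methods are hypothetical and are not the VIPL API, the "NIC" is a direct memory copy between two Python objects, and error handling is minimal. It shows pre-registered buffers, a posted-receive queue, a completion queue, and a remote DMA write that needs no receive to be posted.

```python
from collections import deque


class ToyVI:
    """Toy endpoint: registered memory, receive queue, completion queue."""

    def __init__(self):
        self.registered = {}        # handle -> pre-registered buffer
        self.recv_queue = deque()   # handles of posted receive buffers
        self.completions = deque()  # completed work, polled by the application
        self.peer = None
        self._next_handle = 0

    def register(self, size):
        handle = self._next_handle
        self._next_handle += 1
        self.registered[handle] = bytearray(size)
        return handle

    def post_recv(self, handle):
        self.recv_queue.append(handle)

    def post_send(self, handle, length):
        src = self.registered[handle]
        dst_handle = self.peer.recv_queue.popleft()  # peer must have posted one
        dst = self.peer.registered[dst_handle]
        if length > len(src) or length > len(dst):
            raise ValueError("message does not fit the buffers")
        dst[:length] = src[:length]   # straight into the application buffer
        self.completions.append(("send", handle))
        self.peer.completions.append(("recv", dst_handle, length))

    def rdma_write(self, handle, length, remote_handle, remote_offset=0):
        src = self.registered[handle]
        dst = self.peer.registered[remote_handle]
        if length > len(src) or remote_offset + length > len(dst):
            raise ValueError("write does not fit the remote buffer")
        dst[remote_offset:remote_offset + length] = src[:length]
        self.completions.append(("rdma_write", handle))


a, b = ToyVI(), ToyVI()
a.peer, b.peer = b, a
rx = b.register(16)
b.post_recv(rx)
tx = a.register(16)
a.registered[tx][:5] = b"hello"
a.post_send(tx, 5)
print(bytes(b.registered[rx][:5]), b.completions.popleft())
```

Nothing in the data path passes through a kernel-like intermediary: the sender copies from one registered buffer to another and both sides learn of it through their completion queues.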
40 VI Architecture
User
Kernel KVIPL client
Kernel
KVIPL Module
VI-compliant NIC driver
VI-compliant NIC
Hardware
41 VI-compliant implementations
- Fibre channel (FC-VI draft standard)
- e.g. Troika, Emulex
- Giganet
- Servernet II
- Infiniband
- Enables 1U MP heads
- Future VI over TCP/IP
42 How VI Improves Data Transfer
- No fragmentation, reassembly and realignment data copies
- No user/kernel boundary crossing
- No user/kernel data copies
- Data transfer direct to application buffers
43 Direct Access File System
Application
Buffers
File Access API
User
DAFS
VIPL API
VIPL
VI NIC Driver
Kernel
NIC
Hardware
VI Provider Layer specification maintained by the VI Developers Forum
44 DAFS Benefits
- File access protocol with implicit data sharing
- Direct application access
- File data transfers directly to application buffers
- Bypasses Operating System
- File semantics
- Optimized for high throughput and low latency
- Consistent high speed locking
- Graceful recovery/failover of clients and servers
- Fencing
- Enhanced data recovery
- Leverages VI for transport independence
45 DAFS vs. SAN
Wires
Direct (direct transfer to memory)
Network (TCP/IP)
Local Attached
SCSI over IP
Block
SAN
Protocols
NAS
DAFS
File
46 Summary
- Wave 1: Filers
- Technology: Fast networks, commodity servers
- Environment: Appliance-ization
- Wave 2: Failover
- Technology: Memory-to-memory interconnects, dual-ported FC disks
- Environment: 24x7 requirements
- Wave 3: NetCache
- Technology: Internet, HTTP
- Environment: High BW requirements, POP deployability
47 Summary
- Wave 4: SnapMirror
- Technology: Disk areal density, Fibre Channel, fast networks
- Environment: Cost of downtime for recovery
- Wave 5: DAFS
- Technology: VI architecture
- Environment: Local file sharing