Midterm 2: April 28th

Provided by: vishalt

1
Midterm 2: April 28th
  • Material
  • Query processing and Optimization, Chapters 12
    and 13 (ignore 12.5.5, 12.7, 13.4.4 and 13.5)
  • Transactions, Chapter 14
  • Concurrency Control, Chapter 15, ignore 15.7 to
    15.10
  • Recovery System, Chapter 16, ignore 16.8 and 16.9
  • Google File System
  • LRU-K, article by O'Neils and Weikum
  • Continuous Media, article by Ghandeharizadeh
    and Muntz (first 11 pages)
  • COSAR-CQN

2
Enterprise Data Management
  • Shahram Ghandeharizadeh
  • Computer Science Department
  • University of Southern California

3
Challenge: Managing Data is Expensive
  • Cost of managing data is $100K/TB/year.
  • Downtime is estimated at thousands of dollars
    per minute.
  • Loss of data results in lost productivity.
  • 20 Megabytes of accounting data requires 21 days
    and costs $19K to reproduce.
  • 50% of companies that lose their data due to a
    disaster never re-open; 90% go out of business
    within 2 years!

4
Centralize Management of Storage
  • Before
  • Data stored locally.
  • After
  • Data stored across the network at a central
    location.

5
Centralize Management of Storage
  • Advantages
  • Many clients share storage and data; data
    remains available when a client fails.

6
Centralize Management of Storage
  • Advantages
  • Many clients share storage and data.
  • Redundancy is implemented in one place protecting
    all clients from disk failure.

7
Centralize Management of Storage
  • Advantages
  • Many clients share storage and data.
  • Redundancy is implemented in one place protecting
    all clients from disk failure.
  • Centralized backup: the administrator does not
    need to know or care how many clients on the
    network share the storage.

8
Centralize Management of Storage
  • Advantages
  • Many clients share storage and data.
  • Redundancy is implemented in one place protecting
    all clients from disk failure.
  • Centralized backup: the administrator does not
    need to know or care how many clients on the
    network share the storage.

(Figure labels: Data Sharing, High Availability, Network, Data Backup)

11
Network failures
  • What about network failures?
  • Use two host bus adapters per server,
  • Connect each adapter to a different switch.
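
The redundancy argument above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical topology description in which each server name maps to the switches its host bus adapters are cabled to; the function names and the server and switch names are illustrative, not part of any SAN product.

```python
def reachable_after_failure(server_links, failed_switch):
    """Return the servers that still reach storage after one switch fails.

    server_links maps a server name to the switches its adapters connect to
    (a hypothetical description of the cabling).
    """
    return {
        server
        for server, switches in server_links.items()
        if set(switches) - {failed_switch}  # at least one surviving path
    }


def survives_any_single_switch_failure(server_links):
    """True if every server keeps a path no matter which one switch fails."""
    all_switches = {sw for switches in server_links.values() for sw in switches}
    return all(
        reachable_after_failure(server_links, sw) == set(server_links)
        for sw in all_switches
    )


# Two adapters per server, each on a different switch.
dual_path = {"db1": ["switch_a", "switch_b"], "db2": ["switch_a", "switch_b"]}
# db1 has a single adapter, so switch_a is a single point of failure for it.
single_path = {"db1": ["switch_a"], "db2": ["switch_a", "switch_b"]}
```

With `dual_path`, losing either switch leaves both servers connected; with `single_path`, losing `switch_a` cuts off `db1`.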

12
Centralize Management of Storage
  • Storage Area Network (SAN)
  • Block level access,
  • Write to storage is immediate,
  • Specialized hardware including switches, host bus
    adapters, disk chassis, battery backed caches,
    etc.
  • Expensive
  • Supports transaction processing systems.
  • Network Attached Storage (NAS)
  • File level access,
  • Write to storage might be delayed,
  • Generic hardware,
  • Inexpensive,
  • Not appropriate for transaction processing
    systems.

13
Storage Area Network
  • Centralize management of storage
  • Storage Area Networks (SANs),
  • Redundancy in data to tolerate disk failures,
  • Regular backup,
  • Disaster recovery.

14
Concepts and Terminology
  • Virtualization
  • Available storage is represented as one HUGE disk
    drive, e.g., a SAN with a thousand 1.5 TB disks
    provides about 1.5 Petabytes of raw storage,
  • Available storage is partitioned into Logical
    Unit Numbers (LUNs),
  • A LUN is presented to one or more servers,
  • A LUN appears as a disk drive to a server.
  • SAN places blocks across physical disks
    intelligently to balance load.
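
As a rough illustration of how a virtualized volume can spread load, here is a minimal sketch that assumes plain round-robin striping. The slides do not say which placement policy a SAN uses, so the policy and the function names below are assumptions made for illustration only.

```python
from collections import Counter


def raw_capacity_tb(num_disks, disk_size_tb):
    """Raw capacity of the pooled disks, before any redundancy overhead."""
    return num_disks * disk_size_tb


def place_block(logical_block, num_disks):
    """Map a logical block to (disk index, block offset on that disk).

    Round-robin striping is an assumed policy: consecutive logical blocks
    land on consecutive disks, so sequential access touches every disk.
    """
    return logical_block % num_disks, logical_block // num_disks


# Count how many of the first 1000 logical blocks land on each of 4 disks.
load = Counter(place_block(b, 4)[0] for b in range(1000))
```

Under this policy each of the 4 disks receives 250 of the 1000 blocks, and `raw_capacity_tb(1000, 1.5)` gives 1500 TB for the pooled disks.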

15
Question
  • Is it possible to present the same LUN to two
    different servers simultaneously?

16
Question
  • Is it possible to present the same LUN to two
    different servers simultaneously? YES!
  • Can two different servers read and write the
    files stored on the presented LUN?

17
Question
  • Is it possible to present the same LUN to two
    different servers simultaneously? YES!
  • Can two different servers read and write the
    files stored on the presented LUN? Yes!
  • What are the consequences?
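
One possible consequence, sketched under the assumption that the two servers do not coordinate their writes: each server caches the same block, updates its private copy, and writes it back, so the later write silently discards the earlier one. The function names and values are hypothetical.

```python
def uncoordinated_updates(initial, delta_a, delta_b):
    """Two servers update the same block without any coordination."""
    block = initial
    copy_a = block            # server A reads the block into its cache
    copy_b = block            # server B reads the same block
    block = copy_a + delta_a  # A writes its update back
    block = copy_b + delta_b  # B writes back, overwriting A's update
    return block


def serialized_updates(initial, delta_a, delta_b):
    """The same updates applied one after the other, e.g. under a lock."""
    block = initial
    block = block + delta_a   # A reads, updates, and writes
    block = block + delta_b   # B then reads A's result and updates it
    return block
```

Starting from 100 with updates of +10 and +20, the uncoordinated version ends at 120 while the serialized version ends at 130.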

18
Concepts: Backup
  • Snapshot: State of a LUN at one instant in
    time.
  • Copy-on-write
  • A snapshot consists of the original blocks of a
    LUN,
  • Every time an application writes a block, the SAN
    generates a new copy for the current LUN (the
    snapshot maintains the original),
  • Advantage: copies of blocks in support of backup
    are generated on demand.

19
Copy-on-Write
  • Original LUN and snapshot taken at midnight,
    Sunday morning.

(Figure: the original LUN and its snapshot, blocks 1 through 7.)

20
Copy-on-Write
  • Original LUN and snapshot taken at midnight,
    Sunday morning.
  • A write to block 5 gives the current LUN a new
    block 5, while the snapshot keeps the old block 5.
  • As blocks are written, the physical blocks of the
    snapshot materialize.

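(Figure: the current LUN with blocks 1 through 4, the newly written block 5, and blocks 6 and 7; the snapshot keeps "Old 5".)
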
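
The copy-on-write behavior on the last two slides can be sketched as a toy model. This is a minimal illustration that assumes blocks are held in a Python dictionary; the class and method names are hypothetical and do not correspond to any SAN interface.

```python
class Lun:
    """Toy LUN whose snapshots materialize lazily (copy-on-write)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> current contents
        self.snapshots = []         # per snapshot: preserved original blocks

    def take_snapshot(self):
        # Taking a snapshot copies nothing; it only starts tracking writes.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block_no, data):
        # Before the first overwrite of a block, save its original contents
        # in every snapshot that has not preserved that block yet.
        for preserved in self.snapshots:
            if block_no not in preserved:
                preserved[block_no] = self.blocks.get(block_no)
        self.blocks[block_no] = data

    def read_snapshot(self, snap_id, block_no):
        # Unwritten blocks are still shared with the current LUN.
        preserved = self.snapshots[snap_id]
        if block_no in preserved:
            return preserved[block_no]
        return self.blocks.get(block_no)


# Blocks 1 through 7, snapshot at midnight, then a write to block 5.
lun = Lun({n: f"block {n}" for n in range(1, 8)})
midnight = lun.take_snapshot()
lun.write(5, "new block 5")
```

After the write, the current LUN returns the new block 5, the snapshot still returns the old block 5, and only that one block has been physically copied.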
21
Hot Standby
  • An inexpensive server that is maintained on the
    side to assume responsibility for a failed
    server.
  • Goal: Minimize downtime.

22
Summary
  • SAN and NAS are shared-disk architectures,
  • SAN is appropriate for transaction processing
    systems,
  • Hardware alone is not a substitute for a
    parallel, high performance transaction processing
    system, e.g., Teradata, Oracle RAC, etc.