Title: Information Resources Management
1Information Resources Management
2Agenda
- Administrivia
- Physical Database Design
- Database Integrity
- Performance
3Administrivia
4Regrade Requests Exam SQL
- Create Database
- Enter query(s) as submitted
- Submit to me
- Database (electronic)
- Graded homework (paper)
- Reserve the right to change test data and
reexecute query
5Foreign Keys
- Inserts require all FK values be the value of a
primary key in the reference table - Update and delete constraints are also possible
6Referential Integrity
- ON DELETE CASCADE/RESTRICT/SET NULL
- ON UPDATE CASCADE/RESTRICT/SET NULL
- Default
- ON DELETE RESTRICT
- ON UPDATE CASCADE
7Example
- CREATE TABLE PCAccess
- (PC INTEGER,
- EmpID CHAR(9),
- AccessType CHAR(15),
- PRIMARY KEY (PC, EmpID),
- FOREIGN KEY (EmpID) REFERENCES (Employee),
- FOREIGN KEY (PC) REFERENCES (PC))
8Example - PCAccess Table
9Example 1
INSERT INTO PCAccess (PC) VALUES (4)
10Example 2
INSERT INTO PCAccess (PC, EmpID) VALUES (4,5)
11Example 3
UPDATE Employee SET EmpID 10 WHERE EmpID 1
12Example 4
UPDATE PCAccess SET EmpID 10 WHERE EmpID 1
13Example 5
DELETE FROM Employee WHERE EmpID 2
14Example 6
DELETE FROM PCAccess WHERE EmpID 2
15Example 7
DELETE FROM PC WHERE PC 3
16Cascading
- Chain followed until the end
- Especially for deletes
- If mix of CASCADE, RESTRICT, SET NULL
- Will get all or nothing
17Update Delete Constraints
- CREATE TABLE T1
- (A CHAR(5)
- B CHAR(5)
- PRIMARY KEY (A,B)
- FOREIGN KEY (A) REFERENCES (T2)
- ON DELETE RESTRICT
- ON UPDATE RESTRICT)
- CREATE TABLE T2
- (C CHAR(5)
- D VARCHAR(30) PRIMARY KEY (C))
18Constraints
T1
T2
Want to update the value of X1 to be X11. What
has to happen?
19Performance
- Requires Knowledge of
- DBMS
- Applications
- Data
- Users Expectations
- Environment
20Performance Classes
- OLTP
- On-Line Transaction Processing
- OLAP
- On-Line Analytic Processing
- Mix of OLTP and OLAP
21OLTP
- Throughput Driven
- Throughput - number of transactions per unit of
time - Lots of Transactions
- Mix of Update and Query
- High Concurrency
22OLAP
- Response Time Driven
- Response Time - single transaction
- Very Large, Possibly Complex, Transactions
- Query Evaluation and Optimization
23Performance Tuning
- Consider the Mix of OLTP OLAP
- Interference Between Types
- Example
- Single daily large analytic transaction, rest
simple transactions, locking could prevent others
from running.
24Tuning Levels
- DBMS
- Hardware
- Design
- Interactions Between Levels
25DBMS Parameter Tuning
- Specific to DBMS
- Buffers - Buffer Pool
- Logging - Checkpoints
- Lock Management
- Space Allocation - Log, Data, Freespace
- Thread Management
- Operating System Tuning
26Hardware Tuning
- Memory
- CPU
- Disk
- RAID
- Number of Drives
- Partitioning
- Architecture -- Parallel Systems?
27RAID
- Redundant Array of Inexpensive Disks
- Appears as single disk
- Physical storage difference - no database
differences - Increase performance
- Provide recovery from disk failure
28Negative Effect of RAID
- MTBF (mean time between failures)
- Increase by factor of drives used
29How RAID Works
- Striping - dividing equally across all disks
1
2
3
4
1
2
3
4
Stripe 1
5
6
7
8
Stripe 2
9
10
11
12
Stripe 3
Stripe n
30RAID Levels
- RAID-0
- RAID-1
- RAID-2
- RAID-3
- RAID-4
- RAID-5
- RAID-6
31RAID-0
- All disks store unique data
- Very fast
- No fault tolerance or recovery
32RAID-1
- Fully Redundant
- Faster Reads/Slower Writes
- High fault tolerance -- easy recovery
33RAID-2
- Each record spans all drives
- Some disks store ECC (error correction codes)
- Parity checks allow error detection and correction
34RAID-3
- Each record spans all drives
- One disk stores ECC
- Single-User
35RAID-4
- Each record stored on a single disk
- One drive for ECC
- Multi-user reads Single-User writes
36RAID-5 (Rotating Parity Array)
- Drive has both data and ECC
- ECCs rotate to different drive
- Multi-user reads and writes
1
2
3
4
1
2
3
ECC
ECC
4
5
6
ECC
6
7
8
ECC
11
10
9
ECC
12
13
14
37RAID-6 PQ Redundancy
- P - parity
- Q - extra parity
- 2 bits of ECC per 4 bits of data
- Handles multiple disk failures
- Reed-Solomon codes
- Introduction to the Theory of Error-Correcting
Codes, Pless (1989)
38Your Mileage May Vary
We note that numerous improvements have been
proposed to the basic RAID schemes described
here. As a result, sometimes there is confusion
about the exact definitions of the different RAID
levels.
39RAID Usage
- 1, 3, and 5 outperform others
- RAID-1 - fastest, no storage cost, but not fault
tolerant - RAID-3 - single-user only
- RAID-5 - higher speed than single disk, fault
tolerant, multi-user, but some storage cost and
slower write times
40Design Tuning
- Transactions
- Physical Database
41Transaction Tuning
- The DBMS optimizes so why worry?
- An optimized poorly written transaction can
always be outperformed by a well-written
nonoptimized one. - EXPLAIN (DB2)
- What did the optimizer come up with?
42Transaction Tuning
- Distributed Databases
- Client-Server
- Network performance becomes an additional concern
43Transaction Tuning
- DBA participation in program reviews and
walkthroughs - Continuous Monitoring
44Transaction Tuning Heuristics
- Single query instead of multiple queries
- multiple includes sub-queries
- Avoid long-running transactions
- Avoid large quantities of updates
- Locking and logging
- Reduce number of tables joined
45Transaction Tuning Heuristics
- Reduce sorting
- Return less data rather than more
- Dont shift logic from query to program
- Optimizer is likely to be faster
- Less data is returned
46Transaction Tuning Example
- Get the names of all managers whose offices have
property listed in Pgh. - SELECT
- FROM Property as P, Office as O, Manager as M,
Employee as E - WHERE P.OfficeNbr O.OfficeNbr AND O.OfficeNbr
M.OfficeNbr AND M.EmpID E.EmpID AND - PropertyID IN (SELECT PropertyID FROM Property
as P2 WHERE P2.City Pgh AND P2.OfficeNbr
M.OfficeNbr)
47Transaction Tuning Example
- SELECT
- FROM Property as P, Office as O, Manager as M,
Employee as E - WHERE P.OfficeNbr O.OfficeNbr AND O.OfficeNbr
M.OfficeNbr AND M.EmpID E.EmpID AND - PropertyID IN (SELECT PropertyID FROM Property
as P2 WHERE P2.City Pgh AND P2.OfficeNbr
M.OfficeNbr) - More is selected than is needed.
48Transaction Tuning Example
- SELECT
- FROM Property as P, Office as O, Manager as M,
Employee as E - WHERE P.OfficeNbr O.OfficeNbr AND O.OfficeNbr
M.OfficeNbr AND M.EmpID E.EmpID AND - PropertyID IN (SELECT PropertyID FROM Property
as P2 WHERE P2.City Pgh AND P2.OfficeNbr
M.OfficeNbr) - Some joined tables can be eliminated.
49Transaction Tuning Example
- SELECT
- FROM Property as P, Office as O, Manager as M,
Employee as E - WHERE P.OfficeNbr O.OfficeNbr AND O.OfficeNbr
M.OfficeNbr AND M.EmpID E.EmpID AND - PropertyID IN (SELECT PropertyID FROM Property
as P2 WHERE P2.City Pgh AND P2.OfficeNbr
M.OfficeNbr) - Subquery is executed once per office.
50Transaction Tuning Example
- SELECT E.Name
- FROM Employee as E
- WHERE E.MgrFlag 1 AND OfficeNbr IN
- (SELECT OfficeNbr FROM Property as P
- WHERE P.City Pgh)
- Version without any joins - 2 single table
queries only.
51Transaction Tuning Example
- SELECT DISTINCT E.Name
- FROM Property as P, Employee as E
- WHERE P.OfficeNbr E.OfficeNbr AND E.MgrFlag
1 AND - P.City Pgh
- Single query with join.
52Transaction Tuning
- Explain (or similar tool) can help to identify
how each transaction will access the data and
what temporary tables will have to be created to
execute the query - With multiple options, test them
- Order of conditions in WHERE can affect the
optimization and performance - I.E., put MgrFlag 1 first
53Physical Database Tuning
- Indices
- Schema Tuning
- Retaining Normalization
- Denormalization
54Indices
- Unique
- Nonunique
- Single Attribute
- Multiple Attributes
- (concatenated or composite key)
- Primary Key
- Secondary Index
55Additional Indices
- Index decreases read time but increases update
time - Based on queries - even single query
- (EXPLAIN)
- Indices need reorganization
- Inserts, Updates, Deletes
- Specify freespace
- Reduce frequency of reorganizations
56Schema Tuning - Staying Normal
- Split Tables - Vertical Partitioning
- Highly used vs. infrequently used columns
- Dont partition if result will be more joins
- Keys are duplicated
57Schema Tuning -Staying Normal
- Variable length fields (VARCHAR, others)
- Indeterminant record lengths
- Row locations vary
- Vertically partition row into two tables, one
with fixed and one with variable columns
58Schema Tuning -Leaving Normal
- Normalization
- Eliminates duplication
- Reduces anomalies
- Does not result in efficiency
- Denormalize for performance
59Denormalization Warnings
- Increases chance of errors or inconsistencies
- May result in reprogramming if business rules
change - Optimizes based on current transaction mix
- Increases duplication and space required
- Increases programming complexity
- Always normalize first then denormalize
60Denormalization
- Partition Rows
- Combine Tables
- Combine and Partition
- Replicate Data
61Combining Opportunities
- One-to-one (optional)
- allow nulls
- Many-to-many (assoc. entity)
- 2 tables instead of 3
- Reference data (one-to-many)
- one not use elsewhere
- few of many
62Combining Examples
- Employee-Spouse (name and SSN only)
- Owner-PctOwned - Property
- few owners with multiple properties
- Property-Type (description)
- one type per property
63Partitioning
- Horizontal
- By row type
- Separate processing by type
- Supertype/subtype decision
- Vertical (already seen)
- Both
64Replication
- Intentionally repeating data
- Example Owner-PctOwned-Property
- Owner includes PctOwned PropertyID
- Property includes majority OwnerSSN and PctOwned
65Performance Tuning
- Not a one-time event
- Monitoring probably more important
- Things change
- applications, database (table) sizes, data
characteristics - hardware, operating system, DBMS