Title: Administering SQL Server 2000: VLDB Best Practices
1. Database Indexes, BLOBs, and VLDB Best Practices
Bernt Castman, KTH Kista
Ryan Trout, Microsoft Corporation
2. What Is an Index?
- A specially structured data object that makes data easier to access
- Stored together with the data, or separately
- Normally maintained by the system (full-text indexes excepted)
- Without an index, all data access must go through table scans
3. What Is an Index? (cont.)
- Using an index can reduce the number of I/O operations, speed up other operations as well, and reduce the number of blocks and table locks
- If a query accesses more than 20% of the rows in a table, a table scan is better than an index. This also holds when many rows share the same value in the index column
4. Relationship Between an Index and a Table
5. Where Indexes Help Most
- Queries that specify a narrow search criterion, so the answer is only a few rows
- Queries that specify a range of values, so the answer is only a few rows
- Lookups within a JOIN; columns frequently used in JOINs are good index candidates
- Lookups that want data in a specific order without having to sort; with an index on (last name, first name), sorted output does not require ORDER BY
6. Where Indexes Help Most (cont.)
- Index in moderation
- Do not index small tables
- Use "covering queries," where the desired answer is contained entirely within the index pages (more on this later)
7. How Are Indexes Organized?
- Indexes are usually built as tree structures
- The root node is the start of the index
  - It contains a number of index rows
  - These rows contain keys and a pointer to an index page
- The tree can have several index levels
8. Index Updates
- The index key is the column or columns of a table used to generate the index
- Indexes are updated continuously
- Inserting a new data row into the table, for example, may require an extensive change to the index tree
- Because of the index's structure (fixed length, etc.), updates are fast even when space must be created for a new index page
- Creating that space is called a page split, and it often cascades through the rest of the tree
- Page splits can be reduced by balancing the index with a tuned fill factor
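A tuned fill factor is set when the index is created (or rebuilt). A minimal sketch in T-SQL; the Customers table and LastName column are illustrative names, not from the deck:

```sql
-- Leave 20% free space on each leaf page so new rows can land on
-- existing pages instead of forcing page splits.
CREATE INDEX IX_Customers_LastName
ON Customers (LastName)
WITH FILLFACTOR = 80
```

A lower fill factor trades read density (more pages to scan) for fewer page splits during inserts; the right value depends on the table's insert rate.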
9. Simple Index
- Defined on a single column of a table
- Referenced through the SQL WHERE clause
- If the chosen column holds the same value in many rows (e.g., "state"), the index will not be particularly effective
10. Composite Index
- A composite index is defined on more than one column, e.g., state, street, city
- The data can be addressed using different indexes
- May consist of at most 16 columns with at most 900 bytes of key data
- Queries against a composite index do not need to place every index key in the WHERE clause
11. Example
- An index is created on columns a, b, and c of a table
- It can be searched with SELECT ... WHERE
  - Equality on a AND b AND c is fastest
  - Equality on a AND b is next
  - Equality on a alone always gives an index scan
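The leading-column rule above can be sketched as follows (the table T and columns a, b, c are illustrative):

```sql
CREATE INDEX IX_T_abc ON T (a, b, c)

-- Fastest: all three leading keys are constrained
SELECT * FROM T WHERE a = 1 AND b = 2 AND c = 3

-- Still a targeted lookup on the (a, b) prefix
SELECT * FROM T WHERE a = 1 AND b = 2

-- Only the first key: the index is scanned over the range of a
SELECT * FROM T WHERE a = 1

-- No leading key: the index on (a, b, c) cannot be used to seek
SELECT * FROM T WHERE b = 2
```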
12. Index Scan in the Real World: A Customer Location Table
- Suppose we have a table that contains information about the location of customers of our business. An index is created on the columns state, county, and city and is stored in the tree structure in the following order: state, county, city.
- If a query specifies the state column value as Texas in the WHERE clause, the index will be used. Because values for the county and city columns are not given in the query, the index will return a number of rows based on all of the index records that contain Texas as the state column value.
- An index scan is used to retrieve a range of index pages and then a number of data pages based on values in the state column.
- The index pages are read sequentially in the same way that a table scan accesses data pages.
13. Index Uniqueness
- A uniquely defined index is always unique. If the column the index covers is unique, the index will be unique
- Fast and efficient, since only one additional I/O is needed to find the data. An attempt to insert a duplicate key value produces an error
- A non-unique index works the same way, but here duplicate index entries with the same value in the data column are allowed
- Used when a unique index cannot be created
14. Clustered Indexes
- A clustered index stores the actual row data in its own leaf nodes
  - As soon as the right entry is found, the data can be read without further I/O
  - Higher performance
  - All data is kept in index-sorted order
  - With a clustered index on state/county/city, a search on state also returns rows sorted by county and city
- Reaching other data in the table requires one additional access
- Only one clustered index per table
15. Nonclustered Indexes
- A nonclustered index points to the data in a table
- At most 249 nonclustered indexes per table
- Indexes can be created via the Create Index Wizard
- In SQL:
  CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED]
  INDEX index_name ON table_name
  (column_name [, column_name, ...])
  [WITH options] [ON filegroup_name]
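Filling in the syntax skeleton above, a minimal sketch (the CustomerLocation table and its columns are assumed names):

```sql
-- One clustered index keeps the rows in (state, county, city) order...
CREATE CLUSTERED INDEX IX_Location
ON CustomerLocation (state, county, city)

-- ...and up to 249 nonclustered indexes can point into the data.
CREATE UNIQUE NONCLUSTERED INDEX IX_CustomerId
ON CustomerLocation (customerId)
```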
16. Full-Text Indexes
- More like a catalog; not a tree structure
- A full-text index must contain a column that uniquely identifies each row
- One full-text index per table
- Not updated automatically by the system
- Often more than a text-search engine: phrases, groups of words, ...
17. Examples
- SELECT * FROM Customers WHERE ContactName LIKE 'PETE%'
- SELECT * FROM Customers WHERE CONTAINS(ContactName, ' "PETE*" ')
- The CONTAINS predicate can find text strings that contain the desired string, such as "PETER" or "PETEY," by using the full-text index.
- SELECT CategoryName FROM Categories WHERE FREETEXT(Description, 'Sweets candy bread')
- This query might find category names containing such words as "sweetened," "candied," or "breads."
18. Hints
- When the Query Optimizer generates a query execution plan, it chooses an index based on which index will provide the best performance: usually the index that will use the fewest I/O operations and retrieve the fewest rows.
- Although Query Optimizer usually chooses the most efficient query execution plan and access path for your query, you might be able to do better if you know more about your data than Query Optimizer does. For example, suppose you want to retrieve data about a person named "Smith" from a table with a column listing last names. Index statistics generalize based on a column. Suppose the statistics show that each last name appears three times on average in the column. This information provides fairly good selectivity; however, you know that the name "Smith" appears much more often than average. If you have an idea about how to better conduct an SQL operation, you can use a hint. A hint is simply advice you give to Query Optimizer specifying that it should not make the automatic choice.
19. Hints (cont.)
- Table scan: In some cases, you might decide that a table scan would be more efficient than an index lookup or an index scan. A table scan is more efficient when the index scan would retrieve more than 20 percent of the rows in the table, such as when 70 percent of the data is highly selectable and the other 30 percent is "Smith."
- Which index to use: You can specify a particular index to be the only index considered. You might not know which index SQL Server Query Optimizer would choose without your hint, but you feel the hinted index will perform best.
- Which group of indexes to select from: You can suggest several indexes to Query Optimizer, and it will use all of them (ignoring duplicates). This option is useful when you know that a set of indexes will work well.
- Locking method: You can tell Query Optimizer which type of lock to use when it is accessing the data from a particular table. If you feel that the wrong type of lock might be chosen for this table, you can specify that Query Optimizer should use a row lock, a page lock, or a table lock.
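In T-SQL these hints appear as table hints in the FROM clause. A sketch, assuming a hypothetical LastNames table with an index IX_LastName:

```sql
-- Force a table scan when we know "Smith" is far more common
-- than the index statistics suggest (INDEX(0) means scan)
SELECT * FROM LastNames WITH (INDEX(0))
WHERE surname = 'Smith'

-- Force a specific index to be the only one considered...
SELECT * FROM LastNames WITH (INDEX(IX_LastName))
WHERE surname = 'Rarename'

-- ...or specify the locking granularity for this table
SELECT * FROM LastNames WITH (TABLOCK)
WHERE surname = 'Smith'
```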
20. Hash Files
- The rows of a table are not stored sequentially on disk
- A hash function computes the address based on one or more columns of the row
- How do you guarantee a unique address?
- Prerequisite: you know something about which values can appear in a column
- An overflow area is needed
21. Example: Swedish Personal Identity Numbers
- Year: all values 1800-2001 are possible
- Month: 1-12
- Day: 1-31
- Serial number: 0001-9999
- What does a corporate registration number look like? 556380-4524; that is, the month-and-day pattern breaks the hash formula
- Solution: an overflow area; when the hash computation yields the same index as another row, the rows are stored there sequentially
22. Example
- A hash cluster (Oracle syntax) can be declared as:
  CREATE CLUSTER PropertyCluster
  (propertyNo VARCHAR2(5))
  HASH IS propertyNo HASHKEYS 300000
- Once the cluster has been created, we can create a table in it, e.g.:
  CREATE TABLE PropertyForRent
  (propertyNo VARCHAR2(5) PRIMARY KEY, .....)
  CLUSTER PropertyCluster (propertyNo)
23. Using Bulk Copy Program (BCP)
- BCP is an external program provided with Microsoft SQL Server 2000 to facilitate the loading of data files into a database. BCP can also be used to copy data out of a SQL Server table into a data file.
- Using the BULK INSERT command: The BULK INSERT Transact-SQL (T-SQL) command lets you copy large quantities of data from a data file into a SQL Server table, from within SQL Server. Because this command is an SQL statement (run within ISQL, OSQL, or Query Analyzer), the process runs as a SQL Server thread. You cannot use this command to copy data from SQL Server into a data file.
- Using Data Transformation Services (DTS): DTS is a set of tools provided with SQL Server that makes copying data into and out of SQL Server an easy task. DTS includes a wizard for importing data and a wizard for exporting data.
- Recovering a database from a backup file can also be considered a form of database loading.
- Although staging tables do not provide a method of loading data, they are commonly used in database loading.
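A minimal BULK INSERT sketch; the Sales table and file path are illustrative:

```sql
-- Load a comma-delimited file into an existing table,
-- taking a table lock for maximum load speed.
BULK INSERT Sales
FROM 'C:\load\sales.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK)
```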
24. User Table Data (DB2)
- By default, table data is stored on 4 KB pages. Each page (regardless of page size) contains 76 bytes of overhead for the database manager. This leaves 4020 bytes to hold user data (or rows), although no row on a 4 KB page can exceed 4005 bytes in length. A row will not span multiple pages. You can have a maximum of 500 columns when using a 4 KB page size.
25. Long Field Data (DB2)
- Long field data is stored in a separate table object that is structured differently from other data types.
- Data is stored in 32 KB areas that are broken up into segments whose sizes are "powers of two" times 512 bytes. (Hence these segments can be 512 bytes, 1024 bytes, 2048 bytes, and so on, up to 32 700 bytes.)
- Long field data types (LONG VARCHAR or LONG VARGRAPHIC) are stored in a way that enables free space to be reclaimed easily. Allocation and free space information is stored in 4 KB allocation pages, which appear infrequently throughout the object.
- The amount of unused space in the object depends on the size of the long field data, and whether this size is relatively constant across all occurrences of the data. For data entries larger than 255 bytes, this unused space can be up to 50 percent of the size of the long field data.
- If character data is less than the page size, and it fits into the record along with the rest of the data, the CHAR, GRAPHIC, VARCHAR, or VARGRAPHIC data types should be used instead of LONG VARCHAR or LONG VARGRAPHIC.
26. LOB Data Objects (DB2)
- Data is stored in 64 MB areas that are broken up into segments whose sizes are "powers of two" times 1024 bytes. (Hence these segments can be 1024 bytes, 2048 bytes, 4096 bytes, and so on, up to 64 MB.)
- To reduce the amount of disk space used by LOB data, you can specify the COMPACT option on the lob-options clause of the CREATE TABLE and the ALTER TABLE statements. The COMPACT option minimizes the amount of disk space required by allowing the LOB data to be split into smaller segments. This process does not involve data compression, but simply uses the minimum amount of space, to the nearest 1 KB boundary. Using the COMPACT option may result in reduced performance when appending to LOB values.
- The amount of free space contained in LOB data objects is influenced by the amount of update and delete activity, as well as the size of the LOB values being inserted.
27. Logging for BLOBs
- NOLOG
- Set (right-click) at definition time for the objects (columns) concerned
- Does not affect other objects (columns) in the table where the data is used
28. Windows DNA 2000: Next Generation Web Application Platform
(Architecture diagram)
- Client Tier: Browser, Rich Client, Devices
- Application Logic Tier: ASP/COM Application; Application Services: IIS/ASP, COM/MTS, MSMQ, ADO/OLE DB Data Access, Active Directory, XML, Commerce
- Data Tier: SQL Server; Data Access and Storage
- Tiers connected over the Internet (TCP/IP, HTTP, HTML, XML)
29. SQL Server 2000: VLDB
- Higher Availability
- Lower Administration Costs
- Larger Databases
- Simplified Planning
- Higher Performance
- Smooth Operations
- Based on customer feedback: Keep the Ideas Coming!
30. Overview
VLDB Techniques: Performance, Architecture, Features
- Backup and Recovery
- Index Maintenance
- Loading and Resizing
- Verification
31. Backup and Recovery
- Recovery Models
- Recovery Models and Backup
- Backup Enhancements
- Snapshot Backups
32. Log Basics
- Conceptually an infinite stream of log records
- Log files on disk provide a circular cache
- Optimized for sequential writes
- Not an audit trail
- A 10-byte Log Sequence Number (LSN) identifies the location and sequence of a log record; monotonically increasing
33. Page Structure
- Header: LSN, ID, record count, ...
  - The LSN identifies the log record corresponding to the last page modification
- Data records
- Offset table
(Diagram: a page holding Record 1, Record 3, Record 2, addressed through the offset table)
34. Transaction Log Example 1
(Diagram: log records for a transaction that is rolled back, then done)
35. Transaction Log Example 2
(Diagram: record-level locking; each log record is identified by its LSN)
36. Checkpoint
- Anchor in the log for recovery
- Lists active transactions
- Flushes hot and old dirty pages to disk
- Limits recovery time
- Allows the log to be reused
37. Recovery Overview
- Recovery makes 3 passes (phases) over the log
(Diagram: the recovery passes run over the log/time axis from the log start)
38. Recovery Models
- Simplify recovery planning
- Simplify backup and recovery procedures
- Clarify tradeoffs
  - Performance
  - vs. data loss exposure
  - vs. log space consumption
  - vs. simplicity
- Simplify transitions between common operational scenarios
- Maintain data protection
- Maintain availability
- Minimize disruption of automated maintenance procedures
39. Recovery Models
- Control media recovery only
- Choosing a model
  - Key benefits
  - Work loss exposure
  - Functional limitations
- Setting the model
- How it works
Recovery models: Full, Simple, Bulk_Logged
ALTER DATABASE database_name SET RECOVERY FULL
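Setting the model, completing the statement shown on the slide (the database name is illustrative):

```sql
-- Switch to minimal logging before a bulk window...
ALTER DATABASE Sales SET RECOVERY BULK_LOGGED

-- ...and back to full logging afterwards
ALTER DATABASE Sales SET RECOVERY FULL
```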
40. Recovery Models
- Full
  - No work loss
  - Supports recovery to any arbitrary point in time
- Simple
  - Simplest backup/restore strategy
  - Less log space required
  - Greatest work loss possible
- Bulk_Logged
  - High-performance bulk operations
  - Minimal log space for bulk operations
  - Some work loss exposure
41. How Recovery Models Work
- Full: everything is fully logged
- Bulk_Logged: minimal logging for some operations
  - CREATE INDEX
  - Bulk load
  - SELECT INTO
  - WRITETEXT, UPDATETEXT
  - Not settable per operation, due to the administration impact
- Simple: log truncation on checkpoint
42. Log Backups in Bulk_Logged Mode
- The log backup contains the extents modified by bulk-logged operations since the last log backup
- Avoids breaking the log chain
- Eliminates the need for a special full backup
- Minimal impact on automated backups
- Log shipping works!
(Diagram: the transaction log plus the modified data extents go into the log backup)
43. Available Backup Types for Each Model
44. Changing Recovery Model
- FULL ↔ BULK_LOGGED doesn't break automated backup procedures
- Transitions to and from SIMPLE require more care, but are uncommon
45. Recovery Models: VLDB Techniques
- Use Full, Bulk_Logged, or both
- Transitions are easy
- Don't use Simple for VLDB
  - Limited to database or database differential backup
  - Precludes file / filegroup backup
46. Database Backup: How It Works
- Fuzzy backup
  - Writes an inconsistent set of pages
  - Includes transaction log to make it consistent
- Minimal impact on running transactions
- Fast parallel sequential scans; no transformations to data
(Timeline diagram: data pages are scanned from the start to the end of the data backup; pages already written are not backed up again; the log from the oldest active transaction through the end of the data backup is included)
47. Log and File Backup
- Flexibility
- Log and file backups don't block each other
- Files are independent of each other
- Concurrent file backups must be done as a single operation
(Timeline diagram: log backups continue on a regular schedule while individual file backups are taken one at a time)
48. Differential Backups
- Fast: bitmap-driven
- New: differential file backup for VLDB
(Diagram: a bitmap tracks the extents changed since the database backup; only those extents go into the differential backup)
49. Snapshot Backup/Restore
- Equivalent to a full database or file backup
- Roll forward using regular differential and log backups
- History maintained in MSDB
- Requires a third-party VDI application with storage system support (split mirror or copy-on-write)
(Diagram: production storage on a 3-way mirror; one mirror is split off as a backup/passive copy and written to a tape library through the Virtual Device Interface for Backup)
50. Snapshot Benefits
- Very fast restore
  - From mirror in seconds
  - Longer from tape
- Backup with the minimum possible impact on the production server
- Very fast backup
- Create a reporting or test database
- Initialize a warm standby
51. Backup and Recovery: VLDB Techniques
- Online backup: no window needed
- Use file and log backups
- Consider file differentials to reduce recovery time
- Back up infrequently modified filegroups less often
- Consider snapshots for high availability
  - Very fast restore from disk
  - Very fast backup to disk
52. Index Maintenance
- Index Creation
- Analysis
- Online Reorganization
53. Parallel Index Creation
- Fast
- Degree of parallelism controllable
- How it works
  - A separate thread builds a sub-index for each range
  - Each thread is fed by a parallel scan
  - The sub-indexes are stitched together at the end
  - Optional use of TEMPDB
(Diagram: parallel scans feed one sub-index per range, which are combined into the complete index)
54. Fragmentation Analysis
- Fast mode avoids reading leaf-level pages
- Optional tabular results
- Optional check of all index levels
- Optional check of all indexes
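In SQL Server 2000 this analysis is exposed through DBCC SHOWCONTIG. A sketch mapping the options above to the command (the Sales table name is illustrative):

```sql
-- Fast mode (skips leaf-level pages), tabular output,
-- covering every index on the table
DBCC SHOWCONTIG ('Sales')
WITH FAST, TABLERESULTS, ALL_INDEXES
```

The scan density and logical fragmentation columns in the output are the usual inputs to the reorg-or-not decision.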
55. Online Reorg
- Low impact
  - ~20% degradation of a medium OLTP workload
- Uses minimal data space
- May be stopped and restarted
- Use analysis to decide when to reorg
56. Online Reorg
- Reorders the leaf level of an index
- Re-establishes the fill factor
- Pages remain in the same file
- Reorders one index at a time
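In SQL Server 2000 the online reorganization described above is exposed as DBCC INDEXDEFRAG; a sketch with illustrative database, table, and index names:

```sql
-- Reorders the leaf level of one index in place and
-- re-establishes the fill factor; safe to stop and restart.
DBCC INDEXDEFRAG ('Sales', 'Orders', 'IX_Orders_Date')
```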
57. Index Maintenance: VLDB Techniques
- Index creation: parallel and concurrent
  - Parallel for the largest indexes
  - Default for SMP
  - Concurrent for smaller indexes
  - A combination for mixed indexes
- Consider the Bulk_Logged recovery model
58. Index Maintenance: VLDB Techniques
- Maintenance
  - Fast analysis
  - Online reorganization at an appropriate threshold
  - Allow for larger log backups
  - Slower than rebuilding the index
59. Loading and Resizing
- Data Loading
- Shrinking the Log
60. Bulk Data Loading
- New data types; per-column collation
- Optional firing of triggers during load
- Constraint checks
  - During loading
  - New command to check after loading
- Biggest change is operational
  - Recovery models
  - Easy to trade off performance vs. concurrency
  - Easy to maintain recoverability
61. Bulk Load Performance
- Scales linearly with the number of CPUs
- Improvement: nearly 100% for each CPU
(Chart: typical results on 500 MHz Xeon CPUs)
62. Bulk Data Load: VLDB Techniques
- Initial load into empty tables
- Load data
  - No indexes
  - Use BULK INSERT
  - Parallel load from partitioned data files
  - One load stream per CPU
  - Bulk_Logged or Simple recovery model
  - TABLOCK option
- Create indexes
- Switch to the appropriate recovery model
- Perform backups
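The initial-load recipe above, sketched end to end; database, table, and file names are illustrative:

```sql
-- 1. Minimal logging for the load window
ALTER DATABASE Sales SET RECOVERY BULK_LOGGED

-- 2. One load stream per CPU, each from its own partitioned file
BULK INSERT Orders FROM 'C:\load\orders_1.txt' WITH (TABLOCK)
-- ...repeated in parallel sessions for orders_2.txt, orders_3.txt, ...

-- 3. Build indexes only after the data is in place
CREATE CLUSTERED INDEX IX_Orders_Date ON Orders (orderDate)

-- 4. Return to full logging and re-establish the backup chain
ALTER DATABASE Sales SET RECOVERY FULL
BACKUP DATABASE Sales TO DISK = 'C:\backup\sales_full.bak'
```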
63. Bulk Data Loading: VLDB Techniques
- For incremental loads
  - Load data with indexes in place
  - Performance and concurrency requirements determine locking granularity
- Recovery model changes
  - Full → Bulk_Logged recovery model, unless you need to preserve point-in-time recovery
  - Simple (no change)
64. Shrinking the Log
- Does what it can immediately
- Manually shrink again after a log backup
(Diagram: the head of the log moves back toward the target size after each log backup)
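One way to carry out the shrink-after-log-backup loop (file, database, and path names are illustrative):

```sql
-- Free the inactive portion of the log...
BACKUP LOG Sales TO DISK = 'C:\backup\sales_log.bak'

-- ...then shrink the log file toward a 500 MB target
DBCC SHRINKFILE ('Sales_Log', 500)
```

If the head of the log is still near the end of the file, repeat after the next log backup.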
65. Shrink Database: VLDB Techniques
- Long-running shrink-database operations
  - May be stopped without losing work
- Reclaim space by using TRUNCATE_ONLY
66. Verification
- Optional online check (allows updates)
  - Performance: ~15% transaction throughput drop during a medium workload
- Parallel check on SMP systems
  - Performance scales linearly at >80%
- Physical check
  - Very fast page-level check will catch problems caused by hardware
- CHECKDB, CHECKTABLE, CHECKFILEGROUP
67. Verification: VLDB Techniques
- Use the physical check
- For a full check, allow enough TempDB
  - Space: use the ESTIMATEONLY option
  - Read / write throughput
- Check part of the database at a time
  - Verify individual filegroups separately
  - Check infrequently modified data less often
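The checks above in T-SQL; the database and filegroup names are illustrative:

```sql
-- Fast page-level check for hardware-caused damage
DBCC CHECKDB ('Sales') WITH PHYSICAL_ONLY

-- Estimate the tempdb space a full check would need
DBCC CHECKDB ('Sales') WITH ESTIMATEONLY

-- Check one part of the database at a time
DBCC CHECKFILEGROUP ('PRIMARY')
```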
68. Verification: When?
- Not a part of regular maintenance
- Run at critical times
  - Hardware upgrade
  - Software upgrade
  - Whenever you suspect a problem, regardless of cause
- Run it occasionally for peace of mind
69. Distributed Partitioned Views
- Create identical tables on multiple servers
- Define a constraint on each table
- Create a UNION view over all the tables
- The query processor optimizes access based on the constraints
- The view is updateable
- Recoverable to a consistent state
(Diagram: one updateable union view spanning tables on several servers)
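The steps above can be sketched as follows; server, database, table, and range names are illustrative:

```sql
-- On each server, an identical table whose CHECK constraint
-- defines its partition range (the constraint column is part
-- of the primary key, so the view stays updateable)
CREATE TABLE Orders_1998 (
    orderId   INT NOT NULL,
    orderYear INT NOT NULL CHECK (orderYear = 1998),
    PRIMARY KEY (orderId, orderYear)
)
-- (Orders_1999 on Server2, Orders_2000 on Server3, ...)

-- The distributed partitioned view unions the member tables;
-- the optimizer uses the CHECK constraints to touch only the
-- servers a query's predicate can reach
CREATE VIEW AllOrders AS
    SELECT * FROM Orders_1998
    UNION ALL SELECT * FROM Server2.Sales.dbo.Orders_1999
    UNION ALL SELECT * FROM Server3.Sales.dbo.Orders_2000
```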
70. Smooth Operations
- Simple transitions between recovery models
- Operation conflicts eliminated
  - E.g., CREATE INDEX vs. database backup
- All operations recoverable
  - E.g., add file
- Maintain data protection
- Maintain availability
- No disruption of automated maintenance
71. Putting It All Together
(Timeline: normal OLTP runs under the Full recovery model; the database is switched to Bulk_Logged for bulk loads and index creation, then back to Full; online checks and online reorgs run alongside OLTP; log backups are taken continuously and file backups periodically throughout)