Title: A MultiSource TimeVariant Datawarehouse Case Study
1AMulti-Source Time-Variant DatawarehouseCase
Study
Session 36605byArup NandaProligence,
Inc. Norwalk, CT
2Objectives
- Exploring DW Techniques in Oracle
- Case Study
- Oracle 10G Additions
3A Real Life Case
- Claims Datawarehouse
- Several Customers/Sources
- Several Quarters
- Data Volume Was High
- Irregular Frequency
- Data Comes Often Late
- Near Real Time Requirements
4DB1
DB2
Cust1
Cust11
?
DB6
Cust10
Cust2
Cust9
DB3
Cust3
Cust8
Datawarehouse
Cust4
Cust7
DB4
DB5
Cust5
Cust6
5Problem of Irregular Data
Detail Table
Summary Table
Detail Table
CUST2
6Problems
- Incoming Data Irregular
- Summary Tables Need Refreshing
- Quarters Added Continuously
- Archival Requirements Vary Across Customers
- Quick Retrieval of Archival Needed
7Problems contd.
- Summary on Summary Tables as Materialized Views
- Need Refresh Whenever New Data Arrives
- Or When Data is Purged/Reinstated
- Customers Added and Deleted Frequently
8Objective
- To Minimize Downtime for Refreshes
- Incrementally Refresh
- Partitioning Techniques
- To Add Customers Easily
- To Add Quarters Easily
- To Archive Off and Purge Easily and Atomically
- To Restore Archives Quickly
9Objective contd.
- To have an ETL Setup for Easy Addition of Objects
Such As Tables, Indexes, Mat Views. - Use Only Available Oracle and Unix Tools
- PL/SQL
- Unix Shell Scripts
- SQLPlus
10Design
- Varying Dimensions
- Customer
- Quarter
- Composite Partitioning
- Range (for Quarters)
- List (for Customers)
- Local Indexes
11Partitioning
- Partitioned on CLAIM_DATE
- RANGE
- Partitioned named YyyQq
- Storage Clauses Not Defined
- Supartitioned on CUST_NAME
- LIST
- Named YyyQq_CustName, e.g. Y03Q3_CUST1
12Indexing
- All Indexes Local
- CREATE INDEX IN_CLAIM_SUM_01
- LOCAL
- ON SUMTAB1 (COL1, COL2)
- No Indexes UNIQUE and GLOBAL
13Storage
- Each Subpartition of Index or Table is kept in
separate tablespaces named in the format - YltYeargtQltQtrgt_ltCustNamegt_DATA
- e.g. Y02Q2_CUST1_DATA
- Y02Q2_CUST2_DATA
- Y03Q3_CUST1_DATA
-
14Table
Cust3 Y03 Q3
Customers
Quarter
In Tablespace Y03Q3_CUST3_DATA
15Tablespace
- create tablespace y03q3_cust1_data
- datafile /oradata/y03q3_cust1_data_01.dbf
- size 500m
- autoextend on next 500m
- extent management local
- segment space management auto
16Table DDL
- CREATE TABLE TAB1
- ( )
- PARTITION BY RANGE (CLAIM_DATE)
- SUBPARTITION BY LIST (CUST_NAME)
- (
- PARTITION Y03Q1 VALUES LESS THAN
(TO_DATE(2003/04/01,YYYY/MM/DD)), - (
- SUBPARTITION Y03Q1_CUST1 VALUES (CUST1)
TABLESPACE Y03Q1_CUST1_DATA, - SUBPARTITION Y03Q1_CUST2 VALUES (CUST2)
TABLESPACE Y03Q1_CUST2_DATA, - and so on for all subpartitions
- SUBPARTITION Y03Q1_DEF VALUES (DEFAULT)
TABLESPACE USER_DATA - ),
- PARTITION Y03Q2 VALUES LESS THAN
(TO_DATE(2003/07/01,YYYY/MM/DD)), - (
- SUBPARTITION Y03Q2_CUST1 VALUES (CUST1)
TABLESPACE Y03Q2_CUST1_DATA, - SUBPARTITION Y03Q2_CUST2 VALUES (CUST2)
TABLESPACE Y03Q2_CUST2_DATA, - and so on for all subpartitions
- SUBPARTITION Y03Q2_DEF VALUES (DEFAULT)
TABLESPACE USER_DATA - ),
17Index DDL
- CREATE INDEX IN_TAB1_01
- ON TAB1 (COL1)
- LOCAL NOLOGGING
- (
- PARTITION Y03Q1 (
- SUBPARTITION Y03Q1_CUST1 TABLESPACE
Y03Q1_CUST1_INDX, - SUBPARTITION Y03Q1_CUST2 TABLESPACE
Y03Q1_CUST2_INDX, - and so on for all subpartitions
- SUBPARTITION Y03Q1_DEF TABLESPACE USER_DATA
- ),
- PARTITION Y03Q2 (
- SUBPARTITION Y03Q2_CUST1 TABLESPACE
Y03Q2_CUST1_INDX, - SUBPARTITION Y03Q2_CUST2 TABLESPACE
Y03Q2_CUST2_INDX, - and so on for all subpartitions
- SUBPARTITION Y03Q2_DEF TABLESPACE USER_DATA
- ),
- and so on for all the partitions
- PARTITION DEF (
- SUBPARTITION DEF_CUST1 TABLESPACE
USER_DATA,
18Creating DDLs
Static Part
create table tab1 ()
DDL to Create Table
partition y03q1 ( subpartition y03q1_cust1
tablespace )
Variable Part
19Constraints
- Constraints defined as
- DISABLE NOVALIDATE RELY
- ALTER TABLE ADD CONSTRAINT
- RELY DISABLE NOVALIDATE
20Constraint
- VALIDATE/NOVALIDATE
- Table TAB1 (Column STATUS)
- Current Values A, I, F
- Check Constraint STATUS IN (A,I)
- ENABLE/DISABLE
- New Value F
- RELY
21RELY
- Reasons
- To Include Relation Information to the Metadata
- To Enable Query Rewrite
22Summary Tab and View
On Source
On DW
23Casting
- SELECT
- CAST (CUST_NAME AS VARCHAR2(20))
- AS CUST_NAME
- FROM ltviewnamegt
CAST (column_name AS datatype (precision))
24Index of Temporary Table
cust1
Owned by Cust Schema
View
INDEX
Temporary Table
Filter Where CLAIM_DATE is in that quarter
TABLE
Summary Table
Massaging
For Customer Cust1 and Quarter Q1
Analyzing
25cust
Old Sub Partition
View
INDEX
Old Sub Partition
TABLE
ALTER TABLE EXCHANGE SUBPARTITION
subpartname WITH TEMPTABLE INCLUDING INDEXES
DW
26Temp Table
- CREATE TABLE T1_Y03Q1_CUST1
- TABLESPACE Y03Q1_CUST1_DATA
- PARALLEL 8 NOLOGGING
- AS
- SELECT
- FROM CUST1.VIEW1_at_DB1
- WHERE CLAIM_DATE gt
- add_months(trunc(to_date(03','RR'),'YYYY'),
- 3(to_number(1')-1))
- and batch_date lt
- last_day(add_months(trunc(
- to_date(03','RR'),'YYYY'),
- 3(to_number(1')) - 1 )) 1
27Script
- CREATE TABLE T1_YYY.QQ._CUST
- TABLESPACE YYY.QQ._CUST._DATA
- PARALLEL 8 NOLOGGING
- AS
- SELECT
- FROM CUST..VIEW1_at_DBLINK
- WHERE CLAIM_DATE gt
- ADD_MONTHS(TRUNC(TO_DATE('YY','RR'),'YYYY'),
- 3(TO_NUMBER('Q')-1))
- AND BATCH_DATE lt
- LAST_DAY(ADD_MONTHS(TRUNC(
- TO_DATE('YY','RR'),'YYYY'),
- 3(TO_NUMBER('Q')) -1 )) 1
28External Table
- Reason
- Source is a non-Oracle DB, e.g. DB2
- Source is External, no DB Link Allowed
- Fixed Format vs- Delimited
- Fixed Format
- Faster
- Easier
- More Space
- Delimited
- Less Space
- Slower
- Slightly More Complex
29Massaging
- Removing NOT NULL Constraints
- Making Datatypes Consistent
- The CAST operation converts NUMBER(m,n) to NUMBER
- cast(col1 as number(10,2)) as col1_m
- COL1 NUMBER(5,2)
- COL1_M NUMBER
30Analyzing
- Using DBMS_STATS.GATHER_TABLE_STATS
- PARALLEL Degree
- dbms_stats.gather_table_stats (
- ownname gt DWOWNER',
- tabname gt 'TABNAME',
- estimate_percent gt dbms_stats.auto_sample_siz
e, - method_opt gt 'FOR ALL INDEXED COLUMNS SIZE
AUTO', - degree gt dbms_stats.default_degree,
- cascade gt TRUE )
31Mat Views
- MVs Created as Tables
- CREATE TABLE MV_SUMMTAB1
- Storage clauses just like the underlying table
- CREATE MATERIALIZED VIEW
- MV_SUMMTAB1
- ON PREBUILT TABLE
- AS SELECT
- http//www.proligence.com/painless_alter.pdf
32Query Rewrite
- Table SUM_CLAIMS
- PROVIDER_ID, STATE, TYPE, TOT_AMT
- Table MV_SUM_CLAIMS
- PROVIDER_ID, STATE, SUM(TOT_AMT) TOT_AMT
- GROUP BY PROVIDER_ID, STATE
- SELECT SUM(TOT_AMT) FROM SUM_CLAIMS
- SELECT SUM(TOT_AMT) FROM MV_SUM_CLAIMS
33Query Rewrite
- Init.ora Parameters
- query_rewrite_enabled'TRUE'
- query_rewrite_integrity'STALE_TOLERATED
- ENFORCED Rewrite only if guaranteed
- TRUSTED Uses only if RELY
- STALE_TOLERATED Even if not RELY
34Checking QR
- dbms_mview.explain_rewrite (
- select cust_name, count() from summtab1 group
by cust_name ) - select message from rewrite_table
- QSM-01033 query rewritten with materialized
view, MV_SUMMTAB1 - QSM-01101 rollup(s) took place on mv, MV_SUMMTAB1
35Design
- MV_ subpartitions are on the same tablespace as
the parents. - Subparts of MV_SUMMTAB1_0? are in the same TS as
SUMMTAB1 - Subparts of MV_SUMMTAB2_0? in SUMMTAB2
36Quarter
Customer name
TableSpace1
TableSpace2
MV2
MV1
PARENT
37MV and Parents
- Partition Pruning
- Partition-wise Joins
- Partition Independence
38Adding Quarters/Customers
- Partition
- Default Partition VALUES LESS THAN (MAXVALUE)
- Subpartition
- Default Subpartition VALUES (DEFAULT)
39Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
DEF
40Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
DEF
41Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
Cust4
DEF
alter table split subpartition
42Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
DEF
43Qtr1
Qtr2
Qtr3
Qtr4
DEF
Cust1
Cust2
Cust3
DEF
alter table split partition
44Backup/Restore
- Backup
- ALTER TABLESPACE ltTSNamegt READ ONLY
- Copy the files to tape/CD.
- Restore
- Copy the file back into the directory
- ALTER TABLESPACE ltTSNamegt RECOVER
45Archival/Purge
Table4
SP4
Table4
Table4
46Archival/Purge
- CREATE TABLE S1_YltyygtQltqgt_ltCustNamegt
- TABLESPACE YltyygtQltqgt_ltCustNamegt_ltTSTypegt
- AS SELECT FROM SUMMTAB1 WHERE 12
- /
- CREATE INDEXES, CONSTRAINTS, etc.
- /
- ALTER TABLE SUMMTAB1 EXCHANGE
- SUBPARTITION YltyygtQltqgt_ltCustNamegt
- WITH TABLE YltyygtQltqgt_ltCustNamegt
- INCLUDING INDEXES
- /
47Check TTS
- ALTER TABLESPACE YltyygtQltqgt_ltCustNamegt_ltTSTypegt
READ ONLY - DBMS_TTS.TRANSPORT_SET_CHECK ( ltDataTSgt,ltIndexTSgt)
- SELECT FROM TRANSPORT_SET_VIOLATIONS
48Transport TS
- Export Parameter File
- TRANSPORT_TABLESPACEy
- TTS_FULLCHECKY
- FILEltFileLocationgt/expltTSgt.dmp
- TABLESPACES(ltDataTSgt, ltIndexTSgt)
- Copy the exp.dmp and Datafiles to tape/CD.
49Purge
- Drop Subpartition
- Drop the Tablespace
- DROP TABLESPACE ltTSNamegt INCLUDING CONTENTS AND
DATAFILES
50Restore
- ALTER TABLE SPLIT SUBPARTITION ltDefaultSPgt
- Copy Datafiles Export Dump Files from CD/Tape
- Import Parameter File
- TRANSPORT_TABLESPACESY
- TABLESPACES(ltDataTSgt,ltIndexTSgt)
- DATAFILES()
51Minimizing Refresh Unit
- Months instead of quarters refreshed at a time.
- Last Quarter Split into a Subpartition per Month
- Naming Convention
- YyyQqMmm
- Y03Q3M09
- Merge Subpartition
52Merging Subpartitions
- Index Subpartitions Created in Users Default
Tablespace - Subpartition Template
- ALTER TABLE SUMTAB1 ADD SUBPARTITION TEMPLATE
53Resumable Statement
- When?
- Running Large Report Jobs
- Creating Large Indexes
- ALTER SESSION ENABLE RESUMABLE NAME Job1
- View DBA_RESUMABLE
- NAME Name specified in ALTER SESSION
- COORD_SESSION_ID Coord Session in PQ
- SQL_TEXT The text of the SQL
- STATUS - RUNNING, SUSPENDED, ABORTED, ABORTING,
TIMEOUT - ERROR_NUMBER/ERROR_MSG
54Objectives Revisited
- To Minimize Downtime for Refreshes
- Incrementally Refresh
- Partitioning Techniques
- To Add Customers Easily
- To Add Quarters Easily
- To Archive Off and Purge Easily and Atomically
- To Restore Archives Quickly
55Oracle 10G
- Transportable Tablespaces Can Be Reinstated At a
Different Operating System - Can be used for Restoring to a Different OS
- Tablespaces Can Be Renamed
- Restoring Tablespace of the Same Name
- Multiple Temporary Tablespace
- For Large Index Creation, Sorting, etc.
56Oracle 10G contd.
- Partition Change Tracking Support for List
Partitioning - Query Rewrites Can Use Multiple MVs
- OEM Shows All Partitioning Features
- Hash Partitioned Global Indexes
- Data Pump
- Export/Import on Steroids
- Parallel Operation
57Oracle 10G contd.
- External Table Download
- A Utility to Create File from Table Data
- CREATE TABLE
- ORGANIZATION EXTERNAL
- AS SELECT FROM lta querygt
- Platform Independent File
- Can Be Used In External Tables
58Thank You!Please Submit ReviewsSession 36605
www.proligence.com