A MultiSource TimeVariant Datawarehouse Case Study - PowerPoint PPT Presentation

About This Presentation
Title:

A MultiSource TimeVariant Datawarehouse Case Study

Description:

A Real Life Case. Claims Datawarehouse. Several Customers ... Unix Shell Scripts. SQL*Plus. pr. ligence. Empowering Intelligence. Design. Varying Dimensions ... – PowerPoint PPT presentation

Number of Views:61
Avg rating:3.0/5.0
Slides: 59
Provided by: aru80
Category:

less

Transcript and Presenter's Notes

Title: A MultiSource TimeVariant Datawarehouse Case Study


1
AMulti-Source Time-Variant DatawarehouseCase
Study
Session 36605byArup NandaProligence,
Inc. Norwalk, CT
2
Objectives
  • Exploring DW Techniques in Oracle
  • Case Study
  • Oracle 10G Additions

3
A Real Life Case
  • Claims Datawarehouse
  • Several Customers/Sources
  • Several Quarters
  • Data Volume Was High
  • Irregular Frequency
  • Data Comes Often Late
  • Near Real Time Requirements

4
DB1
DB2
Cust1
Cust11
?
DB6
Cust10
Cust2
Cust9
DB3
Cust3
Cust8
Datawarehouse
Cust4
Cust7
DB4
DB5
Cust5
Cust6
5
Problem of Irregular Data
Detail Table
Summary Table
Detail Table
CUST2
6
Problems
  • Incoming Data Irregular
  • Summary Tables Need Refreshing
  • Quarters Added Continuously
  • Archival Requirements Vary Across Customers
  • Quick Retrieval of Archival Needed

7
Problems contd.
  • Summary on Summary Tables as Materialized Views
  • Need Refresh Whenever New Data Arrives
  • Or When Data is Purged/Reinstated
  • Customers Added and Deleted Frequently

8
Objective
  • To Minimize Downtime for Refreshes
  • Incrementally Refresh
  • Partitioning Techniques
  • To Add Customers Easily
  • To Add Quarters Easily
  • To Archive Off and Purge Easily and Atomically
  • To Restore Archives Quickly

9
Objective contd.
  • To have an ETL Setup for Easy Addition of Objects
    Such As Tables, Indexes, Mat Views.
  • Use Only Available Oracle and Unix Tools
  • PL/SQL
  • Unix Shell Scripts
  • SQLPlus

10
Design
  • Varying Dimensions
  • Customer
  • Quarter
  • Composite Partitioning
  • Range (for Quarters)
  • List (for Customers)
  • Local Indexes

11
Partitioning
  • Partitioned on CLAIM_DATE
  • RANGE
  • Partitioned named YyyQq
  • Storage Clauses Not Defined
  • Supartitioned on CUST_NAME
  • LIST
  • Named YyyQq_CustName, e.g. Y03Q3_CUST1

12
Indexing
  • All Indexes Local
  • CREATE INDEX IN_CLAIM_SUM_01
  • LOCAL
  • ON SUMTAB1 (COL1, COL2)
  • No Indexes UNIQUE and GLOBAL

13
Storage
  • Each Subpartition of Index or Table is kept in
    separate tablespaces named in the format
  • YltYeargtQltQtrgt_ltCustNamegt_DATA
  • e.g. Y02Q2_CUST1_DATA
  • Y02Q2_CUST2_DATA
  • Y03Q3_CUST1_DATA

14
Table
Cust3 Y03 Q3
Customers
Quarter
In Tablespace Y03Q3_CUST3_DATA
15
Tablespace
  • create tablespace y03q3_cust1_data
  • datafile /oradata/y03q3_cust1_data_01.dbf
  • size 500m
  • autoextend on next 500m
  • extent management local
  • segment space management auto

16
Table DDL
  • CREATE TABLE TAB1
  • ( )
  • PARTITION BY RANGE (CLAIM_DATE)
  • SUBPARTITION BY LIST (CUST_NAME)
  • (
  • PARTITION Y03Q1 VALUES LESS THAN
    (TO_DATE(2003/04/01,YYYY/MM/DD)),
  • (
  • SUBPARTITION Y03Q1_CUST1 VALUES (CUST1)
    TABLESPACE Y03Q1_CUST1_DATA,
  • SUBPARTITION Y03Q1_CUST2 VALUES (CUST2)
    TABLESPACE Y03Q1_CUST2_DATA,
  • and so on for all subpartitions
  • SUBPARTITION Y03Q1_DEF VALUES (DEFAULT)
    TABLESPACE USER_DATA
  • ),
  • PARTITION Y03Q2 VALUES LESS THAN
    (TO_DATE(2003/07/01,YYYY/MM/DD)),
  • (
  • SUBPARTITION Y03Q2_CUST1 VALUES (CUST1)
    TABLESPACE Y03Q2_CUST1_DATA,
  • SUBPARTITION Y03Q2_CUST2 VALUES (CUST2)
    TABLESPACE Y03Q2_CUST2_DATA,
  • and so on for all subpartitions
  • SUBPARTITION Y03Q2_DEF VALUES (DEFAULT)
    TABLESPACE USER_DATA
  • ),

17
Index DDL
  • CREATE INDEX IN_TAB1_01
  • ON TAB1 (COL1)
  • LOCAL NOLOGGING
  • (
  • PARTITION Y03Q1 (
  • SUBPARTITION Y03Q1_CUST1 TABLESPACE
    Y03Q1_CUST1_INDX,
  • SUBPARTITION Y03Q1_CUST2 TABLESPACE
    Y03Q1_CUST2_INDX,
  • and so on for all subpartitions
  • SUBPARTITION Y03Q1_DEF TABLESPACE USER_DATA
  • ),
  • PARTITION Y03Q2 (
  • SUBPARTITION Y03Q2_CUST1 TABLESPACE
    Y03Q2_CUST1_INDX,
  • SUBPARTITION Y03Q2_CUST2 TABLESPACE
    Y03Q2_CUST2_INDX,
  • and so on for all subpartitions
  • SUBPARTITION Y03Q2_DEF TABLESPACE USER_DATA
  • ),
  • and so on for all the partitions
  • PARTITION DEF (
  • SUBPARTITION DEF_CUST1 TABLESPACE
    USER_DATA,

18
Creating DDLs
Static Part
create table tab1 ()
DDL to Create Table
partition y03q1 ( subpartition y03q1_cust1
tablespace )
Variable Part
19
Constraints
  • Constraints defined as
  • DISABLE NOVALIDATE RELY
  • ALTER TABLE ADD CONSTRAINT
  • RELY DISABLE NOVALIDATE

20
Constraint
  • VALIDATE/NOVALIDATE
  • Table TAB1 (Column STATUS)
  • Current Values A, I, F
  • Check Constraint STATUS IN (A,I)
  • ENABLE/DISABLE
  • New Value F
  • RELY

21
RELY
  • Reasons
  • To Include Relation Information to the Metadata
  • To Enable Query Rewrite

22
Summary Tab and View
On Source
On DW
23
Casting
  • SELECT
  • CAST (CUST_NAME AS VARCHAR2(20))
  • AS CUST_NAME
  • FROM ltviewnamegt

CAST (column_name AS datatype (precision))
24
Index of Temporary Table
cust1
Owned by Cust Schema
View
INDEX
Temporary Table
Filter Where CLAIM_DATE is in that quarter
TABLE
Summary Table
Massaging
For Customer Cust1 and Quarter Q1
Analyzing
25
cust
Old Sub Partition
View
INDEX
Old Sub Partition
TABLE
ALTER TABLE EXCHANGE SUBPARTITION
subpartname WITH TEMPTABLE INCLUDING INDEXES
DW
26
Temp Table
  • CREATE TABLE T1_Y03Q1_CUST1
  • TABLESPACE Y03Q1_CUST1_DATA
  • PARALLEL 8 NOLOGGING
  • AS
  • SELECT
  • FROM CUST1.VIEW1_at_DB1
  • WHERE CLAIM_DATE gt
  • add_months(trunc(to_date(03','RR'),'YYYY'),
  • 3(to_number(1')-1))
  • and batch_date lt
  • last_day(add_months(trunc(
  • to_date(03','RR'),'YYYY'),
  • 3(to_number(1')) - 1 )) 1

27
Script
  • CREATE TABLE T1_YYY.QQ._CUST
  • TABLESPACE YYY.QQ._CUST._DATA
  • PARALLEL 8 NOLOGGING
  • AS
  • SELECT
  • FROM CUST..VIEW1_at_DBLINK
  • WHERE CLAIM_DATE gt
  • ADD_MONTHS(TRUNC(TO_DATE('YY','RR'),'YYYY'),
  • 3(TO_NUMBER('Q')-1))
  • AND BATCH_DATE lt
  • LAST_DAY(ADD_MONTHS(TRUNC(
  • TO_DATE('YY','RR'),'YYYY'),
  • 3(TO_NUMBER('Q')) -1 )) 1

28
External Table
  • Reason
  • Source is a non-Oracle DB, e.g. DB2
  • Source is External, no DB Link Allowed
  • Fixed Format vs- Delimited
  • Fixed Format
  • Faster
  • Easier
  • More Space
  • Delimited
  • Less Space
  • Slower
  • Slightly More Complex

29
Massaging
  • Removing NOT NULL Constraints
  • Making Datatypes Consistent
  • The CAST operation converts NUMBER(m,n) to NUMBER
  • cast(col1 as number(10,2)) as col1_m
  • COL1 NUMBER(5,2)
  • COL1_M NUMBER

30
Analyzing
  • Using DBMS_STATS.GATHER_TABLE_STATS
  • PARALLEL Degree
  • dbms_stats.gather_table_stats (
  • ownname gt DWOWNER',
  • tabname gt 'TABNAME',
  • estimate_percent gt dbms_stats.auto_sample_siz
    e,
  • method_opt gt 'FOR ALL INDEXED COLUMNS SIZE
    AUTO',
  • degree gt dbms_stats.default_degree,
  • cascade gt TRUE )

31
Mat Views
  • MVs Created as Tables
  • CREATE TABLE MV_SUMMTAB1
  • Storage clauses just like the underlying table
  • CREATE MATERIALIZED VIEW
  • MV_SUMMTAB1
  • ON PREBUILT TABLE
  • AS SELECT
  • http//www.proligence.com/painless_alter.pdf

32
Query Rewrite
  • Table SUM_CLAIMS
  • PROVIDER_ID, STATE, TYPE, TOT_AMT
  • Table MV_SUM_CLAIMS
  • PROVIDER_ID, STATE, SUM(TOT_AMT) TOT_AMT
  • GROUP BY PROVIDER_ID, STATE
  • SELECT SUM(TOT_AMT) FROM SUM_CLAIMS
  • SELECT SUM(TOT_AMT) FROM MV_SUM_CLAIMS

33
Query Rewrite
  • Init.ora Parameters
  • query_rewrite_enabled'TRUE'
  • query_rewrite_integrity'STALE_TOLERATED
  • ENFORCED Rewrite only if guaranteed
  • TRUSTED Uses only if RELY
  • STALE_TOLERATED Even if not RELY

34
Checking QR
  • dbms_mview.explain_rewrite (
  • select cust_name, count() from summtab1 group
    by cust_name )
  • select message from rewrite_table
  • QSM-01033 query rewritten with materialized
    view, MV_SUMMTAB1
  • QSM-01101 rollup(s) took place on mv, MV_SUMMTAB1

35
Design
  • MV_ subpartitions are on the same tablespace as
    the parents.
  • Subparts of MV_SUMMTAB1_0? are in the same TS as
    SUMMTAB1
  • Subparts of MV_SUMMTAB2_0? in SUMMTAB2

36
Quarter
Customer name
TableSpace1
TableSpace2
MV2
MV1
PARENT
37
MV and Parents
  • Partition Pruning
  • Partition-wise Joins
  • Partition Independence

38
Adding Quarters/Customers
  • Partition
  • Default Partition VALUES LESS THAN (MAXVALUE)
  • Subpartition
  • Default Subpartition VALUES (DEFAULT)

39
Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
DEF
40
Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
DEF
41
Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
Cust4
DEF
alter table split subpartition
42
Qtr1
Qtr2
Qtr3
DEF
Cust1
Cust2
Cust3
DEF
43
Qtr1
Qtr2
Qtr3
Qtr4
DEF
Cust1
Cust2
Cust3
DEF
alter table split partition
44
Backup/Restore
  • Backup
  • ALTER TABLESPACE ltTSNamegt READ ONLY
  • Copy the files to tape/CD.
  • Restore
  • Copy the file back into the directory
  • ALTER TABLESPACE ltTSNamegt RECOVER

45
Archival/Purge
Table4
SP4
Table4
Table4
46
Archival/Purge
  • CREATE TABLE S1_YltyygtQltqgt_ltCustNamegt
  • TABLESPACE YltyygtQltqgt_ltCustNamegt_ltTSTypegt
  • AS SELECT FROM SUMMTAB1 WHERE 12
  • /
  • CREATE INDEXES, CONSTRAINTS, etc.
  • /
  • ALTER TABLE SUMMTAB1 EXCHANGE
  • SUBPARTITION YltyygtQltqgt_ltCustNamegt
  • WITH TABLE YltyygtQltqgt_ltCustNamegt
  • INCLUDING INDEXES
  • /

47
Check TTS
  • ALTER TABLESPACE YltyygtQltqgt_ltCustNamegt_ltTSTypegt
    READ ONLY
  • DBMS_TTS.TRANSPORT_SET_CHECK ( ltDataTSgt,ltIndexTSgt)
  • SELECT FROM TRANSPORT_SET_VIOLATIONS

48
Transport TS
  • Export Parameter File
  • TRANSPORT_TABLESPACEy
  • TTS_FULLCHECKY
  • FILEltFileLocationgt/expltTSgt.dmp
  • TABLESPACES(ltDataTSgt, ltIndexTSgt)
  • Copy the exp.dmp and Datafiles to tape/CD.

49
Purge
  • Drop Subpartition
  • Drop the Tablespace
  • DROP TABLESPACE ltTSNamegt INCLUDING CONTENTS AND
    DATAFILES

50
Restore
  • ALTER TABLE SPLIT SUBPARTITION ltDefaultSPgt
  • Copy Datafiles Export Dump Files from CD/Tape
  • Import Parameter File
  • TRANSPORT_TABLESPACESY
  • TABLESPACES(ltDataTSgt,ltIndexTSgt)
  • DATAFILES()

51
Minimizing Refresh Unit
  • Months instead of quarters refreshed at a time.
  • Last Quarter Split into a Subpartition per Month
  • Naming Convention
  • YyyQqMmm
  • Y03Q3M09
  • Merge Subpartition

52
Merging Subpartitions
  • Index Subpartitions Created in Users Default
    Tablespace
  • Subpartition Template
  • ALTER TABLE SUMTAB1 ADD SUBPARTITION TEMPLATE

53
Resumable Statement
  • When?
  • Running Large Report Jobs
  • Creating Large Indexes
  • ALTER SESSION ENABLE RESUMABLE NAME Job1
  • View DBA_RESUMABLE
  • NAME Name specified in ALTER SESSION
  • COORD_SESSION_ID Coord Session in PQ
  • SQL_TEXT The text of the SQL
  • STATUS - RUNNING, SUSPENDED, ABORTED, ABORTING,
    TIMEOUT
  • ERROR_NUMBER/ERROR_MSG

54
Objectives Revisited
  • To Minimize Downtime for Refreshes
  • Incrementally Refresh
  • Partitioning Techniques
  • To Add Customers Easily
  • To Add Quarters Easily
  • To Archive Off and Purge Easily and Atomically
  • To Restore Archives Quickly

55
Oracle 10G
  • Transportable Tablespaces Can Be Reinstated At a
    Different Operating System
  • Can be used for Restoring to a Different OS
  • Tablespaces Can Be Renamed
  • Restoring Tablespace of the Same Name
  • Multiple Temporary Tablespace
  • For Large Index Creation, Sorting, etc.

56
Oracle 10G contd.
  • Partition Change Tracking Support for List
    Partitioning
  • Query Rewrites Can Use Multiple MVs
  • OEM Shows All Partitioning Features
  • Hash Partitioned Global Indexes
  • Data Pump
  • Export/Import on Steroids
  • Parallel Operation

57
Oracle 10G contd.
  • External Table Download
  • A Utility to Create File from Table Data
  • CREATE TABLE
  • ORGANIZATION EXTERNAL
  • AS SELECT FROM lta querygt
  • Platform Independent File
  • Can Be Used In External Tables

58
Thank You!Please Submit ReviewsSession 36605
www.proligence.com
Write a Comment
User Comments (0)
About PowerShow.com