August 11, 2003 - PowerPoint PPT Presentation

About This Presentation
Title:

August 11, 2003

Description:

Cycle through snapshots and build / modify. sub-warehouse tables ... Dynamically build Alter table and Create Index Statements. Same logging techniques built-in ... – PowerPoint PPT presentation

Number of Views:181
Avg rating:3.0/5.0
Slides: 50
Provided by: HeatherLD
Category:
Tags: august | build

less

Transcript and Presenter's Notes

Title: August 11, 2003


1
31 Schema Flavors Build a Warehouse Session
37018 OracleWorld San Francisco 2003 Robert
Corfman Boeing Commercial Airplanes Information
Systems
August 11, 2003
2
What is this Session?
  • Background on the Boeing Tool Management Project
  • COTS Package Enhancements
  • COTS Limitations
  • Warehouse Limitations / Overview Design
  • Specific Code Examples / Techniques
  • Replication Usage
  • Detecting Schema Differences
  • Error Capture, Propagation, Reporting
  • Performance Issues

3
Presentation Perspective
  • Application Architect
  • Responsible for technical application decisions
  • Enterprise project and Puget Sound
    responsibilities
  • Programmer
  • Multiple technical facets
  • COTS modification specs, delivery, testing,
    deployment
  • Radio Frequency Implementation
  • Data Conversion
  • Legacy application interfaces
  • Warehouse

Not a Warehousing Specialist . . .
4
31 Schema Flavors
  • Project Background
  • And
  • Warehouse Design Decisions

How did we get where we are?
5
Boeing Tool Standardization
  • Perishable / Portable Tools and Shop Supplies
  • Drill Motors, Safety Glasses, Reamers
    Non-Production
  • Retirement of multiple legacy applications
  • Mergers and acquisitions
  • Robust home-grown applications
  • Enterprise Commodity Standardization
  • Internal customer base already communicating
  • Enterprise application
  • Loose federation of independent site projects

6
Site Distribution
7
COTS Package
  • Site vs. Enterprise daily use vs. high leverage
  • Picked best identified product forsite-level
    usability
  • No Enterprise features in selected product
  • Cross-site visibility, item linkage, or transfers
  • Supplier to build-in needed features
  • No customizations

Be Nimble, Keep the Product Standard
8
Data Warehouse Design Decisions
  • Dont break the site-level features/usability
  • Main reason for selecting this product
  • Integrate warehouse features into the product
  • Allow continued site operations when warehouse is
    unavailable
  • Drove decision not to integrate all sites into a
    single Corporate-level database
  • Create separate roll-up database of all
    participating sites

Site Operations ?? Company Asset
9
What Data Goes in the Warehouse?
  • Data Subsets? Data Summarizations?
  • How do we decide what data to store? Summarize?
  • Volatile decision, different for different
    Companies
  • Drove decision to replicate all data into the
    warehouse and not pre-define summarizations
  • Application runs on multiple databases
  • Expected warehouse structure defined by COTS
  • Decision on exactly how to get data into designed
    warehouse left outside of COTS package
  • Additional tables and summaries could be
    added,but without direct integration in COTS
    package

Be Nimble, Keep the Product Standard
10
Application Look and Feel
Site Data View
11
Application Look and Feel
Warehouse Data View
12
Application Look and Feel
Inventory Reports View
13
Warehouse Design Considerations
  • 7 x 24 Operations
  • Allowed to be non-real-time
  • Includes Oracle and SQL Server sites
  • COTS package allows customization of site
  • User-defined fields in standard tables
  • Modification of field lengths from standard
  • Warehouse needs to handlegreatest common
    denominator
  • Desire to leverage replication

14
Warehouse Design Options
  • Updateable Snapshots
  • Warehouse is the Master, each site an updatable
    snapshot
  • Doesnt work if sites have schema variants
  • Master control of data no longer at the sites
  • Replication directly into Warehouse
  • Same schema variant problem
  • Multiple replicants
  • Custom procedures to consolidate into Master
    warehouse
  • Allows Replication with Master control at site
  • Accommodates variation in site schemas

ETL software now considered a possibility . . .
15
Proposed Warehouse Design
Warehouse Site
Company Data Warehouse
Site 1 (Unix)
Site 2 (NT)
Site 3 (NT)
Site N
Oracle
Oracle
SQL Server
Site Data
16
Initial Issues with Design
  • Could perform Roll-up via Views
  • Create or replace view as
  • Select 1 siteno,item,desc,null,null from
    site1.table
  • Union Select 2,item,desc,cust1,null from
    site2.table
  • Union Select 3,item,desc,null,cust2 from
    site3.table
  • Dynamically shows replicated data changes
  • Significant difference in query times
  • Oracle 9.2 showed an Order of magnitude
    difference in query times
  • Oracle 8.1.7 was significantly worse
  • this was the version we were using when the
    decision was made not to use views

17
Initial Issues with Design
  • Strong desire to have 7x24 data availability
  • Complexity of rebuilding in place
  • Difficult to update existing data . . .
  • Leverage transparency of synonyms
  • Maintain two copies of warehouse
  • Rebuild, then re-direct
  • If rebuild fails, dont re-direct
  • Built-in contingency . . .

Be Prepared for Success . . . (or Failure)
18
Final Warehouse Design
Warehouse Site
Company Data Warehouse
Warehouse Data 1
Warehouse Data 2
Site 1 (Unix)
Site 2 (NT)
Site 3 (NT)
Site N
Oracle
Oracle
SQL Server
Site Data
19
Replication of Oracle Data
  • Standard Oracle fast refresh functionality
  • Used to move data to the master warehouse site
  • Flexible schedule, avoid replication during
    rollup
  • Dbms_snapshot.refresh vs. Refresh Groups
  • Groups provided cross-table read consistency,
    required single commit
  • Perceived difficulties in schedule variations
  • Difficulty in automating snapshot validity prior
    to refresh
  • Custom column additions/deletions, column size
    changes
  • Snapshot refresh chosen, acceptance of potential
    for minor data inconsistencies
  • Managed replication via packages
  • One package set on snapshot-site, companion
    package on site
  • Designed validate and rebuild snapshot logs and
    snapshots as required

20
Oracle Snapshot Verification
  • Verify_snapshots_sql
  • 'select distinct upper(site.table_name)'
  • ' from user_tab_columns snap, '
  • ' user_tab_columns_at_' db_link ' site'
  • ' where upper(site.table_name)
    upper(snap.table_name())'
  • ' and upper(site.column_name)
    upper(snap.column_name())'
  • ' and upper(site.table_name) not like
    ''MLOG'''
  • ' and upper(site.table_name) not like
    ''RUPD'''
  • ' and upper(site.table_name) not in '
  • ' (select upper(tablename) from exclude_whse_at_'
    db_link ')'
  • ' and ('
  • ' snap.column_name is null'
  • ' or snap.data_type ! site.data_type'
  • ' or nvl(snap.data_precision,38)!nvl(site.data_p
    recision,38)'
  • -- default to 38 as replicated number
    have precision of 38...
  • ' or nvl(snap.data_length,-1) !
    nvl(site.data_length,-1)'
  • ' or nvl(snap.data_scale,-1) !
    nvl(site.data_scale,-1)'
  • ' )'

Never Apologize . . .
21
Replication of SQL Server Data
  • Expected to have full refresh
  • Site administrators has strong preference
    against
  • Apparent Options
  • Oracle Transparent Gateway
  • Two-part Replication
  • Microsoft Incremental replication to central LAN
  • Full replication to warehouse
  • Final Decision
  • Site researched push incremental replication from
    SQL Server to Oracle It works.
  • Site schema changes are manually handled

22
Build a Warehouse
  • More Code
  • For
  • Rolling the Schemas Together

How did we get there from here?
23
Automation with Configuration
  • Packaged functionality run with Oracle Jobs
  • Table-driven configuration
  • Which schemas are included
  • Snapshot and Warehouse Tables build parameters
  • with defaults and individual table overrides
  • Primary Keys and Indexes
  • Includes infrastructure features
  • Logging
  • Full error trapping
  • Continued operation after many failures
  • Statistical performance information
  • Positive response email results

Keep the DBAs Happy . . . What goes around . . .
24
Warehouse Build Steps
  • Drop sub-warehouse objects
  • Cycle through snapshots and build /
    modifysub-warehouse tables
  • Bulk insert each snapshot table into
    sub-warehouse
  • Create primary keys and indexes
  • Compute statistics
  • Switch master warehouse synonyms to point
    atnew sub-warehouse (upon successful
    completion)
  • Email rebuild results

25
Drop Sub-Warehouse Objects (Simple)
  • PROCEDURE DROP_WAREHOUSE_TABLES IS
  • cursor warehouse_tables is
  • select table_name from user_tables order by 1
  • mysql varchar2(255)
  • BEGIN
  • FOR tables_rec IN warehouse_tables LOOP
  • mysql 'DROP TABLE ' tables_rec.table_name
  • EXECUTE IMMEDIATE mysql
  • END LOOP
  • END DROP_WAREHOUSE_TABLES

26
Drop Warehouse Objects-Add Logging
  • PROCEDURE DROP_WAREHOUSE_TABLES IS
  • cursor warehouse_tables is
  • select table_name from user_tables order by 1
  • mysql varchar2(255)
  • me ERR_TYPE
  • BEGIN
  • Me.my_name 'DROP_WAREHOUSE_TABLES'
  • log_activity(me,'Drop Warehouse tables
    started','S')
  • FOR tables_rec IN warehouse_tables LOOP
  • log_activity(me,'Dropping '
    tables_rec.table_name)
  • mysql 'DROP TABLE ' tables_rec.table_nam
    e
  • EXECUTE IMMEDIATE mysql
  • END LOOP
  • log_activity(me,'Drop Warehouse tables
    completed','C')
  • END DROP_WAREHOUSE_TABLES

27
Drop Warehouse Objects-Add Error Traps
  • PROCEDURE DROP_WAREHOUSE_TABLES IS
  • cursor warehouse_tables is
  • select table_name from user_tables order by 1
  • mysql varchar2(255)
  • me ERR_TYPE
  • BEGIN
  • Me.my_name 'DROP_WAREHOUSE_TABLES'
  • log_activity(me,'Drop Warehouse tables
    started','S')
  • FOR tables_rec IN warehouse_tables LOOP
  • log_activity(me,'Dropping '
    tables_rec.table_name)
  • BEGIN
  • mysql 'DROP TABLE '
    tables_rec.table_name
  • EXECUTE IMMEDIATE mysql
  • EXCEPTION WHEN OTHERS THEN
  • INTERNAL_PROC_ERROR(me, SQLERRM, SQLCODE,
    mysql)
  • END
  • END LOOP
  • log_activity(me,'Drop Warehouse tables
    completed','C')
  • IF me.cnt_errors gt 0 THEN

28
Warehouse Build Steps
  • Drop sub-warehouse objects
  • Cycle through snapshots and build /
    modifysub-warehouse tables
  • Bulk insert each snapshot table into
    sub-warehouse
  • Create primary keys and indexes
  • Compute statistics
  • Switch master warehouse synonyms to point
    atnew sub-warehouse (upon successful
    completion)
  • Email rebuild results

29
Table Driven Configuration
. . . 6 TULSA 7 EVERETT 8 RENTON 9
AUBURN . . .
Warehouse_Sites SiteNo Schema_Name
  • Allows for Adding and Removing Sites as needed

PROCEDURE VERIFY_TABLES IS BEGIN Loop Through
Each Schema (With Error Trapping and Logging)
Verify_All_Schema_Tables(Schema_Name) End
Loop END VERIFY_TABLES
30
Build Warehouse Tables
  • Verify Schema Tables performs the major warehouse
    structure build-up work
  • 31 Flavors . . . more similar than different
  • First Schema builds majority of tables / columns
  • Follow-on schemas modify structure as needed
  • Greatest common denominator
  • Loops through all table columns usingdata
    dictionary information
  • Very similar to snapshot verification routines
  • Column names, data type, length, precision, scale

31
More Table Driven Configuration
Storage_Parms Table_Size Table_Storage
Index_Storage
Table_Sizes Table_Name Table_Size
STANDARD TABLESPACE WH128K_DATA STORAGE(INITIAL
128K NEXT 128K) NOLOGGING TABLESPACE WH128K_INDX
STORAGE(INITIAL 128K NEXT 128K)
NOLOGGING LARGE TABLESPACE WH4M_DATA
STORAGE(INITIAL 4M NEXT 4M) NOLOGGING TABLESPACE
WH4M_INDX STORAGE(INITIAL 4M NEXT 4M) NOLOGGING
Select ' ' Table_Storage into BUILD_PARMS
from Storage_Parms Where Table_Size
'LARGE' My_SQL Create_Table_Statement
BUILD_PARMS Execute Immediate My_SQL
Used when Adding a New Table
32
Warehouse Build Steps
  • Drop sub-warehouse objects
  • Cycle through snapshots and build /
    modifysub-warehouse tables
  • Bulk insert each snapshot table into
    sub-warehouse
  • Create primary keys and indexes
  • Compute statistics
  • Switch master warehouse synonyms to point
    atnew sub-warehouse (upon successful
    completion)
  • Email rebuild results

33
Insert Mechanics
  • Data Warehouse schema is the sum of every site
  • Similar technique to identifying schema
  • Loop through each schema and table
  • Dynamically build Insert Statement
  • Based on site columns
  • Include unique ID for each schema
  • Same logging techniques built-in
  • Includes actual SQL statement
  • Error trapping and continuation built-in

34
Minimizing Archive Logs with NoLogging
  • Archive logging in the production warehouse
    instance
  • Some configuration data should be archived
  • Normal Boeing Operation to have Archive Logs in
    Production
  • Warehouse data is recoverable from sites
  • Nologging specified on tablespaces and tables
  • Oracle still logs changes and inserts
  • Daily warehouse rebuild generates Significant
    Archive Log activity
  • Big Surprise with initial production use

Ignorance is Bliss ???
35
Using Bulk Inserts
  • Bulk load can be specified on inserts
  • Direct Load with sqlloader
  • Append hint with SQL
  • Only affects tables specified with NoLogging
  • Insert / APPEND / into warehouse.inventory
    (siteno, item, description)Select
    1,item,description from site1.inventory
  • Invalidates session table use till commit
  • ORA-12838cannot read/modify an object after
    modifying it in parallel

36
Warehouse Operations Log
Warehouse_Op_Log ID
Log_Date Proc Log_Type
Log_User Message Error
Log_Rows
1617893 08/08/03 020802 INSERT_SITE_TABLE_DATA
I CMPWH Inserting From WICHITA.STATION
0
0 1617894 08/08/03 020803 INSERT_SITE_TABLE_DAT
A I CMPWH Data Inserted from
WICHITA.STATION (5031 Rows) 0
5031
37
Warehouse Build Steps
  • Drop sub-warehouse objects
  • Cycle through snapshots and build /
    modifysub-warehouse tables
  • Bulk insert each snapshot table into
    sub-warehouse
  • Create primary keys and indexes
  • Compute statistics
  • Switch master warehouse synonyms to point
    atnew sub-warehouse (upon successful
    completion)
  • Email rebuild results

38
Indexing Mechanics
  • Table-driven primary key and index list
  • Uses table storage size information for
    tablespace and other storage clause information
  • Dynamically build Alter table and Create Index
    Statements
  • Same logging techniques built-in
  • Includes actual SQL statement
  • Error trapping and continuation built-in

39
Error Logging - Continuation
  • select id, log_date, proc, log_type, message
    from warehouse_op_log where id between 1576388
    and 1576390 order by id
  • 1576388 08/01/03 021211 CREATE_PK_CONSTRAINTS
    I
  • Create Primary Key STATIONHIST.PK_STATIONHIST(CRIB
    BIN,SITENO)
  • 1576389 08/01/03 021223 CREATE_PK_CONSTRAINTS
    E
  • ORA-02437 cannot validate (CMPWH1.PK_STATIONHIST)
    - primary key violated (alter table
    CMPWH1.STATIONHIST ADD CONSTRAINT PK_STATIONHIST
    PRIMARY KEY (CRIBBIN,SITENO) USING INDEX
    TABLESPACE WH128K_INDX STORAGE(INITIAL 128K NEXT
    128K) NOLOGGING)
  • 1576390 08/01/03 021223 CREATE_PK_CONSTRAINTS
    I
  • Create Primary Key TASK.PK_TASK(TASKNO,SITENO)

40
Error Logging - Propagation
  • 1570901 08/01/03 020002 REBUILD_WAREHOUSE
    S
  • REBUILDING WAREHOUSE - CMPWH1
  • 1576389 08/01/03 021223 CREATE_PK_CONSTRAINTS
    E
  • ORA-02437 cannot validate (CMPWH1.PK_STATIONHIST)
    - primary key violated (alter table
    CMPWH1.STATIONHIST ADD CONSTRAINT PK_STATIONHIST
    PRIMARY KEY (CRIBBIN,SITENO) USING INDEX
    TABLESPACE WH128K_INDX STORAGE(INITIAL 128K NEXT
    128K) NOLOGGING)
  • 1576418 08/01/03 021249 CREATE_INDEXES
    E
  • ORA-20000 1 Error Occurred in CREATE_PK_CONSTRAIN
    TS (see log for details). Last was ORA-02437
    cannot validate (CMPWH1.PK_STATIONHIST) - primary
    key violated
  • 1576471 08/01/03 022031 REBUILD_WAREHOUSE
    E
  • ORA-20000 1 Error Occurred in CREATE_INDEXES
    (see log for details). Last was ORA-20000 1
    Error Occurred in CREATE_PK_CONSTRAINTS (see log
    for details). Last was ORA-02437 cannot
    validate (CMPWH1.PK_STATIONHIST) - primary key
    violated
  • 1576728 08/01/03 024158 REBUILD_WAREHOUSE
    C
  • REBUILT WAREHOUSE - CMPWH1 (1 error encountered)

41
Warehouse Build Steps
  • Drop sub-warehouse objects
  • Cycle through snapshots and build /
    modifysub-warehouse tables
  • Bulk insert each snapshot table into
    sub-warehouse
  • Create primary keys and indexes
  • Compute statistics
  • Switch master warehouse synonyms to point
    atnew sub-warehouse (upon successful
    completion)
  • Email rebuild results

42
Warehouse Build Steps
  • Drop sub-warehouse objects
  • Cycle through snapshots and build /
    modifysub-warehouse tables
  • Bulk insert each snapshot table into
    sub-warehouse
  • Create primary keys and indexes
  • Compute statistics
  • Switch master warehouse synonyms to point
    atnew sub-warehouse (upon successful
    completion)
  • Email rebuild results

43
Positive Email Response
  • Email sent upon success and failure
  • Catastrophic failure would result in no email
  • Subject clearly states warehouse status
  • Table-driven recipient list
  • Uses dbms_smtp package
  • Build higher-level packaged functionality
  • Used for multiple purposes (such as interface
    failures)
  • Email includes statistics regarding
  • Last snapshot refresh times (system tables)
  • Roll-up timing statistics (warehouse operation
    log)
  • Warehouse table row counts (warehouse operation
    log)

44
Status Email Subject and Body
  • WARNING Snapshots NOT Healthy, but Warehouse
    Redirected - CMTWH1
  • SUCCESS Production Warehouse Redirected - CMPWH2
  • The CribMaster Oracle Data Warehouse was
    successfully rebuild and activated
  • Last Refresh of Participating Sites
  • MESA 07/31/03 014817
  • KELLY 07/30/03 233445
  • CECIL 07/30/03 231442
  • . . .
  • Attachment shows statistics

45
Email Statistics Attachment
  • Timing statistics for all major warehouse build
    steps
  • 8/11/2003 20006 REBUILDING WAREHOUSE - CMPWH2
  • 8/11/2003 20016 Dropping all tables in
    Warehouse started
  • 8/11/2003 20031 Verify Tables for all Schemas
    started
  • 8/11/2003 20211 Insert Table Data for all
    Schemas started
  • 8/11/2003 20802 Create all Primary key
    constraints started
  • 8/11/2003 21235 Create all indexes started
  • 8/11/2003 22012 Counting Rows in CMPWH2 started
  • 8/11/2003 22013 Analyze all tables in Warehouse
    started
  • 8/11/2003 24152 Starting with REVOKE-CMPWH2
  • 8/11/2003 24201 Completed REVOKE-CMPWH2

46
Email Statistics Attachment
  • Row count statistics for all warehouse tables
  • TOTAL TABLE ROWS 15288750
  • Row Counts for Significant Tables (rows over
    100,000)
  • 8/11/2003 SERIALSTATUSHISTORY 7235395
  • 8/11/2003 TRANS 2945618
  • 8/11/2003 STATIONHISTORY 669932
  • 8/11/2003 STATION 658313
  • 8/11/2003 ALTVENDOR 549365
  • 8/11/2003 ITEMSERIAL 546443
  • 8/11/2003 EMPLOYEE 486319
  • 8/11/2003 INVENTRY 359461
  • . . .

47
Failure Email
  • ERROR Production Warehouse Rebuild Failed -
    CMPWH2(All Snapshots ARE Healthy)
  • There was a problem rebuilding the warehouse.
    Cutover did NOT occur.Please review the
    warehouse_op_log for problems, correct them, and
    establish the correct new warehouse.Statement
    Executed was (BEGIN CMPWH2.WH_UTIL_MASTER.REBUILD_
    WAREHOUSE('CMPWH2') END)Error was (ORA-20000
    2 Error(s) Occurred in REBUILD_WAREHOUSE (see log
    for details).Last was ORA-20000 6 Error(s)
    Occurred in INSERT_ALL_DATA (see log for
    details).Last was ORA-20000 105 Error(s)
    Occurred in INSERT_SCHEMA_DATA (see log for
    details).Last was ORA-20000 1 Error(s)
    Occurred in INSERT_SITE_TABLE_DATA (see log for
    details).Last was ORA-00942 table or view does
    not exist)

48
What to Take Away
  • Work with COTS Supplier, Avoid Customizations
  • Data needs in a warehouse
  • Dont Over-Analyze, disk is cheap (now)
  • Plan in failure contingency in Build / Rebuild
    process
  • Table-drive the implementation
  • Decisions and Structures change with time
  • Provide Visibility of the Activity
  • Ongoing logging and positive results notification
  • Bulk Insert
  • Communications key
  • DBA, Developers, COTS Supplier, and Users

49
Forever New Frontiers
Robert Corfman TSMS System Architect Robert.a.corf
man_at_Boeing.com Please turn in your evaluation
form
Write a Comment
User Comments (0)
About PowerShow.com