Oracle Change Data Capture Jack Raitto, Development Manager - PowerPoint PPT Presentation

1 / 31
About This Presentation
Title:

Oracle Change Data Capture Jack Raitto, Development Manager

Description:

Oracle Change Data Capture Jack Raitto, Development Manager Oracle NEDC NYOUG Long Island SIG October 7, 2004 Capture your change data for FREE!* What is Oracle CDC? – PowerPoint PPT presentation

Number of Views:154
Avg rating:3.0/5.0
Slides: 32
Provided by: nyougOrgP
Learn more at: http://www.nyoug.org
Category:

less

Transcript and Presenter's Notes

Title: Oracle Change Data Capture Jack Raitto, Development Manager


1
Oracle Change Data Capture
  • Jack Raitto, Development Manager Oracle NEDC
  • NYOUG Long Island SIGOctober 7, 2004

2
Capture your change data for FREE!
Zero additional license cost over Oracle10g EE
Virtually zero source system processing cost
3
What is Oracle CDC?
  • Captures change data from operational system(s)
    as it occurs
  • Part of Extract / Transform / Load (ETL) process
    for DSS / Data warehouse, potentially other
    applications
  • Optimizes the extract phase
  • Unleashes SQL power for transformations
  • Provides management framework for change data

4
How was it done before (old way)?
5
CDC Advantages
  • Built in, custom fit, evolves with the database
  • Delivers change data when you need it, where you
    need it
  • Offers several tradeoffs between timely change
    delivery vs. source system overhead (sync, async
    hotlog, async autolog, etc.)
  • Assumes complete change management responsibility

6
CDC Advantages (concl.)
  • Captures all change data along with transaction
    information see all changes a given transaction
    made and who made them
  • Transactional consistency for changes across
    multiple source tables is guaranteed
  • Transparently coordinates sharing of change data
    across users and applications
  • You dont need rocket scientists on your staff!

7
CDC Configurations
8
How CDC Works Sync CDC
  • Uses internal triggers to capture before and/or
    after images of new and updated rows
  • Has the same performance implications as capture
    via user triggers
  • Delivers change data in real-time
  • Uses the same interface as async CDC

9
Synchronous CDC HotLog
Combined Source / Operational BI System
CDC Change Tables
ETL Process
Upsert to Load Dimension Tables
Customer
Triggers
Direct Path Insert to load Fact Tables
Order
CDC
10
How CDC Works Async CDC
  • Relational interface to Streams
  • Prepackaged Streams application
  • Asynchronously captures change data from
    redo/archive logs
  • Presents relational interface to change data
    stream
  • Can operate on source system (hot log) or staging
    system (auto log)

11
Foundations of Async CDC
Change capture Change management Warehouse loading
Async CDC
Replication Message queuing Warehouse
loading Event notification Data protection
Streams
Redo log inspection Debugging Auditing Reversing
transactions
LogMiner
12
Asynchronous CDC HotLog
Combined Source / Operational BI System
CDC Change Tables
ETL Process
Upsert to Load Dimension Tables
Customer
LogMiner
Active Redo Log
Streams
Direct Path Insert to load Fact Tables
Order
CDC
13
Asynchronous CDC AutoLog
Data Warehouse / Staging System
Source Database
CDC Change Tables
ETL Process
Upsert to Load Dimension Tables
Customer
LogMiner
Streams
Redo Logs
Direct Path Insert to load Fact Tables
Order
CDC
Arch Process
Archived Redo Logs
14
Using CDC Publish/Subscribe
  • Publisher supplies, subscribers consume change
    data
  • Model allows sharing of change data across users
    and applications
  • Coordinates retention / purge of change data
  • Prevents application from accidentally processing
    change data more than once
  • Guarantees transactional consistency of change
    data across source tables via change sets

15
Using CDC Publish/Subscribe
Subscriber 1
Subscription
Publisher
Change Data
Publication
Subscriber 2
Subscription
16
Publisher Concepts
  • Change source
  • Defines the source system to CDC
  • Change set
  • Collection of source tables for which
    transactionally consistent change data is needed
  • Change table
  • Container to receive change data
  • Is published to subscribers

17
Publisher Concepts
Source Database HQ
Staging Database DW
Change Source HQ_SRC


Source table sh.salesPROD_IDCUST_IDPROMO
_IDAMOUNT_SOLD QUANTITY_SOLD
Change Set SH_SET
Change table sales_ctPROD_IDCUST_IDPROMO
_IDAMOUNT_SOLD
Source table sh.promotionsPROMO_IDPROMO_S
UBCATPROMO_CAT PROMO_COST
Change table promo_ctPROMO_IDPROMO_SUBCAT
PROMO_CAT
18
Publish Package
DBMS_CDC_PUBLISH CREATE / ALTER /
DROP_AUTOLOG_CHANGE_SOURCE CREATE / ALTER /
DROP_CHANGE_SET CREATE / ALTER /
DROP_CHANGE_TABLE PURGE PURGE_CHANGE_SET
PURGE_CHANGE_TABLE DROP_SUBSCRIPTION
19
Using Change Data Subscribers
  • The subscriber creates a subscription from an
    available publication
  • The subscription provides a moving window (view)
    to the change data
  • Subscriptions go against a single change set and
    are therefore transactionally consistent
  • When all subscribers have advanced past old
    change data, CDC automatically and efficiently
    purges

20
Subscriber Concepts
Staging Database DW
Subscription sales_promo_list

Change Set SH_SET
Publication on sh.salesPROD_IDCUST_IDPROM
O_IDAMOUNT_SOLD
Subscriber view
spl_sales
Publication on sh.promotionsPROMO_IDPROMO
_SUBCATPROMO_CAT
Subscriber view
spl_promos
21
Subscriber View
Subscriber view spl_sales
Insert
Updatebefore
Updateafter
Insert
Insert
Delete
Insert
Insert
22
Subscriber Package
DBMS_CDC_SUBSCRIBE CREATE_SUBSCRIPTION
SUBSCRIBE ACTIVATE_SUBSCRIPTION
EXTEND_WINDOW PURGE_WINDOW
DROP_SUBSCRIPTION
23
Security
  • Sync publisher must have SELECT access to the
    source table
  • Async publisher must have EXECUTE_CATALOG_ROLE
    privilege
  • Publisher uses GRANT and REVOKE on change tables
    to control subscriber access

24
Performance Benchmark
  • Objectives
  • Determine impact on transaction time
  • Determine latency
  • Source system Oracle 10g R1 Beta, SunFire 4800
    SMP 8x900Mhz/16GB w/striped 8 x Sun StorEdge T3
    arrays (9X36.4MB each)
  • Customer insurance quote OLTP application run at
    Oracle, 250 concurrent users / 175 TPS, system
    warmed up (steady state)
  • Mixture of Inserts, Updates, Deletes, Singleton
    Selects, Cursor Fetches, Rollbacks / Commits,
    savepoints
  • Capture changes on all tables

Your mileage will vary!
25
Transaction Performance
Transaction elongated by 10 Relative impact
varies depending on other overhead
26
Transaction Performance
Transaction elongated by 8 Can reduce elongation
by adding RAC nodes / CPUs
27
Transaction Performance
Transaction elongation virtually
eliminated Change capture processing moved off
system
28
HotLog Latency Performance
About ½ the change data arrived in 1
second Virtually all the change data arrived in 2
seconds
29
Summary
  • CDC assumes the burden of change capture for you
  • Change data is guaranteed consistent and complete
  • Change data can be shared across users and
    applications effortlessly
  • CDC delivers change data where you need it, when
    you need it, and with minimal overhead

30
For More Information
  • Oracle Data Warehousing Guide, 10gR1, Chapter 16
  • Oracle PL/SQL Packages and Types Reference,
    10gR1, packages DBMS_CDC_
  • http//www.oracle.com/technology/oramag/oracle/03-
    nov/o63tech_bi.html
  • http//www.oracle.com/technology/products/bi/db/10
    g/pdf/twp_dss_ontime_etl_10gr1_0304.pdf
  • http//www.rittman.net/archives/000901.html
  • http//www.nyoug.org/cdc.pdf (Oracle9i)

31
Questions?
?
Write a Comment
User Comments (0)
About PowerShow.com