Title: Change Data Capture and Integrating NonMicrosoft Platforms with SSISSQL Server
1Change Data Capture and Integrating Non-Microsoft
Platforms with SSIS/SQL Server
Kevin Maguire Attunity kevin.maguire_at_attunity.co
m
2Agenda
- Introduction
- Integrating SQL Server with Non-Microsoft
Platforms - SSIS Components - High-Speed Connectors for
Teradata Oracle - Attunity Connectors
- Loosely Coupled Integration
- Scalable
- Recoverability and Fault Tolerance
- Change Data Capture
- Requirements and Methodologies
- Attunity Approach
- Software Demo
- Example of Attunity integrated with SSIS Studio
(ETL) - Example of Attunity High-Speed Connectors for
SSIS/SQL Server - QA
3Company Snapshot
- Integration Technology Company
- Founded 1988
- HQ in Burlington, MA
- Operations in US, Europe and Asia
Corporate
- Attunity Integration Suite
- Easy to use data application access/integration
software - Real-time Integration Change Data Capture (CDC)
- Attunity Services
- Consulting, Integration, Training Support
Products/ Services
Customers / Partners
- 1000s businesses worldwide
- Blue-chip OEM, reseller and influencing partners
4The Choice of Industry Leaders
Strong OEM partnerships with 4 of the largest
software companies in the world
4
5(No Transcript)
6Total Enterprise Connectivity
Web Applications
BI/Dashboards
Data Warehouse ODS
Enterprise Applications
7Attunity Integration Suite
Attunity Integration Suite
BI, Reporting
Windows
Data access, Adapters
Attunity Connect
Data Integration
UNIX
data
Portals
OpenVMS
Change data capture
Attunity Stream
iSeries
SOA, Web
apps
Attunity Federate
Virtual data federation
HP NonStop
Migration
z/OS
Modernization
8Attunity Studio Integrated GUI
- Server Configuration
- Metadata Management
- Runtime Management
- Remote Control
-
- Standard (eclipse)
- Consistent
- Integrated
- Drag-n-Drop
- Query Builder
- Offline mode
9Attunity Connect Data Application Connectivity
Attunity Connect Data Connectivity
Windows
UNIX
OpenVMS
iSeries
HP NonStop
z/OS
10Virtual Data Federation
Sales
Orders
Customers
Inventory
Federation Engine
Intelligent Query Optimization
10
11Single Change Capture Multiple Targets
Applications/ Business Processes
EAI MS BizTalk/BPEL
Data Warehouse
Staging Tables Changes
Replicated Tables
SSIS - ETL
11
12Integrating SQL Server with Non-Microsoft
Platforms
13Integrating SQL Server with Non-Microsoft
Platforms
- SSIS Components
- High-Speed Connectors for Teradata Oracle
- Attunity Connectors
- Mainframe
- DB2, VSAM, IMS, Adabas
- iSeries-AS/400
- DB2
- HP/Tandem NonStop
- Enscribe, SQL/MP
- HP Alpha/Integrity OpenVMS
- RMS
- Unix/Linux
- DB2
- Attunity Stream - CDC
14Oracle Connector by Attunity Architecture
Oracle
SSIS
DB
OCI
Oracle Client
15Oracle Connector by Attunity Supported plarforms
versions
9.2.0.4 and higher
SQL Server 2008 Enterprise/Developer
Oracle
SSIS
DB
Oracle Client
Any platform
Windows XP x86, x64 Windows Vista x86,
x64 Windows Server 2003 x86, x64, IA64 Windows
Server 2008 x86, x64, IA64
16Oracle Connector by Attunity High Performance
Architecture
- Oracle integration (using OCI)
- Bulk extract using OCI Array Binding
- Bulk load using OCI Direct Path (fastest,
constraints must turned off) FAST LOAD - Bulk load using
- OCI Array Binding
- SSIS integration
- Direct integration into internal buffering APIs,
cutting through .NET and other layers
17Teradata Connector by Attunity Architecture
18Teradata Connector by Attunity Supported
plarforms versions
SQL Server 2008 Enterprise/Developer
6.0 and higher
Teradata
SSIS
Teradata ODBC
DB
Teradata TPT 12
Any platform
Windows XP x86, x64 Windows Vista x86,
x64 Windows Server 2003 x86, x64 Windows Server
2008 x86, x64
19Teradata Connector by Attunity High Performance
Architecture
- Teradata integration (using TPT 12)
- Bulk extract using TPT FastExport
- Bulk load using TPT FastLoad
- Incremental load using TPT TPUMP
- SSIS integration
- Direct integration into internal buffering APIs,
cutting through .NET and other layers
20Loosely Coupled Integration
21Loosely Coupled Integration
- Definition
- Loose coupling describes an approach where
integration interfaces are developed with minimal
assumptions between the sending/receiving
parties, thus reducing the risk that a change in
one application/module will force a change in
another application/module.
22Loosely Coupled Integration
- Loose coupling has multiple dimensions.
Integration between two applications may be
loosely coupled in time using Message-oriented
middleware, meaning the availability of one
system does not affect the other. - Alternatively, integration may be loosely coupled
in format using middleware to perform Data
transformation, meaning differences in data
models do not prevent integration.
23Loosely Coupled Integration
- Scalable
- Data and functionality can be reused with new
applications - Data and functionality can be distributed across
computing resources to increase scale - Fault Tolerance and Recoverability
- Failure of a resource doesnt bring down other
integrated components - System resumes operations once resources come
back online
24Real-time Change Data Capture
25What is CDC?
Change Data Capture (CDC) is an innovative
integration technology, which is based on the
identification, capture, and delivery of only the
changes made to enterprise data sources. CDC is
changing data integration, making it efficient
and real-time.
26Traditional ETL vs. CDC-ETL
load
extract
transform
Operational Data Sources
DW
ETL Engine
26
27Why do we need CDC ?
- Improve decision making with information
delivered in real-time - Shrinking batch windows
- Handle increased data volumes
- Cost reduction of infrastructure
- Reduce impact of system failures
28Why do we need CDC ?
- Saves Money!!!
- No need to buy increased computing power to
increase update speed - Environmentally Friendly!!
- Think of all the unnecessary CPU cycles moving
data that has already been moved every single day - Less CPU cycles less electricity
29CDC Approaches
- Use timestamps in extracts and file dumps
- Updating the systems is expensive
- Extract and compare current data to its history
- Programming intensive, highly inefficient, not
scalable - Use custom-built triggers
- Add overhead to online transactions
- Programming intensive
- Complex to handle recovery and guaranteed change
delivery
30CDC Design Goals
- Non-intrusive
- Source code
- Performance
- Database
- Capture latency what is real-time
- Multiple data sources
- Including non-relational
- Open architecture
- Loosely Coupled
31CDC with Log Files
- Reading database log files
- Committed and Uncommitted changes
- Transaction Demarcation
- Redundant Changes
- Log-based CDC is the most successful technique
- Least processing load on OLTP systems
- Doesnt require alteration of OLTP data sources
- Leverages native change capture logging of source
system
32Types of Changes Captured
- Records
- Capture changes at the record/field level
- Each change is a unique event
- Used for updating Data Stores and Data Warehouse
- Transactions
- Capture changes at the record/field level
- Group the changes as a transaction
- Used for Alerts, Workflow and Application
Synchronization
33Attunity Stream Real-time Change Data Capture
34(No Transcript)
35Fault Tolerance and Recovery
- Decoupled environment
- ETL/EAI Consumers Processes can continue
processing changes if connection to agent is
down. - Broken communication with the agent results in
periodic reconnection retries. Stream position is
re-established when reconnected.
35
36Fault Tolerance and Recovery
- Change router maintains position of last event
read from the agent to recover from disconnects - Change router maintains oldest event of oldest
active transaction to recover from router
failures - Change tables keep original unique IDs of changes
to support consistent recovery without duplicates
37Attunity CDC Architecture Summary
- Performance
- High throughput (thousands records/second at
customer trials) - Concurrent consumers
- Low Overhead
- Log-based CDC
- Staging area off the source platform
- Scalability linear, no limitations
- Many sources with common architecture and tools
- Open Architecture (not bound to replication)
- Changes available as database records
- Programmatic provides SQL and XML interfaces
- Fits enterprise architectures and technologies
38SSIS CDC Accelerator Components
- Attunity CDC Configuration Wizard
- Define sources and target tables
- Configure required CDC Agents
- Configure change routing and staging rules
- CDC Package Builder Wizard
- Map source to target
- Assign tables into CDC processing packages
- Automatically generates/updates CDC processing
packages - CDC Management Tables
- CDC processing coordination and fault-resilience
- Referential integrity
38
38
39Software DemoExample of Stream integrated with
SSIS Studio (ETL)
40Example of Attunity integrated with SSIS Studio
(ETL)
- Using SSIS and High-Speed Oracle Connector from
Attunity - Writing changes to an Oracle Database
- Using Attunity Stream (CDC) to capture changes
from Oracle to update a SQL Server Data Store in
Real-time
41Demo kevin.maguire_at_attunity.com
SSIS
SQL Server 2008
Using Attunity Stream Captured Changes written
into SQL Server
Oracle
Using Attunity High-Speed Connector for SQL
Server-Oracle Writing Changes to Oracle
42The Attunity Difference
- Minimal intrusion CDC (no triggers)
- Support bulk and change-based data extraction
- Support many legacy data sources
- Proven Partnerships with Leading ETL Tools
- Built in support for SSIS
- Independent, un-bundled. Small footprint
42
43Summary - Why Attunity Stream?
- Improve decision making with information
delivered in real-time - Shrinking batch windows
- Handle increased data volumes
- Cost reduction of infrastructure
- Reduce impact of system failures