Title: PeerDirect Distributed Enterprise
1 PeerDirect Distributed Enterprise
Reach Your Technical Peak: The Final Ascent
Session IV, Q4 2003
2 PeerDirect Distributed Enterprise
Agenda
- Introduction to Database Replication
- Architecture Overview
- Configuration
- Replication Rule Design
- APIs and Event Callback
- Administration
4 Introduction to Database Replication
Why Database Replication?
- Back-up
- Fail-over
- Latency avoidance
- Load balancing
- Reporting
- Distributed environments
5 Introduction to Database Replication
Define Distributed?
Remote Offices
Data Center
Remote Users
6 Introduction to Database Replication
Goals of a Distributed Enterprise
- Latency avoidance
- Risk avoidance
- Application functionality
7 Introduction to Database Replication
What if
Remote Offices
Data Center
Remote Users
8 Introduction to Database Replication
PeerDirect Distributed Enterprise
- Breaks dependency on centralized application architectures
- Radically improves employee productivity with decentralized applications
- Disconnected use
- Creates an inherently resilient application architecture
- Eliminates the latency and availability problems caused by the centralized model
9 Introduction to Database Replication
Approaches
- Log based
  - Database activity logged for periodic replay at all locations
- Queue based
  - Middleware intercepts application-to-database activity for periodic replay at all locations
- Table based
  - Captures changes, queries source and duplicates values at all locations
10 Introduction to Database Replication
Drawbacks of each approach
- Log based
  - Log size
  - Cannot remain dormant long
  - Periodically need to re-synch
  - Inefficient use of bandwidth
  - Typically only hub and spoke topology
- Queue or Message based
  - All of the above
  - Application must be modified
- Table based
  - Not real-time
  - Increases database size
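The log-size and bandwidth drawbacks listed above can be illustrated with a minimal sketch. It assumes a simplified change log of (operation, primary key, values) tuples, which is an illustrative format and not PeerDirect's. A log-based replay ships every entry, while capturing only the net change per row ships far less.

```python
from typing import Dict, List, Optional, Tuple

# Each log entry is (operation, primary_key, row_values); this format is illustrative.
LogEntry = Tuple[str, int, Optional[dict]]

def net_changes(log: List[LogEntry]) -> Dict[int, Optional[dict]]:
    """Collapse an ordered change log into the final state per primary key.

    A value of None means the row ends up deleted.
    """
    final: Dict[int, Optional[dict]] = {}
    for op, pk, values in log:
        if op == "delete":
            final[pk] = None
        else:  # "insert" or "update": merge new values over what we have so far
            merged = dict(final.get(pk) or {})
            merged.update(values or {})
            final[pk] = merged
    return final

log = [
    ("insert", 1, {"name": "Lift Tours", "limit": 100}),
    ("update", 1, {"limit": 200}),
    ("update", 1, {"limit": 300}),
    ("insert", 2, {"name": "Hoops", "limit": 50}),
    ("delete", 2, None),
]
changes = net_changes(log)
print(len(log), "log entries ->", len(changes), "net changes")
```

Here five log entries reduce to two net changes: one final row image and one delete.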
12 Architecture Overview
The Basics
- Table based
- Vs Log or Queue based
- System tables for storing metadata
- Engine appears as another user
- Access via SQL and ODBC
13 Architecture Overview
Features and benefits
Feature
- Bi-directional update-everywhere model
- Read-write data between multiple databases
Benefit
- Replicate, synchronize, and distribute corporate data across multiple locations
14 Architecture Overview
Features and benefits
Feature
- Scheduled synchronization
- Net change compression
- Strong encryption
- Subset data using slices
Benefit
- Maximize network efficiency, reduce costs
- Security
- Controlled access to data
15 Architecture Overview
Features and benefits
Feature
- Replicates between different database types
  - MS SQL Server and Progress
  - Oracle and PostgreSQL
- Supported databases
  - Progress
  - Oracle
  - SQL Server
  - DB2
Benefit
- Share data in mixed environments
- Consolidated view of corporate data
16 Architecture Overview
Features and benefits
Feature
- Multiple topologies supported
  - Peer-to-peer
  - Hub and spoke
  - Load balance clustering
Benefit
- Provide new options for building scalable systems
- Flexible configurations
17 Architecture Overview
Features and benefits
Feature
- Auto-discovery of nodes within the replication network
Benefit
- Improve quality of service and system availability
- Improved system administration
- Allows mobile workers who are disconnected or have low-bandwidth limitations access to enterprise applications
18 Architecture Overview
Features and benefits
Feature
- Database level configuration
- Adds needed tables to replicated database to manage replication
- Does not modify user-defined tables
Benefit
- Existing application does not necessarily need to be altered
19 Architecture Overview
Patented replication technology
- Dynamic Data Slicing Architecture (DDSA)
  - Table partitioning - schema
  - Work set partitioning - query
  - Column level partitioning - fragment
  - Dynamic data migration
- Dynamic Bandwidth-Managed Partner Selection (DBP)
  - Auto-discovery, auto-balanced
  - Avoids overloading any one server
  - Backbone, server and workstation algorithms
- Collision Avoidance Methodology (CAM)
  - Record fragment management - related columns
  - Consistent resolution contracts
  - Rich API for custom resolution
20 Architecture Overview
Table-based
[Diagram: Application and Database, with the database containing:]
- Data Table: M rows, keyed by PK
- Control Tables: 1 table per database table; each Control Table has M rows, keyed by PK
- System Tables: 41 tables of setup data
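The data table and control table pairing can be sketched with SQLite as a stand-in database. This is only an illustration under stated assumptions: the table names, the `changed` flag column, and the use of triggers to flag changed rows are all hypothetical, since the slides do not say how PeerDirect populates its control tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
-- Hypothetical control table: one row per data-table row, keyed by the same PK.
CREATE TABLE customer_ctl (cust_id INTEGER PRIMARY KEY, changed INTEGER NOT NULL);
-- Assumed capture mechanism: triggers flag the row as changed.
CREATE TRIGGER customer_ins AFTER INSERT ON customer BEGIN
    INSERT OR REPLACE INTO customer_ctl VALUES (NEW.cust_id, 1);
END;
CREATE TRIGGER customer_upd AFTER UPDATE ON customer BEGIN
    INSERT OR REPLACE INTO customer_ctl VALUES (NEW.cust_id, 1);
END;
""")

def pending_changes(db):
    """Return current values of rows flagged as changed, then clear the flags."""
    rows = db.execute(
        "SELECT c.cust_id, c.name, c.city FROM customer c "
        "JOIN customer_ctl t ON t.cust_id = c.cust_id "
        "WHERE t.changed = 1 ORDER BY c.cust_id"
    ).fetchall()
    db.execute("UPDATE customer_ctl SET changed = 0")
    return rows

conn.execute("INSERT INTO customer VALUES (1, 'Lift Tours', 'Burlington')")
conn.execute("UPDATE customer SET city = 'Boston' WHERE cust_id = 1")
conn.execute("UPDATE customer SET city = 'Bedford' WHERE cust_id = 1")
first = pending_changes(conn)   # one row carrying only the latest values
second = pending_changes(conn)  # nothing left to send
print(first, second)
```

Three writes to the same row yield a single pending row carrying its current values, which is the "queries source and duplicates values" behavior of a table-based approach.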
21 Architecture Overview
PeerDirect Distributed Enterprise
PeerDirect Distributed Enterprise (PDDE)
- Replication Engine (PDRE)
- Replication Designer (PDRD)
- Replication Administrator (PDRA)
23 Configuration
Replication Engine
- Communicates via SQL through ODBC
  - The closer the Replication Engine is to the database, the better the performance
- Supports Win32 and Linux
- Configuration of InnerEdge and OuterEdge servers affects the way in which sites are replicated
24 Configuration
PDRE setup
25 Configuration
Site Types
- Site Types
  - 3 Types
    - Complete
    - High Availability
    - Low Availability
  - Based upon
    - Data set
    - Connectivity
  - Determines replication partner selection
26 Configuration
Engine placement
- Considerations
  - A Replication Engine should be on either end of any WAN connection
  - PDRE traffic is compressed and encrypted whereas ODBC traffic is not
  - On a LAN, consider fault tolerance and performance vs. cost of hardware
  - Engine may reside on a different platform
    - Win32 and Linux
27 Configuration
Site types
- Dynamic partner selection algorithms allow for load balancing and fault tolerance
- Partner selection is affected by a site's type
- Available site types
  - Low Availability
  - High Availability
  - Complete
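How site type might feed into partner selection can be sketched as follows. The slides do not describe the actual algorithm, so the ranking (Complete first, then High Availability, then Low Availability), the `active_sessions` load measure, and all names here are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical ranking: sites holding more data with better connectivity are preferred.
TYPE_RANK = {"Complete": 0, "High Availability": 1, "Low Availability": 2}

@dataclass
class Site:
    name: str
    site_type: str
    reachable: bool
    active_sessions: int = 0  # assumed load measure used for balancing

def choose_partner(candidates: List[Site]) -> Optional[Site]:
    """Pick the reachable site with the best type, breaking ties by lowest load."""
    reachable = [s for s in candidates if s.reachable]
    if not reachable:
        return None  # fault tolerance: caller retries later
    return min(reachable, key=lambda s: (TYPE_RANK[s.site_type], s.active_sessions))

sites = [
    Site("hq-1", "Complete", reachable=True, active_sessions=5),
    Site("hq-2", "Complete", reachable=True, active_sessions=1),
    Site("branch", "High Availability", reachable=True),
    Site("laptop", "Low Availability", reachable=False),
]
print(choose_partner(sites).name)
```

Preferring the less loaded of two equally ranked sites gives load balancing, and skipping unreachable sites gives a simple form of fault tolerance.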
28 Configuration
Site types
- Low Availability
  - Sites that are mostly or occasionally disconnected
  - Compact
- High Availability
  - Sites that are mostly connected but do not contain all data
  - Spine sites
29 Configuration
Simple Topologies: Inner Edge
- High Availability sites
- All or most data slices
- Reliable connectivity
Data Center
30 Configuration
Simple Topologies: Outer Edge
- High or Low availability sites
- Pertinent data slices
- Questionable connectivity
Remote Office
31 Configuration
Sample PDDE Topology
Remote Office
Remote Office
Data Center
32 Configuration
Distributed Enterprise
- Aircraft Manufacturer
  - Aircraft maintenance application
  - Maintenance records were handled either on paper or in different centralized databases
  - Maintenance issues cause commercial aircraft delays and military readiness issues
- Solution: One PeerDirect InnerEdge Server and multiple OuterEdge Workstations
  - Remote capabilities, work sets, and occasionally connected users
  - Aircraft maintenance data captured while aircraft is being maintained, resulting in significant cost savings
InnerEdge
33 Configuration
Distributed Enterprise
- Medical Imaging
  - Patient test results contain images and complex data sets
  - Data only available at site where test performed
  - Collaboration with doctors at other locations nearly impossible
- Solution: One PeerDirect InnerEdge Server and multiple OuterEdge Workstations
  - New test introduced into replication network
  - Can be viewed by physician at any location at any time
InnerEdge
35 Replication Rule Design
Basic steps to defining rules
- Specify the application database
- Select the tables to replicate
- Define fragments to minimize the number of data collisions
- Subset data into work sets
- Arrange tables in transaction sets and/or groups
- Verify and save rules
- Prepare to activate replication-enabled database
36 Replication Rule Design
Partial Sports2000 Schema
Analyze the schema
37 Replication Rule Design
Generic replication
- Unit of replication is whole record
- Changes occur at multiple locations
  - One user changes Addr
  - One user changes Limit
- Can cause False Collisions
[Diagram: record fields Name, Addr, City, Zip, Ctry, Pymt, Rating, Limit]
38 Replication Rule Design
Common solution to generic replication
- Unit of replication is each field
- Changes occur at multiple locations
  - One user changes Addr and Limit
  - One user changes Addr and Zip
- Can cause Silent Data Corruption
[Diagram: two copies of the record (Name, Addr, City, Zip, Ctry, Pymt, Rating, Limit) with conflicting changes]
39 Replication Rule Design
PeerDirect solution - fragments
- Unit of replication is a group of fields
- Changes occur at multiple locations
  - One user changes Addr and Limit
  - One user changes Addr and Zip
- Helps avoid False Collisions
- Optimizes the replication cycle
[Diagram: record fields Name, Addr, City, Zip, Ctry, Pymt, Rating, Limit grouped into fragments]
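The difference between record-level, field-level, and fragment-level replication can be checked with a small sketch. The grouping of fields into fragments below is an assumption (the diagram's grouping did not survive), and the function names are illustrative, not part of any PeerDirect tool.

```python
from typing import Dict, List, Set

FIELDS = ["Name", "Addr", "City", "Zip", "Ctry", "Pymt", "Rating", "Limit"]

# Hypothetical fragment layout: related columns are grouped together.
FRAGMENTS: Dict[str, Set[str]] = {
    "identity": {"Name"},
    "address": {"Addr", "City", "Zip", "Ctry"},
    "credit": {"Pymt", "Rating", "Limit"},
}

def units_for(granularity: str) -> List[Set[str]]:
    """Return the replication units for a given unit-of-replication choice."""
    if granularity == "record":
        return [set(FIELDS)]
    if granularity == "field":
        return [{f} for f in FIELDS]
    if granularity == "fragment":
        return list(FRAGMENTS.values())
    raise ValueError(granularity)

def collisions(changed_a: Set[str], changed_b: Set[str], granularity: str) -> List[Set[str]]:
    """Return the replication units that both users touched."""
    return [u for u in units_for(granularity) if u & changed_a and u & changed_b]

# Whole-record unit: unrelated edits to Addr and Limit still collide (a false collision).
print(collisions({"Addr"}, {"Limit"}, "record"))
# Fragment unit: the same two edits fall in different fragments and do not collide.
print(collisions({"Addr"}, {"Limit"}, "fragment"))
# Field unit: only Addr is flagged, so the other user's Zip is merged without notice.
print(collisions({"Addr", "Limit"}, {"Addr", "Zip"}, "field"))
# Fragment unit: the whole address fragment is flagged, keeping Addr and Zip together.
print(collisions({"Addr", "Limit"}, {"Addr", "Zip"}, "fragment"))
```

In the field-level case the resolved record can end up with one user's Addr and the other user's Zip, which is the silent data corruption the slides warn about. Treating the address columns as one fragment surfaces that conflict as a unit.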
40 Replication Rule Design
Introduction to work sets
- PeerDirect allows you to define the business rules for sub-setting around base tables
[Diagram: schema tables Ctry, Prod, Off, Cust, Price, Hist, Credit, Acct, ADetl, Trans, TDetl]
41 Replication Rule Design
Defining work sets
- Define the Ctry base table
- Subscribe the site to its country
42 Replication Rule Design
Work sets
- Each site stores "all and only" the data it needs
- Tables belonging to work sets are replicated based on a request, called a subscription
- Nesting work sets allows for further sub-setting of data
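The idea of subscribing a site to a slice of a base table can be sketched with plain Python data. The rows, column names, and the `work_set` function are hypothetical, and the Ctry to Cust to Acct chain is an assumed reading of the schema diagram.

```python
from typing import Dict, List, Set

# Toy data; table and column names are illustrative only.
CTRY = [{"ctry": "USA"}, {"ctry": "Canada"}]
CUST = [
    {"cust": 1, "ctry": "USA"},
    {"cust": 2, "ctry": "Canada"},
    {"cust": 3, "ctry": "USA"},
]
ACCT = [
    {"acct": 10, "cust": 1},
    {"acct": 11, "cust": 2},
    {"acct": 12, "cust": 3},
]

def work_set(subscribed: Set[str]) -> Dict[str, List[dict]]:
    """Return only the rows reachable from the subscribed base-table (Ctry) values."""
    ctry = [r for r in CTRY if r["ctry"] in subscribed]
    cust = [r for r in CUST if r["ctry"] in subscribed]
    cust_ids = {r["cust"] for r in cust}
    # Nested level: accounts follow the customers selected above.
    acct = [r for r in ACCT if r["cust"] in cust_ids]
    return {"Ctry": ctry, "Cust": cust, "Acct": acct}

usa_site = work_set({"USA"})
print([r["cust"] for r in usa_site["Cust"]], [r["acct"] for r in usa_site["Acct"]])
```

A site subscribed to "USA" receives that country's customers and their accounts, and nothing belonging to Canada.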
43 Replication Rule Design
Introduction to transactions
- A set of insert, update or delete operations
  - A single unit of work
  - Must be completed as a whole
- Offer the ability to enforce data integrity
- Two types of transactions supported
  - Transaction sets
  - Transaction groups
44 Replication Rule Design
Transaction sets
- Maintains transaction integrity of dependent table during incomplete replication session
- Updates all dependent records in a single unit of work
- Must include a base table
- Creation process similar to work sets
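The "single unit of work" behavior can be sketched with SQLite transactions. This is a generic illustration of atomic apply with rollback, not PeerDirect's mechanism. The `orders` and `order_line` tables are hypothetical, and a constraint violation stands in for a replication session that fails partway through.

```python
import sqlite3

def apply_transaction_set(db, order, lines):
    """Apply a base-table row and its dependent rows as one unit of work."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute("INSERT INTO orders VALUES (?, ?)", order)
            db.executemany("INSERT INTO order_line VALUES (?, ?, ?)", lines)
        return True
    except sqlite3.Error:
        return False

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust TEXT NOT NULL);
CREATE TABLE order_line (order_id INTEGER NOT NULL, line_no INTEGER NOT NULL,
                         item TEXT NOT NULL, PRIMARY KEY (order_id, line_no));
""")

ok = apply_transaction_set(db, (1, "Lift Tours"), [(1, 1, "skis"), (1, 2, "poles")])
# The second line is invalid (NULL item), standing in for an interrupted session.
bad = apply_transaction_set(db, (2, "Hoops"), [(2, 1, "ball"), (2, 2, None)])

order_count = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
line_count = db.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
print(ok, bad, order_count, line_count)
```

After the failed second call, neither order 2 nor its first line remains, so the dependent table never holds rows without their base-table record.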
46 APIs and Event Callback
APIs
- Available for use by all third-party developers
- Available in both Unicode and ANSI
- 32-bit C-style function calls (standard calling convention)
47 APIs and Event Callback
API categories
- Errors and Logging
- Control
- Command
- Configuration
- Database Record Information
- Event Handling
- Miscellaneous
48 APIs and Event Callback
Event handler callbacks
- Allow you to modify some behavior
- Receive various kinds of event notifications
- PDRE events include
  - Replication monitoring
  - Custom conflict resolution
  - Enhanced logging
- Write and register custom DLLs
- Sample code
  - C:\Program Files\PeerDirect\pdre\Samples\EvntClbck
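The register-then-notify pattern behind event handler callbacks can be sketched in Python. The real PDRE API is a set of C-style functions loaded from custom DLLs and is not shown in the slides, so every name here (`register`, `fire`, the `"conflict"` event, the payload keys, and the latest-wins rule) is hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical event names and handler signature; this is not the PDRE API.
Handler = Callable[[dict], dict]
_handlers: Dict[str, List[Handler]] = {}

def register(event: str, handler: Handler) -> None:
    """Add a handler to the chain for an event."""
    _handlers.setdefault(event, []).append(handler)

def fire(event: str, payload: dict) -> dict:
    """Pass the payload through each registered handler in order."""
    for handler in _handlers.get(event, []):
        payload = handler(payload)
    return payload

log: List[str] = []

def log_conflict(payload: dict) -> dict:
    # Enhanced logging: record the conflict and pass the payload on unchanged.
    log.append(f"conflict on {payload['table']} pk={payload['pk']}")
    return payload

def latest_wins(payload: dict) -> dict:
    # Custom conflict resolution: keep whichever version was modified most recently.
    local, remote = payload["local"], payload["remote"]
    winner = local if local["modified"] >= remote["modified"] else remote
    return {**payload, "resolved": winner}

register("conflict", log_conflict)
register("conflict", latest_wins)
result = fire("conflict", {
    "table": "Customer", "pk": 1,
    "local": {"limit": 300, "modified": 10},
    "remote": {"limit": 500, "modified": 12},
})
print(result["resolved"], log)
```

Chaining handlers lets logging and resolution be registered independently, which mirrors the slide's split between monitoring, logging, and conflict resolution events.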
50 Administration
Remote protocol
- PDRE process listens for commands on port 2584
  - Network security infrastructure must be configured to pass TCP data on this port
- ASCII protocol
- Provides interactive or programmatic interface
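A programmatic client for an ASCII line protocol over TCP can be sketched as follows. Only the port number comes from the slide. The command name `STATUS`, the CRLF line ending, and the one-line reply format are assumptions, and a throwaway local server stands in for the engine so the sketch runs without a PDRE installation.

```python
import socket
import threading

PDRE_PORT = 2584  # port stated in the slide

def send_command(host: str, command: str, port: int = PDRE_PORT, timeout: float = 5.0) -> str:
    """Send one ASCII command line and return the first reply line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\r\n")  # CRLF ending is an assumption
        with sock.makefile("rb") as reader:
            return reader.readline().decode("ascii").rstrip("\r\n")

def _stand_in_server(listener: socket.socket) -> None:
    """Accept one connection and answer with a made-up acknowledgement."""
    conn, _ = listener.accept()
    with conn, conn.makefile("rb") as reader:
        line = reader.readline().decode("ascii").strip()
        conn.sendall(f"OK {line}\r\n".encode("ascii"))

listener = socket.create_server(("127.0.0.1", 0))  # port 0: pick any free port
threading.Thread(target=_stand_in_server, args=(listener,), daemon=True).start()
reply = send_command("127.0.0.1", "STATUS", port=listener.getsockname()[1])
listener.close()
print(reply)
```

Against a real engine the call would target port 2584 with whatever commands the protocol actually defines. Because the protocol is plain ASCII, the same exchange could also be typed interactively through a terminal client.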
51 Administration
Remote protocol example
52 Administration
GUI Administrator
- Administers the replication network
- Maintenance of site information
- Subscribing and unsubscribing to and from slices
- Setting up replication schedules
- Allows for remote administration of all tasks