Designing a Data Exchange - Best Practices - PowerPoint PPT Presentation

Slides: 27
Provided by: hul6
1
Designing a Data Exchange - Best Practices
  • Data Exchange Scenarios
  • Sender vs. Receiver-initiated exchanges
  • Node Design
  • Best Practices
  • Handling Large Transactions
  • State Management
  • Data Services
  • Data Validation
  • Schema Design

2
Data Exchange Scenarios
3
Requesting Data (1 of 3)
  • Simple Query
  • Synchronous process
  • Ideal for small data sets
  • Ideal for both ad hoc and planned exchanges
  • Onus is on the requester to initiate the exchange

4
Requesting Data (2 of 3)
  • Solicit with Download
  • Asynchronous process
  • Good for larger datasets
  • Data Provider can schedule processing of request
  • Requester can use GetStatus to see if data is
    ready yet
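
The polling side of Solicit with Download can be sketched as follows. This is a minimal illustration, not a real node client: the FakeNode class, its solicit and download methods, the transaction ID, and the "Pending"/"Completed" status strings are all assumptions made for the example. Only the idea of calling GetStatus until the data is ready comes from the slide.

```python
import time


class FakeNode:
    """Hypothetical in-memory stand-in for a data provider node."""

    def __init__(self, polls_until_ready=2):
        self._remaining = polls_until_ready

    def solicit(self, service, params):
        # A real node would queue the request; here we only return an ID.
        return "txn-0001"

    def get_status(self, transaction_id):
        # Report "Pending" until the simulated processing has finished.
        if self._remaining > 0:
            self._remaining -= 1
            return "Pending"
        return "Completed"

    def download(self, transaction_id):
        return b"<Facilities/>"


def solicit_and_download(node, service, params, interval=0.01, max_polls=10):
    """Solicit, poll GetStatus until the data is ready, then download it."""
    txn = node.solicit(service, params)
    for _ in range(max_polls):
        if node.get_status(txn) == "Completed":
            return node.download(txn)
        time.sleep(interval)  # wait before asking again
    raise TimeoutError(f"{txn} not ready after {max_polls} polls")


result = solicit_and_download(FakeNode(), "RCRA.GetFacilities", ["WA"])
```

Capping the number of polls keeps a requester from waiting forever on a request the provider has scheduled for much later.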

5
Requesting Data (3 of 3)
  • Solicit with Submit
  • Asynchronous process
  • Good for larger datasets
  • Does not require the requester to continuously
    poll the data provider to see if data is ready

6
Sending Data (1 of 2)
  • Simple Submit
  • Very simple and very common process
  • Typical for traditional regulatory flows
  • Hides data, since it is not exposed as a service

7
Sending Data (2 of 2)
  • Notify with Download
  • Asynchronous approach to Simple Submit
  • Receiver can perform download at the time of
    their own choosing

8
Data Exchange Scenarios
  • Nodes wait for requests
  • Nodes may initiate actions (e.g., Submit)
  • How can a node do both?

9
Node Components
Example Node Architecture
10
Node Components
  • Node can be divided into components, each playing
    a different role
  • The Web Services Interface
  • Acts as a listener for inbound requests and
    submissions
  • Hosted on a Web Server (e.g., IIS, WebSphere)
  • Should not do any heavy lifting (e.g., data
    processing)

11
Node Components (continued)
  • Request Processor
  • Performs all data processing
  • Composes XML files for outbound delivery
  • Decomposes and processes inbound XML files
  • Coupled with a scheduler component
  • Enables the node to process Solicit requests at a
    time of the node administrator's choosing
  • Automatically kicks off outbound processes (e.g.,
    daily Submit)
  • Flow agnostic
  • Decoupled from specific flow implementations
  • Ideally installed on an Application Server
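
The split between a lightweight listener and a separate Request Processor can be sketched as a simple queue hand-off. The Node class, its method names, and the transaction ID format below are hypothetical; a real node would persist the queue and run the processor from its scheduler rather than calling it inline.

```python
import itertools
import queue


class Node:
    """Hypothetical node: the listener only enqueues, the processor does the work."""

    def __init__(self):
        self._pending = queue.Queue()
        self._ids = itertools.count(1)

    def receive(self, document: bytes) -> str:
        """Web services interface: store the document and return immediately."""
        txn_id = f"txn-{next(self._ids)}"
        self._pending.put((txn_id, document))
        return txn_id  # no parsing or database work happens here

    def process_pending(self, handler) -> dict:
        """Request processor: drain the queue, e.g. when a scheduler fires."""
        results = {}
        while not self._pending.empty():
            txn_id, document = self._pending.get()
            results[txn_id] = handler(document)
        return results


node = Node()
first = node.receive(b"<a/>")
second = node.receive(b"<bb/>")
processed = node.process_pending(len)  # len stands in for real processing
```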

12
Node Components (continued)
  • Node Administration Utility
  • Create and manage local accounts
  • Install new data exchange components
  • Set processing schedules
  • Audit Node activity
  • Extract documents (inbound and outbound should be
    stored)

13
Node Components (continued)
  • Flow-specific components
  • Discrete components tailored for a specific data
    exchange
  • Hot-swappable
  • The services interface is generic
  • Node configuration determines which services are
    internal or public
  • Node configuration determines whether a given
    service is for Query or Solicit
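
One way to picture hot-swappable, flow-specific components behind a generic interface is a small registry driven by node configuration. Everything here is an assumption for illustration: the FlowRegistry class, the shape of the configuration dictionary, and the handler itself.

```python
class FlowRegistry:
    """Hypothetical registry mapping data service names to flow handlers."""

    def __init__(self, config):
        # config: service name -> {"public": bool, "methods": {"Query", "Solicit"}}
        self._config = config
        self._handlers = {}

    def install(self, service, handler):
        """Add or replace a handler without touching the rest of the node."""
        self._handlers[service] = handler

    def invoke(self, service, method, params, internal_caller=False):
        settings = self._config.get(service)
        if settings is None or service not in self._handlers:
            raise LookupError(f"unknown service: {service}")
        if not settings["public"] and not internal_caller:
            raise PermissionError(f"{service} is internal")
        if method not in settings["methods"]:
            raise ValueError(f"{service} does not support {method}")
        return self._handlers[service](*params)


config = {"RCRA.GetFacilityList": {"public": True, "methods": {"Query"}}}
registry = FlowRegistry(config)
registry.install("RCRA.GetFacilityList", lambda state: [f"{state}-FAC1"])
facilities = registry.invoke("RCRA.GetFacilityList", "Query", ["WA"])
```

Because the node only checks configuration and dispatches, a flow component can be replaced by calling install again with a new handler.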

14
Node Components (continued)
Flow-to-Node Interface
15
Large Transactions
  • Can cause problems in several areas
  • Data retrieval (SQL)
  • XML serialization (sender side)
  • Transmission over Internet
  • XML deserialization (receiver side)
  • Schema validation (both sender and receiver)

16
Large Transactions
  • Stage data in a model similar to that which is
    used by the schema
  • XML is hierarchical whereas an RDBMS is relational
  • More secure
  • Source system unaffected by node operations
  • Index query parameter fields (SQL)

17
Large Transactions (continued)
  • Use an asynchronous exchange
  • Use Solicit, not Query
  • Schema design considerations
  • Schema KEY/KEYREF discouraged
  • Element naming may significantly affect file size
  • <MailingAddressStateUSPSCode>OR</MailingAddressStateUSPSCode>
  • Query costing
  • Calculate the size of a given result set (e.g.,
    COUNT(*)) before running the full query.
  • Not much experience in this area yet
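
Query costing can be sketched with an in-memory SQLite database: count the matching rows first, then decide whether the request is small enough for a synchronous Query. The table and column names and the row threshold are assumptions for the example, not values from any flow.

```python
import sqlite3

MAX_ROWS_FOR_QUERY = 1000  # assumed threshold; tune per flow


def choose_exchange(conn, state_code, limit=MAX_ROWS_FOR_QUERY):
    """Count matching rows first, then pick a synchronous or asynchronous path."""
    (row_count,) = conn.execute(
        "SELECT COUNT(*) FROM facility WHERE state_code = ?", (state_code,)
    ).fetchone()
    method = "Query" if row_count <= limit else "Solicit"
    return method, row_count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facility (id INTEGER PRIMARY KEY, state_code TEXT)")
# Index the query parameter field so the count itself stays cheap.
conn.execute("CREATE INDEX idx_facility_state ON facility (state_code)")
conn.executemany(
    "INSERT INTO facility (state_code) VALUES (?)",
    [("WA",)] * 5 + [("OR",)] * 2,
)
small = choose_exchange(conn, "WA")
large = choose_exchange(conn, "WA", limit=3)
```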

18
Large Transactions (continued)
  • A well-designed flow can help avoid large
    transactions
  • List services can return only high-level data
  • Scenario 1
  • RCRA.GetFacilities(WA)
  • Scenario 2
  • RCRA.GetFacilityList(WA)
  • RCRA.GetFacilityDetail(WA,FACID1234)
  • Data service parameters can be used to limit
    transaction size
  • Scenario 3
  • RCRA.GetFacilitiesByType(WA,LQG)
  • All options affect schema design

19
Large Transactions (continued)
  • File compression
  • Zipping files can reduce file size by over 90%
  • Compact storage (archiving)
  • Significant reduction in time to transmit
  • Disk I/O versus memory I/O
  • If possible, avoid techniques that require the
    system to read the entire document into memory in
    order to process it. This is a tough one.
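
Both points can be illustrated with the Python standard library: gzip for compression and a streaming parser that handles one record at a time. The sample document and its element names are made up for the example, and the actual compression ratio depends heavily on the data.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Build a repetitive sample document (element names are illustrative).
records = "".join(
    f"<Facility><MailingAddressStateUSPSCode>WA</MailingAddressStateUSPSCode>"
    f"<FacilityIdentifier>{i}</FacilityIdentifier></Facility>"
    for i in range(5000)
)
xml_bytes = f"<Facilities>{records}</Facilities>".encode("utf-8")

compressed = gzip.compress(xml_bytes)
ratio = len(compressed) / len(xml_bytes)  # fraction of the original size


def count_facilities(stream):
    """Stream the document instead of building the whole tree in memory."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "Facility":
            count += 1
            elem.clear()  # drop the finished record's children and text
    return count


# Decompress and parse as a stream, without writing an unzipped copy first.
total = count_facilities(gzip.open(io.BytesIO(compressed), "rb"))
```

Verbose element names repeat on every record, which is part of why XML of this kind compresses so well.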

20
State Management
  • State Management is required any time two systems
    must be synchronized
  • Contrast to Data Publishing exchange
  • Typically the sender's burden, but does not have
    to be
  • Partial rejects compound the difficulty

21
State Management (continued)
  • Flagging source data
  • Set submission status indicator on source data
  • Complexity is directly related to transaction
    granularity
  • Compounded if record-level rejects are performed

22
State Management (continued)
  • Exchange Network Header
  • Same schema can be used to perform different
    transactions
  • Can remove the need for TransactionCode (e.g.,
    INSERT, UPDATE, DELETE) in schema
  • Delta to derive data changes since last submit
  • Many systems do not store deleted data
  • Compare last submission snapshot with current
    snapshot, derive what has changed
  • Incremental and full refresh services
  • e.g., Facility Flow
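
The snapshot comparison can be sketched as a dictionary diff keyed by record identifier. The record IDs and fields are invented for the example, and the INSERT/UPDATE/DELETE labels simply reuse the transaction codes mentioned above.

```python
def derive_delta(previous, current):
    """Compare two snapshots keyed by record ID and classify the changes."""
    inserts = {k: v for k, v in current.items() if k not in previous}
    deletes = sorted(k for k in previous if k not in current)
    updates = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    return {"INSERT": inserts, "UPDATE": updates, "DELETE": deletes}


last_submission = {"FAC1": {"name": "Acme"}, "FAC2": {"name": "Bolt"}}
current_snapshot = {"FAC1": {"name": "Acme Corp"}, "FAC3": {"name": "Cog"}}
delta = derive_delta(last_submission, current_snapshot)
```

Deletions are found without the source system having to store deleted rows: FAC2 is reported only because it was in the last snapshot and is missing from the current one.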

23
Data Service Best Practices
  • Data service naming conventions
  • Prefix.ActionObjectByParameter(s)
  • e.g., FacID.GetFacilityByName
  • Work in Progress
  • What about versioning?
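
A naming convention like Prefix.ActionObjectByParameter(s) can be checked mechanically. The regular expression below is one possible reading of the convention (a single capitalized action word, an object, and an optional By clause), not an official rule.

```python
import re

# Assumed reading of Prefix.ActionObjectByParameter(s); not an official grammar.
SERVICE_NAME = re.compile(
    r"^(?P<prefix>[A-Za-z][A-Za-z0-9]*)\."
    r"(?P<action>[A-Z][a-z]+)"
    r"(?P<object>[A-Z][A-Za-z0-9]*?)"
    r"(?:By(?P<parameters>[A-Z][A-Za-z0-9]*))?$"
)


def parse_service_name(name):
    """Split a data service name into its parts, or return None if it does not fit."""
    match = SERVICE_NAME.match(name)
    return match.groupdict() if match else None


parsed = parse_service_name("FacID.GetFacilityByName")
no_filter = parse_service_name("RCRA.GetFacilities")
invalid = parse_service_name("getfacility")
```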

24
Data Services Best Practices
  • Documenting data services
  • Data Service name
  • Whether the service is supported by Query,
    Solicit, or both
  • Parameters
  • Parameter Name
  • Index (order)
  • Required/Optional
  • Minimum/Maximum allowed values
  • Data type (string, integer, Boolean, Date)
  • Whether multiple values can be supplied to the
    parameter
  • Whether wildcard searches are supported and
    default wildcard behavior
  • Special formatting considerations
  • Access/Security settings
  • Return schema
  • Special fault conditions
  • Wildcards
  • Parameter delimiter (pipe character)
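
The documentation items above map naturally onto a small machine-readable record. This sketch assumes the pipe character separates multiple values supplied to one parameter; the class names, the GeneratorType parameter, and the return schema file name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ParameterDoc:
    name: str
    index: int            # position in the parameter list
    required: bool
    data_type: str        # e.g. "string", "integer", "Boolean", "Date"
    multiple_values: bool = False
    wildcards: bool = False


@dataclass
class ServiceDoc:
    name: str
    methods: tuple        # "Query", "Solicit", or both
    return_schema: str
    parameters: list = field(default_factory=list)

    def parse_arguments(self, raw_values):
        """Pair positional values with documented parameters, splitting on '|'."""
        parsed = {}
        for doc in sorted(self.parameters, key=lambda p: p.index):
            value = raw_values[doc.index] if doc.index < len(raw_values) else None
            if value is None:
                if doc.required:
                    raise ValueError(f"missing required parameter: {doc.name}")
                continue
            parsed[doc.name] = value.split("|") if doc.multiple_values else value
        return parsed


service = ServiceDoc(
    name="RCRA.GetFacilitiesByType",
    methods=("Query", "Solicit"),
    return_schema="RCRA_FacilityList_v1.0.xsd",  # hypothetical file name
    parameters=[
        ParameterDoc("StateCode", 0, True, "string"),
        ParameterDoc("GeneratorType", 1, False, "string", multiple_values=True),
    ],
)
arguments = service.parse_arguments(["WA", "LQG|SQG"])
```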

25
Data Validation Best Practices
  • XML instance files should be validated against
    the schema by the sender before submittal
  • CDX offers pre-submittal validation services
    for some flows
  • Schematron (Doug Timms)

26
Schema Design Best Practices
  • DRC 1.0 and DRC 1.1
  • Schema Namespace
  • Schema Versioning
  • Exchange Network Schema Types
  • Use the Shared Schema Components