The Top 10 Reasons Why Federated Can't Succeed

Slides: 19
Provided by: IBMU422
Transcript and Presenter's Notes

1
The Top 10 Reasons Why Federated Can't Succeed
  • And Why it Will Anyway

2
But First
  • What is our purpose as a community?
  • Produce (wonderful) new ideas
  • Structure the field
  • Educate the workforce

3
A Brief History of Federation
  • Multibase @ 1980
  • Many attempts since
  • Functional
  • Relational
  • Object-oriented
  • Logic-based
  • XML
  • Still not solved (think of last night)
  • And never will be?

4
Number 10: Robustness
  • Systems fail
  • Sources slow or unavailable
  • In a distributed system, more pieces
  • => more failures
  • Users don't like failures
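
The "more pieces, more failures" point can be made concrete with a little arithmetic. The following is a minimal sketch, assuming every source fails independently and has the same availability (both are simplifying assumptions, not claims from the slides). The function name and the 99% figure are illustrative only.

```python
def federated_availability(source_availability: float, num_sources: int) -> float:
    """Probability that all sources are up at once, assuming independent failures."""
    if not 0.0 <= source_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if num_sources < 0:
        raise ValueError("num_sources must be non-negative")
    # A query that touches every source succeeds only if every source is up.
    return source_availability ** num_sources


# Availability of the whole federation shrinks as sources are added.
for n in (1, 5, 10, 20):
    print(n, round(federated_availability(0.99, n), 3))
```

Under these assumptions, ten sources that are each up 99% of the time give a federation that is fully available only about 90% of the time.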

5
Number 9: Security
  • Different systems have different security
    mechanisms
  • Hard to create a single coherent view of
    permissions
  • Distributed systems are more vulnerable
  • More points of failure
  • Hard to make security guarantees
  • Data is often the corporate jewels
  • It must be protected

6
Number 8: Updates
  • Recording change isn't always an UPDATE
  • Application semantics must be accounted for
  • Application APIs must be reckoned with
  • ACIDity isn't always achievable
  • Not all data sources display ACID properties
  • Varying degrees of support
  • Strong transaction semantics not always possible
    or appropriate
  • And always painful
  • Changes to multiple sources must be coordinated
  • Requirements for consistency vary
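
Coordinating a change across several sources is classically handled with a commit protocol. The following is a minimal in-memory sketch of two-phase commit, assuming every source can vote on a change and later commit or roll it back; as the slide notes, many real sources offer no such support. The `Source` class and its methods are hypothetical, and the sketch ignores coordinator crashes, timeouts and recovery logging.

```python
class Source:
    """Hypothetical data source that can vote on and then apply a pending change."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # False simulates a source that votes "no"
        self.pending = None
        self.committed = []

    def prepare(self, change):
        if not self.can_prepare:
            return False
        self.pending = change
        return True

    def commit(self):
        self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None


def two_phase_commit(sources, change):
    """Apply change to all sources or to none. Returns True on commit."""
    prepared = []
    # Phase 1: ask every source to prepare; stop at the first "no" vote.
    for source in sources:
        if source.prepare(change):
            prepared.append(source)
        else:
            for p in prepared:
                p.rollback()
            return False
    # Phase 2: everyone voted "yes", so tell every source to commit.
    for source in sources:
        source.commit()
    return True
```

A single source that cannot prepare forces the whole change to be abandoned, which is one reason strong transaction semantics are painful in a federation.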

7
Number 7: Configurability
  • Many architectures possible
  • Even with pre-existing sources, many choices
  • Little or no guidance on tradeoffs
  • Lots of code to install
  • Federation engine, data source clients
  • Often choices here
  • Lots of connections to define
  • Need tooling to support

8
Number 6: Administration
  • Monitoring is hard
  • Not all sources have facilities to track events
  • Variety of mechanisms for different events, and
    different sources
  • Not always APIs
  • Tuning is difficult
  • Need to understand what must change
  • Need to take appropriate actions
  • Repairing is painful
  • Distributed debugging
  • Different vendors to deal with for fixes

9
Number 5: Semantic heterogeneity
  • Hard to identify commonalities
  • Same terms, different meanings
  • Different terms, same meaning
  • Different structures representing different
    interpretations
  • Can't integrate data effectively without them
  • Can't make sensible queries
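
To illustrate "same terms, different meanings" and "different terms, same meaning", here is a minimal sketch of a per-source mapping onto a shared schema. The source names, field names and the units are all hypothetical; in practice, discovering such mappings is the hard part, and this sketch only shows what they look like once known.

```python
# Hypothetical mappings: local field -> (global field, converter).
MAPPINGS = {
    "sales_db": {
        "cust_id": ("customer_id", str),
        "revenue": ("revenue_usd", float),  # assumed to be in dollars already
    },
    "finance_db": {
        # Different term, same meaning.
        "customer_no": ("customer_id", str),
        # Same term, different meaning: assumed to be in thousands of dollars.
        "revenue": ("revenue_usd", lambda v: float(v) * 1000),
    },
}


def to_global(source, record):
    """Translate one source record into the shared schema, dropping unmapped fields."""
    mapping = MAPPINGS[source]
    out = {}
    for field, value in record.items():
        if field in mapping:
            target, convert = mapping[field]
            out[target] = convert(value)
    return out
```

Without the mapping, a query that sums `revenue` across both sources would silently mix dollars with thousands of dollars.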

10
Number 4: Insufficient Metadata
  • Need metadata to integrate, configure, administer
    and query
  • Every data source has different metadata
  • No uniform standard
  • Not always collected
  • Tools to examine and exploit it are missing

11
Number 3: Performance (Data Movement)
  • Distributed queries involve moving data
  • Geographic distribution is common
  • WAN is slow
  • Large data volumes common
  • Large numbers of objects
  • Large objects
  • Caching isn't a complete answer
  • Changes can be frequent and hard to track
  • Storage is not unlimited
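
The cost of moving data can be estimated with a back-of-the-envelope calculation. This is a rough sketch, assuming transfer time is simply volume divided by bandwidth plus a latency charge per round trip; it ignores protocol overhead, congestion and compression. The function name and the example figures are illustrative.

```python
def transfer_seconds(num_bytes, bandwidth_bits_per_sec, round_trips=1, rtt_sec=0.0):
    """Estimated time to ship num_bytes over a link, plus per-round-trip latency."""
    if bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return num_bytes * 8 / bandwidth_bits_per_sec + round_trips * rtt_sec


# Example: 10 GB over an assumed 100 Mbit/s WAN link.
print(transfer_seconds(10e9, 100e6))
```

With these example numbers, shipping 10 GB takes 800 seconds, or a little over 13 minutes, before the query does any useful work on the data.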

12
Number 2: Performance (Complexity)
  • Decision-support applications run complex queries
  • Many choices for how to execute
  • Big differences in performance among choices
  • Need data from diverse sources
  • May not have enough power in source
  • Performance at sources may vary
  • Need expensive functions of data
  • Function may not be implemented everywhere
  • Flowing the data to the function is expensive
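
One of the execution choices is whether a filter or function runs at the source or in the federation engine. A minimal sketch of the difference, measured in rows shipped, follows; it assumes a single filtered scan with a known selectivity, and the function and parameter names are hypothetical.

```python
def rows_shipped(total_rows, selectivity, pushdown_supported):
    """Rows moved to the federation engine for one filtered scan.

    If the source can evaluate the filter itself, only matching rows travel;
    otherwise every row must flow to the engine to be filtered there.
    """
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be between 0 and 1")
    if pushdown_supported:
        return round(total_rows * selectivity)
    return total_rows


print(rows_shipped(1_000_000, 0.01, pushdown_supported=True))
print(rows_shipped(1_000_000, 0.01, pushdown_supported=False))
```

A real optimizer would weigh this against the source's own processing power, which, as noted above, may be limited or variable.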

13
Number 1: Performance (Pathlength)
  • Simple queries (OLTP-like) incur huge overheads
  • Processing and networking costs
  • Simple queries are common
  • Easier to write
  • Automatically produced
  • Workflows
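
Why simple queries suffer most can be shown with a short calculation. This sketch assumes federation adds a fixed per-query cost (parsing, planning, network hops) on top of the work done at the source; the 5 ms figure and the function name are illustrative assumptions.

```python
def overhead_fraction(fixed_overhead_ms, query_work_ms):
    """Share of total query time spent on fixed federation overhead."""
    total = fixed_overhead_ms + query_work_ms
    if total <= 0:
        raise ValueError("total time must be positive")
    return fixed_overhead_ms / total


# The same assumed 5 ms overhead on a 1 ms lookup and on a 10 s report.
print(round(overhead_fraction(5, 1), 3))
print(round(overhead_fraction(5, 10_000), 4))
```

Under these assumptions the overhead dominates the OLTP-like lookup, taking more than 80% of its total time, while it is negligible for the long-running query.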

14
So Why Will Federated Succeed?
  • It has to
  • Integration is one of the top IT issues
  • And it's not going away
  • Alternatives are expensive and/or painful
  • Write it by hand
  • EAI/Workflow
  • Consolidation (warehouse, data marts)

15
So Why Will Federated Succeed? (2)
  • Simple scenarios exist
  • Don't need OLTP, high security, great robustness,
    for all applications
  • Customers know their data, or must learn anyway
  • Needs are so great, compromise is possible

16
So Why Will Federated Succeed? (3)
  • Progress on technology is being made
  • 20 years of distributed query processing
  • Plumbing in place
  • Commit protocols
  • Reliable messaging
  • Connectivity infrastructure
  • XML (basic community agreement)
  • XML data format
  • XML schema
  • Web services
  • We're getting closer

17
What would we do if it ever did work?
  • Retire?
  • Integrate the web?
  • Data grids
  • Data Google
  • P2P database?

18
For Discussion
  • Is research in this area warranted?
  • What are the most important research topics?
  • Did we miss any?