1
Group Outbrief: Data Replication
  • Ken Birman, John Connolly, Dan Geer, Barbara
    Liskov, Peng Liu, Mike Reiter (and others)

2
Need for a layered process
  • NSF runs a mix of long- and short-term research
    projects
  • But industry expects concretely applicable
    solutions (ideally, in COTS products)
  • This suggests the need for a multi-stage process:
    NSF long-term program → NSF near-term program →
    industry research partners demonstrate value →
    actual use, COTS uptake
3
A problem with multiple dimensions
  • Big vs. small institutions
    - The big ones already run many data centers and
      have huge in-house capabilities
    - Small ones may outsource, limiting their options
  • Focus on cost-cutting and productivity
    - Technology is highly appealing if it also cuts costs
    - But some CIP issues transcend cost

4
Problems and non-problems
  • Our challenge? Offer replication technologies that
    enhance CIP and also deliver measurable cost benefits
  • Also consider new capabilities that improve user
    productivity and competitiveness
  • Not every research product will reach industry
    overnight. The pipeline can be slow!

5
An old problem with new facets
  • Massive scale of financial data centers poses
    challenges but also opportunities
  • The desire for hot standbys has many ramifications
  • The degree of consistency needed must be studied;
    there is perhaps a spectrum of requirements
  • Need to repair and recover as quickly as
    possible after a major disruption
  • These are systems of systems, with huge numbers of
    components running concurrently

6
The nature of data is changing
  • Data rates are growing rapidly
  • A range of consistency requirements
    - Big transactions have stringent requirements
    - For other purposes, weaker needs apply
  • Might use AI techniques to recognize anomalies,
    and perhaps even to repair them in some cases
    (a minimal sketch follows this list)
  • Archival issues pose a range of concerns
    - Data decay has become a worry
    - Need to store data for years, yet also be sure
      it can be deleted later
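
A minimal Python sketch of the statistical flavor such anomaly recognition might start from: keep a sliding window of observed update rates and flag outliers. The window size and threshold are illustrative assumptions, not anything specified in the talk.

    from collections import deque
    from statistics import mean, stdev

    class RateAnomalyDetector:
        """Flag update rates that deviate sharply from recent history."""

        def __init__(self, window=100, z_max=4.0):
            self.samples = deque(maxlen=window)  # recent rate observations
            self.z_max = z_max                   # deviation threshold, in std-devs

        def observe(self, rate):
            """Return True if `rate` looks anomalous relative to the window."""
            anomalous = False
            if len(self.samples) >= 10:          # need some history first
                mu, sigma = mean(self.samples), stdev(self.samples)
                anomalous = sigma > 0 and abs(rate - mu) / sigma > self.z_max
            self.samples.append(rate)
            return anomalous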

7
Scalability poses questions of its own
  • Massive scale of financial data centers poses
    challenges but also opportunities
  • High data rates stress solutions; high latency
    relative to throughput is also a new issue
  • Diversity of transactions
    - Some transactions are of critical value, others
      less so; criteria for a good solution will vary
      (extreme reliability vs. throughput)
    - Some transactions aren't even directly
      represented as such (for example, publication of
      the 10-year bond rate is a transaction, but not
      of a conventional sort; it is also an update to
      application state)
  • Even to say "the data" is risky; we need a
    spectrum of options for a spectrum of uses

8
Organization of research agenda items
  • Traverse the stack bottom-up:
    - Data transport
    - Core replication technologies
    - Systems-level issues
    - Software engineering tools
    - Higher-level systems-of-systems issues

9
Research in data transport
  • Explore ways of connecting data centers to
    extremely high-bandwidth networks
    - E.g., LambdaRail: 32 × 10 Gb (later 40 Gb)
      Ethernets
    - An example of a high-value testbed; in fact many
      testbeds are needed (but that's another topic)
  • But latencies are (relatively) high
    - TCP won't work in such settings (see the sketch
      after this list)
    - A solution should benefit many communities
  • Research topic: Develop new protocols that can
    handle these torrents of data
    - May need to encrypt or compress data on the fly
    - End-to-end correctness criteria need to be
      revisited
    - Could BFT be dropped to a low level in some way?
    - Must imagine clusters at both ends of any link
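
Why TCP struggles here is mostly window arithmetic: on a long fat pipe, the sender must keep a full bandwidth-delay product in flight. A back-of-envelope Python calculation; the 50 ms RTT is an assumed, illustrative figure.

    def bdp_bytes(bandwidth_bps, rtt_s):
        """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
        return bandwidth_bps / 8 * rtt_s

    # One 10 Gb/s lambda with an assumed 50 ms cross-country RTT:
    print(bdp_bytes(10e9, 0.050) / 2**20)   # ~60 MiB must be in flight
    # Classic TCP without window scaling caps the send window at 64 KiB,
    # which on this link sustains only 64 KiB / 50 ms, about 10 Mb/s;
    # three orders of magnitude below capacity. Hence the interest in new
    # transport protocols (or at least aggressive tuning) at these rates.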

10
Scalability of replication
  • Scalability of replication technologies:
    - Classical deterministic schemes with huge numbers
      of groups, overlapping groups, huge groups, or
      groups with some members far away
    - Scalable BFT? Other hardened schemes?
    - Probabilistic techniques: advantages and
      limitations. Programming with probabilistically
      consistent substrates (a gossip sketch follows
      this list)
  • Opportunity to use innovative techniques to
    detect and heal inconsistency in massively
    replicated systems, or even across systems
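
As a concrete instance of a probabilistically consistent substrate, a minimal push-gossip round in Python. The fanout, group size, and convergence test are illustrative assumptions.

    import random

    def gossip_round(replicas, fanout=2):
        """One synchronous push round: each replica forwards what it knew
        at the start of the round to `fanout` random peers. An update
        reaches all n replicas in O(log n) rounds with high probability;
        there is no deterministic guarantee, which is the trade-off."""
        n = len(replicas)
        snapshot = [set(r) for r in replicas]       # state at round start
        for i in range(n):
            for j in random.sample([k for k in range(n) if k != i], fanout):
                replicas[j] |= snapshot[i]          # peer merges pushed updates

    # Inject one update at replica 0 and gossip until everyone has it.
    replicas = [set() for _ in range(16)]
    replicas[0].add("update-1")
    rounds = 0
    while any(r != replicas[0] for r in replicas):
        gossip_round(replicas)
        rounds += 1
    print("converged after", rounds, "rounds")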

11
Replication with barriers
  • Institutional barriers have proliferated
    - E.g., trading vs. investment banking; research
      vs. private-client investment
  • These create logical barriers visible at the
    replication layer, and offer a model of certain
    kinds of insider threats
  • The scenario creates new research challenges:
    replication that is aware of barriers
  • Need a representation of business rules that can
    be used to infer constraints (sketched after this
    list)
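
One way such a rule representation might bottom out, sketched in Python. The compartment names and rule encoding are hypothetical, chosen only to make "replication aware of barriers" concrete.

    # Hypothetical business rules: pairs of compartments between which
    # data must not flow (e.g., Chinese walls between business lines).
    BARRIERS = {
        frozenset({"trading", "investment-banking"}),
        frozenset({"research", "private-client"}),
    }

    def may_replicate(src, dst):
        """A barrier-aware replication layer would run a check like this
        before propagating an update from one replica group to another."""
        return frozenset({src, dst}) not in BARRIERS

    assert may_replicate("trading", "trading")
    assert not may_replicate("research", "private-client")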

12
Theory of replication
  • Even what we know needs to be re-examined!
  • Can groups be oblivious to one another, or must
    some form of system-wide consensus be employed to
    obtain consistency?
  • Are there inherently non-scalable properties, or
    just implementations?
  • When should a large-scale system use
    probabilistic techniques and tools vs.
    deterministic ones?
  • What issues arise in systems of systems?

13
Transactions and BFT
  • Database transactions are widely used with
    replicated data
    - Can we adapt transactions to backup data centers?
    - Move to asynchronous transaction semantics?
  • We also know about Byzantine fault tolerance in
    servers. But what about Byzantine clients?
    - Protection against seemingly legitimate clients
      who seek to misuse an API to disrupt a system
  • Explore extension of the Byzantine model to cover
    the full spectrum of issues (the classical quorum
    arithmetic is sketched after this list)
  • Doing this would have the added benefit of
    protecting against many kinds of accidents
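
The server-side bounds are classical: tolerating f Byzantine servers takes n = 3f + 1 replicas with quorums of 2f + 1. A small Python helper makes the arithmetic concrete; the open question raised on this slide, Byzantine clients, is exactly what these bounds do not address.

    def bft_sizes(f):
        """Classical Byzantine fault tolerance bounds for f faulty servers:
        n = 3f + 1 replicas and quorums of 2f + 1, so any two quorums
        intersect in at least f + 1 replicas, i.e., at least one correct one."""
        return {"replicas": 3 * f + 1, "quorum": 2 * f + 1}

    for f in (1, 2, 3):
        print(f, bft_sizes(f))   # f=1 -> 4 replicas, quorum of 3, and so on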

14
Software Engineering Options
  • Could a compiler automate the creation and
    management of checkpoints, for transmission to a
    remote backup site? (The runtime half is sketched
    after this list.)
  • How can data replication be integrated into
    programming environments, like .NET's CLR?
    - Type-checking issues, optimization challenges,
      correct presentation of the replication technology
  • Must seamlessly equip developers with tools to
    build better, less complex systems
    - For example, aspect-oriented programming is
      yielding productivity benefits
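
What compiler-managed checkpointing would reduce to at runtime, as a deliberately naive Python sketch: snapshot the application state and stream it to the backup site. The host, port, and pickle-over-TCP encoding are assumptions for illustration; a real system would checkpoint incrementally and atomically.

    import pickle
    import socket

    def ship_checkpoint(state, backup_host, port=9999):
        """Serialize the application state and send one length-prefixed
        snapshot to a remote backup site over TCP."""
        blob = pickle.dumps(state)
        with socket.create_connection((backup_host, port)) as sock:
            sock.sendall(len(blob).to_bytes(8, "big"))   # 8-byte length prefix
            sock.sendall(blob)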

15
Virtualization of execution env.
  • Use virtualization to enhance functional
    replication opportunities (and security)
  • A virtualized system can more easily be
    reconstituted after a major disruption
  • Data replication is the key to making this work
  • Also offers potential for sandboxing, containing
    many kinds of security breaches
  • "Trading room in a box" offers intuition into the
    goal here: open the box, turn the key

16
Higher level systems issues
  • Major institutions run systems that talk to
    counterpart systems at other institutions
  • Need attention to rule/policy representation,
    composition, reconciliation
  • Concerns about consistency of this type of data
  • Conjunction of firewalls and barriers with the
    need to communicate
  • Aggressive replication brings new risks:
    vulnerability associated with rapid change

17
Monitoring, management, control
  • Develop new methods for managing, monitoring, and
    controlling systems
  • The goal would be to automate what is now manual,
    but simultaneously to facilitate automated
    regeneration of lost capabilities in the event of
    an outage or attack (a toy watchdog follows this
    list)
  • Industry gains lower cost of ownership
  • CIP gains a lever for automating robustness
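
A toy Python watchdog conveying the "automate what is now manual" idea: probe each managed service and regenerate (restart) it when the probe fails. The service command and probe are placeholders, not any specific product's interface.

    import subprocess
    import time

    def watchdog(services, probe, interval_s=5.0):
        """services: name -> command line; probe(name) -> bool health check."""
        procs = {name: subprocess.Popen(cmd) for name, cmd in services.items()}
        while True:
            for name, proc in list(procs.items()):
                if proc.poll() is not None or not probe(name):
                    if proc.poll() is None:
                        proc.kill()                      # stop the unhealthy instance
                    procs[name] = subprocess.Popen(services[name])  # regenerate it
            time.sleep(interval_s)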

18
Replication and outsourcing
  • Smaller institutions outsource data warehousing
    and processing needs
  • Hence they would benefit from anything that a
    large institution requires
  • Converse problem: concentration risk due to
    outsourced functionality, shared infrastructure,
    and service providers who manage to corner some
    critical role

19
Summary?
  • Industry knows a lot about replication and uses
    it extensively
  • Yet despite this knowledge, there is a great deal
    that we don't know
  • Solving such problems offers us a chance to
    reduce the cost of ownership for data centers
    - App development, QA, production, upgrades
  • This, in turn, offers leverage for CIP concerns