Orchestrating Messaging, Data Grid and Database


Slides: 59

Provided by: patrick234


Transcript and Presenter's Notes

1
Orchestrating Messaging, Data Grid and Database

Jon Purdy
Oracle Corporation

2
Notes

Companies and Products
Oracle acquired Tangosol back in June
Coherence is a Data Grid solution
Questions are encouraged

3
Agenda

Technology Stack Overview
Introduction to Data Grid technology
Application State
Types of State
Challenges
Putting it together
How state is managed by application tiers
How to integrate application tiers
How Data Grids can fill in the gaps

4
Technology Stack Overview

There are many tools for building scalable,
reliable systems
Messaging
Application Servers
Data Grids
Databases
What types of state do these manage?
When should each one be used?

5
Technologies

Messaging
Integration between systems (queues)
Distributing relevant data (topics)
Application Servers
Request processing
Conversational state
Data Grids
Scalability and performance
Conversational state and/or limited persistent
state
Databases
Persistent state
Reliable, shared conversational state (if needed)

6
Technologies
Messaging
7
Data Grids: What are they?

Special-purpose data management solution
Live, transactional data at in-memory speed
First class programmatic access
Built from the ground-up for in-memory efficiency
Avoids CPU overhead of disk management
Usually a native object view of data
Less flexible than a true database
Query optimization is an unsolvable problem
Three decades of RDBMS evolution offsets that
Less focus on long-term storage

8
Data Grids

Extend the coherency protocol to client
applications
Take advantage of the native object view of
data
Keep important data local for efficiency
OR/M can sometimes be slower than the actual
query
Implementations
Oracle Coherence
GemStone GemFire
IBM ObjectGrid

9
A Brief History
10
Relational DBMS

Relational DBMS
Relational structure allows any view of data
Minimizes impact of data schema mistakes
Databases for The People
With 4GL tools, led to the Client-Server
revolution
And even power users: Microsoft Excel and Access
The critical ingredient: Query Optimizer
DBMS assumes responsibility for optimizing data
access

11
Relational DBMS

But...
Static optimization (RBO) is not 100% reliable
Dynamic optimization (CBO) is not 100% reliable
Mistakes magnified with scale and load
Scalability and availability problems

12
Object DBMS

Brief appearance in late 80s / early 90s
Some impressive performance feats
Extremely efficient for intended access patterns
Data schema coupled to business logic
Difficult to evolve data schema
Market segment as a whole has died
A few stragglers left

13
The best of all worlds

Take the efficiency of an Object DBMS
In-memory data coupled to application access
patterns
Consistent access patterns at runtime
Add scale-out as a primary objective
And leverage the RDBMS
Existing storage resources and skills
Loosely coupled data schemas

14
How does it work?
15
Partitioned Cache
16
Partitioned Cache
17
Partitioned Cache
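
The partitioned cache diagrams did not survive in this transcript. As a rough illustration only, the sketch below shows one generic way to partition data: each key hashes to a partition, and each partition has a primary and a backup owner. The class name, the CRC32 hash, and the "next node holds the backup" rule are assumptions made for this example, not a description of how Coherence or any other product assigns partitions.

```python
import zlib


class PartitionedCache:
    """Hypothetical in-process model of a partitioned cache with one backup copy."""

    def __init__(self, node_names, partition_count=8):
        self.partition_count = partition_count
        self.node_names = list(node_names)
        # Each "node" is just a dict here; a real grid spreads these across processes.
        self.nodes = {name: {} for name in self.node_names}

    def _owners(self, key):
        # A stable hash maps the key to a partition, and the partition to a primary node.
        partition = zlib.crc32(key.encode("utf-8")) % self.partition_count
        primary = partition % len(self.node_names)
        backup = (primary + 1) % len(self.node_names)  # assumed backup placement rule
        return self.node_names[primary], self.node_names[backup]

    def put(self, key, value):
        # Write to the primary owner and to its backup.
        for name in self._owners(key):
            self.nodes[name][key] = value

    def get(self, key, failed=()):
        # Read from the first owner that has not failed.
        for name in self._owners(key):
            if name not in failed:
                return self.nodes[name].get(key)
        return None


grid = PartitionedCache(["node-1", "node-2", "node-3"])
grid.put("order:17", "pending")
primary, backup = grid._owners("order:17")
copies = sum("order:17" in store for store in grid.nodes.values())
survived = grid.get("order:17", failed={primary})
```

With three nodes, the entry lives on exactly two of them, so losing the primary owner still leaves a readable copy.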
18
Near Cache
19

Types of State
Characteristics

20
Types of State

Messages
Request/Response
Source: user, message queue or another
application tier
'Show inventory list' (display web page in
browser)
Just a message from one system to another
Conversational State
Stateful Applications
Spans multiple requests (a conversation)
'Add item to shopping cart' (update HTTP session)
Internal state
Persistent State
Typically stored in a database
'Place order' (persist order to database)
Externally visible

21
Connecting the dots

Applications process requests, taking into
account the context of those requests, to manage
persistent data
Therefore, effective applications must ensure
that
Requests are properly processed
Proper context is maintained
Persisted data is correct
All of this is done in a timely manner

22
Characteristics: Messages

Short-lived
Interactive apps: milliseconds to a few seconds
Integration: similar, unless one of the systems
is down
Immutable and single-writer pattern
By definition, each request submitted by a single
system
Almost no way to corrupt state, and easy to avoid
losing state
Stateless applications are very easy to scale
Simple request-response processing
Requests are often retry-able (idempotent)

23
Characteristics: Conversational State

Longer-lived
A few seconds to several minutes
Mutable, but by a single user
Not quite single-writer
Simultaneous requests from a user
Multiple portlets in a portal application
Multiple clicks at the same time
Load-balancing issues: failover/failback/rebalancing
Often recoverable
Worst case, by restarting the session

24
Characteristics: Persistent State

Long-lived
Rarely less than a few days; often many decades
Often have regulatory requirements for several
years
Mutable and globally shared
Possible interaction and contention from all
users
Concurrency and data consistency are hard to
combine
The entire application shares one persistent state

25
Summary: Managing State
26

Types of State
Challenges

27
Challenges

Messages
Most considerations relate to interactions
between systems
These interactions are effectively distributed
transactions
It is critical to manage these transactions
both reliably and efficiently

28
Challenges

Conversational state
Most applications can tolerate modest corruption
(or loss) of conversational state (or do anyway)
Those that can't assume this will generally place
this state in a reliable data store, or avoid
conversational state altogether
While technology solutions exist, scaling
stateful applications remains a challenge

29
Challenges

Persistent state
As the System of Record, persistent state is
the most valuable asset
Databases are the default option for properly
managing persistent state
However, scaling and performance concerns often
move data management out of the database,
increasing the difficulty of managing it correctly

30
Impact of lost/corrupted data

Messages
User gets a failed request
User resubmits request (click again)
Impact limited in scope (one user) and time (one
request)
Conversational State
User's session is corrupted or missing
If detected by the system, user may need to log
in again and start over
If not detected, the user will usually (but not
always) notice
Impact limited in scope (one user) and time (one
session)
Persistent State
Persistent State is the primary objective!
For the user: payment received but order not
shipped
For everyone: inventory levels are incorrect
Impact is global for all users and for all time!

31
Critical Areas of Concern

Messages
Conversational State
Persistent State

Messaging
Compare, Contrast, Integrate

33
Messaging

Topics
One-to-many: subscribers sign up to topics of
interest
All subscribers receive messages as they occur
Emphasis on fast delivery to many subscribers
(performance, scalability)
Queues
Used primarily for communication between two
systems
Physical decoupling of sender and receiver
Emphasis on reliable message delivery
(durability)
Implementations
TIBCO Rendezvous, IBM MQSeries
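
As a minimal sketch of the difference between the two models above, the following in-memory classes show a topic fanning each message out to every subscriber and a queue handing each message to exactly one receiver. `Topic` and `Queue` are hypothetical names for this illustration; they do not model durability, acknowledgements, or any real messaging product's API.

```python
from collections import deque


class Topic:
    """One-to-many: every subscriber receives every published message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)


class Queue:
    """Point-to-point: each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


topic = Topic()
inbox_a, inbox_b = [], []
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("price-update")  # both inboxes receive the message

queue = Queue()
queue.send("order-1")
first = queue.receive()   # the single message is delivered once
second = queue.receive()  # nothing is left for a second receiver
```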

34
Messages

Requests typically flow through multiple systems
Message Queue → App Server → Database
Browser → Web Server → App Server → Database
Ensure that each request is processed
even if a participating service fails
Failure of either client or server can result in
dropped or duplicated requests
Most common requirement is once and only once
but other variants may be acceptable (at most
once, at least once)

35
Traditional Message Processing

Integrating multiple systems may require
distributed transactions (XA)
Distributed transactions
Simple to integrate: minimal effect on
application architecture
E.g. enlist both the database and the queue
Slow (disk forces)
Tendency to cause lock contention (two-phase
locking)
Not 100% reliable (heuristic failures)
Not widely supported (lack of support,
compatibility issues)

36
Idempotency

Concept
If the client knows the server can handle
duplicate requests
Then the client can err on the side of re-sending
in doubt requests
A partial failure results in a complete retry
No need to use XA to coordinate client and server
Impact
May have a noticeable impact on application
architecture
Fast
Very reliable
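
The concept can be sketched roughly as follows, assuming the server deduplicates on a client-supplied request ID. `InventoryServer`, `FlakyLink`, and `reserve_with_retry` are hypothetical names invented for this illustration; the lost response stands in for an "in doubt" request.

```python
class InventoryServer:
    """Hypothetical server that remembers which request IDs it has applied."""

    def __init__(self, stock):
        self.stock = stock
        self.applied = set()

    def reserve(self, request_id, quantity):
        if request_id in self.applied:
            return self.stock  # duplicate request: no additional effect
        self.stock -= quantity
        self.applied.add(request_id)
        return self.stock


class FlakyLink:
    """Simulates a network that loses some responses after the server has acted."""

    def __init__(self, server, lost_responses):
        self.server = server
        self.lost_responses = lost_responses

    def call(self, request_id, quantity):
        result = self.server.reserve(request_id, quantity)
        if self.lost_responses > 0:
            self.lost_responses -= 1
            raise TimeoutError("response lost; outcome is in doubt")
        return result


def reserve_with_retry(link, request_id, quantity, attempts=3):
    # The client re-sends the same request ID whenever the outcome is in doubt.
    for _ in range(attempts):
        try:
            return link.call(request_id, quantity)
        except TimeoutError:
            continue
    raise RuntimeError("gave up after repeated failures")


server = InventoryServer(stock=10)
link = FlakyLink(server, lost_responses=1)
remaining = reserve_with_retry(link, "req-42", 3)
```

The first attempt is applied but its response is lost; the retry is recognized as a duplicate, so the stock is reduced only once.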

37
Message Processing with XA

JMS begin TX
DB begin TX
Read message
Write to database
Prepare JMS
Prepare DB
Commit JMS
Commit DB
If the prepare phase fails in either JMS or DB,
the DB transaction is rolled back, and the JMS
message is left in the queue
If the commit phase fails, that is a heuristic
failure: the state of the transaction is unknown
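
The prepare and commit ordering above can be illustrated with a toy coordinator. This is a sketch under stated assumptions: `Participant` and `two_phase_commit` are hypothetical names, nothing here is the JTA or XA API, and the heuristic-failure case (a failure during the commit phase) is deliberately not modeled.

```python
class Participant:
    """Hypothetical resource manager, standing in for a JMS queue or a database."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare
        self.state = "active"

    def prepare(self):
        if self.fail_prepare:
            raise RuntimeError(f"{self.name}: prepare failed")
        self.state = "prepared"

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    # Phase 1: every participant must vote yes by preparing successfully.
    try:
        for participant in participants:
            participant.prepare()
    except RuntimeError:
        # Any prepare failure rolls back all participants.
        for participant in participants:
            participant.rollback()
        return "rolled back"
    # Phase 2: commit everywhere (assumed to succeed in this sketch).
    for participant in participants:
        participant.commit()
    return "committed"


jms, db = Participant("JMS"), Participant("DB")
ok = two_phase_commit([jms, db])

jms2, db2 = Participant("JMS"), Participant("DB", fail_prepare=True)
failed = two_phase_commit([jms2, db2])
```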

38
Idempotent Message Processing with Local
Transactions

JMS begin TX
DB begin TX
Read message
Write to DB (Idempotent)
Commit DB
Commit JMS
If commit to DB fails, the entire operation is
aborted; the message is still in the queue
If commit to JMS fails, the JMS de-queue is
rolled back (but the DB commit isn't)
The next time the message is processed, the write
to the DB will occur, but the operation won't
have undesired side effects
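
A minimal sketch of this sequence, using an in-memory SQLite database and a `deque` as stand-ins for the real database and the JMS queue. The table layout, the message ID used as primary key, and the `fail_queue_commit` flag are assumptions made for illustration; `INSERT OR IGNORE` is just one way to make the write idempotent.

```python
import sqlite3
from collections import deque

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (message_id TEXT PRIMARY KEY, item TEXT)")
db.commit()

pending = deque([("msg-1", "widget")])  # stand-in for a JMS queue


def process_next(fail_queue_commit=False):
    # Read the message without removing it: the de-queue is not yet committed.
    message_id, item = pending[0]
    # Idempotent write: a redelivered message hits the primary key and is ignored.
    db.execute(
        "INSERT OR IGNORE INTO orders (message_id, item) VALUES (?, ?)",
        (message_id, item),
    )
    db.commit()  # commit the database first
    if fail_queue_commit:
        return False  # simulated JMS commit failure: the message stays queued
    pending.popleft()  # commit the de-queue last
    return True


first = process_next(fail_queue_commit=True)  # DB committed, de-queue rolled back
second = process_next()                       # redelivery: the write is a no-op
row_count = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The message is processed twice, but the order is stored only once.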

39
Data Grid and Messaging

Data Grids can be used as a messaging fabric
But introduces global visibility of a new
infrastructure piece
Established players have more mature solutions
And operations teams know these products
Messaging usually used within the Data Grid
Not between disparate applications
One exception
Data Grids can use write-behind queueing to avoid
the need for a dedicated message broker
Queue the messages in memory, not on disk
Slight reduction in durability but reduces
operating costs

Application Server
Compare, Contrast, Integrate

41
Application Servers

Application containers
Provide a framework for managing requests and
(usually) conversational state
May manage lifecycle of application deployment
packages
Also service directories (JNDI / Jini lookup
services)
Implementations
JavaEE: WebLogic, WebSphere, JBoss, Oracle AS,
etc.
Compute Grid: Platform Symphony, DataSynapse
GridServer
Jini: Blitz, GigaSpaces
Spring
Requests
Route incoming requests (e.g. from TCP socket) to
application components
Conversational State
JavaEE: HTTP sessions (conversation between user
and web server)
Jini: JavaSpaces (conversation between multiple
processes)

42
Conversational State Topologies

In-memory (no replication)
Fastest, most scalable option
Server failure results in data loss
Single-server visibility (dependent on sticky
load balancer)
In-memory (replication)
Fast, scalable (implementations vary)
Widely available, sufficient for most use cases
Most implementations are not fully coherent under
load or failure
Database persistence
Higher complexity and lower performance
Achieves data consistency, commonly available
Scales with database server (for better or worse)

43
Conversational State

Unreliable conversational state
No in-memory replication (data loss)
Incoherent in-memory replication (data
corruption)
Tools
Idempotent processing
Reliable data store
Concept
Use application and data store to verify
correctness on commit
Verify order placement on web page
Use optimistic concurrency on database to check
values
Use idempotent processing to retry request chain
Buyer corrects shopping cart and resumes checkout
process
Or for closed-loop systems, recover missing
conversational state by replaying requests or
re-loading from database (selectively persisted
for performance)
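
The optimistic concurrency check mentioned above could look roughly like this, assuming a version column on the row. The `inventory` table, its columns, and the helper names are hypothetical; SQLite is used only so the sketch is self-contained.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE inventory (item TEXT PRIMARY KEY, quantity INTEGER, version INTEGER)"
)
db.execute("INSERT INTO inventory VALUES ('widget', 5, 1)")
db.commit()


def read_item(item):
    return db.execute(
        "SELECT quantity, version FROM inventory WHERE item = ?", (item,)
    ).fetchone()


def update_if_unchanged(item, new_quantity, expected_version):
    # The update only applies if nobody changed the row since it was read.
    cursor = db.execute(
        "UPDATE inventory SET quantity = ?, version = version + 1 "
        "WHERE item = ? AND version = ?",
        (new_quantity, item, expected_version),
    )
    db.commit()
    return cursor.rowcount == 1


# Two requests read the same state, then both try to write.
qty_a, ver_a = read_item("widget")
qty_b, ver_b = read_item("widget")
a_won = update_if_unchanged("widget", qty_a - 1, ver_a)
b_won = update_if_unchanged("widget", qty_b - 2, ver_b)  # stale version: rejected
final = read_item("widget")
```

The rejected request would then re-read the current state and retry, which is where idempotent processing of the request chain comes in.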

Database
Compare, Contrast, Integrate

45
Database

The only real solution for persistence?
Permanent System of Record
Guaranteed data consistency
Operations
Perhaps the most widely deployed technology
In-house operations teams already know how to use
Strongest query technology (robust cost-based
optimizers)
Plenty of support: 3rd party tool vendors,
consultants, documentation, discussion forums,
etc.

46
Database

Usually the easiest and most reliable solution
for managing persistent state
But supply...
Absolute requirement for data consistency
Consistency requirements make scaling difficult
(but possible)
...may not meet demand
Front tiers are inexpensive and easy to scale
Scaling on the front causes massive load on the
back
Offloading can help with managing persistent data
Eventually faces diminishing returns from
overhead and complexity

47
Offloading via Caching

Keep a local partial data set for faster access
Beneficial for read-heavy applications
Gained popularity by mitigating the EJB BMP N+1
problem
Limited gains for transactions and queries
Relatively transparent to application
architecture
Weak requirements for data consistency
With optimistic concurrency, data consistency is
delegated to SoR
For presentation layer, dirty reads are often
acceptable
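
A small sketch of keeping a local partial data set in front of the System of Record. `LocalCache`, its least-recently-used eviction policy, and the dictionary standing in for the database are assumptions made for this illustration.

```python
from collections import OrderedDict


class LocalCache:
    """Hypothetical bounded local cache that loads misses from the System of Record."""

    def __init__(self, loader, capacity):
        self.loader = loader
        self.capacity = capacity
        self.entries = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = self.loader(key)  # fall back to the System of Record
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return value


system_of_record = {"a": 1, "b": 2, "c": 3}
local = LocalCache(system_of_record.__getitem__, capacity=2)
values = [local.get(key) for key in ["a", "a", "b", "c", "a"]]
```

Only a partial data set is held locally, so repeated reads of hot keys are served from memory while cold keys still reach the backing store. Values may be stale if the System of Record changes underneath the cache, which is the "dirty reads" trade-off noted above.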

48
Offloading Analytics

Run queries against a copy of the System of
Record
System of Reference
Data consistency is important
Depends on usage
Generally operating against a point-in-time
snapshot
Data resilience is a Quality of Service
consideration
Recoverable from the System of Record
Failure will affect availability but not results

49
Offloading Events

Changes to the System of Record may need to
trigger additional processing
Challenges
Ensuring all changes of any relevant state are
handled in a timely manner
Absolute data consistency required for change
events and the context of those events (ordering,
subscribers, etc)
Hard to do all of these
Absolute data consistency
Fan-out of events from transactions
Timely delivery of events

50
Offloading Transactions

The System of Record must manage all transactions
related to its owned data
But a given piece of data may have different
owners over even short periods of time
Important to identify which system owns each
piece of data
Usually achieved by owning part of the
permanent store
Data consistency required

51
Data Grids can help

Conversational state
Combine the data consistency of a database with
the performance of local in-memory data
Persistent state
Running queries in the data grid can remove the
query load on a database
Committing transactions in-memory then persisting
in batches can reduce the transaction load of a
database
Abstraction of data sources
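
Committing in memory and persisting in batches can be sketched as a write-behind cache. This is an illustration under stated assumptions: `WriteBehindCache` is a hypothetical class, a plain dictionary stands in for the database, and `flush` is called by hand where a real system would use a timer or a background thread.

```python
class WriteBehindCache:
    """Hypothetical cache that acknowledges writes in memory and persists them later."""

    def __init__(self, store):
        self.store = store  # stand-in for the database
        self.data = {}
        self.dirty = {}     # pending writes, coalesced by key
        self.flushes = 0

    def put(self, key, value):
        self.data[key] = value
        self.dirty[key] = value  # a later write to the same key replaces the earlier one

    def get(self, key):
        return self.data.get(key, self.store.get(key))

    def flush(self):
        # Persist all pending writes as a single batch.
        if not self.dirty:
            return 0
        batch, self.dirty = self.dirty, {}
        self.store.update(batch)
        self.flushes += 1
        return len(batch)


database = {}
carts = WriteBehindCache(database)
carts.put("cart:1", ["book"])
carts.put("cart:1", ["book", "pen"])  # coalesced with the previous write
carts.put("cart:2", ["lamp"])
before = dict(database)  # nothing has reached the database yet
written = carts.flush()
```

Three in-memory writes become one batch of two database writes. Anything still in `dirty` is lost if the process dies before a flush, which is why a data grid would also keep a backup copy in memory.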

52
Data Source Integration - Read Through
53
Data Source Integration - Write Through
54
Data Source Integration - Write Behind
55
Data Grid Data Source Integration

Data Integration occurs in the Data Service
Integration uses the domain model
The data is both live and shared
Events provide bi-directional flow
Applications can respond to events
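
Since the read-through and write-through diagrams are missing from this transcript, the following sketch shows one plausible shape for such a data service, with change events delivered to listeners. `DataService`, the listener signature, and the dictionary standing in for the data source are assumptions for illustration, not the API of any particular data grid.

```python
class DataService:
    """Hypothetical data service with read-through, write-through and change events."""

    def __init__(self, store):
        self.store = store  # stand-in for the backing data source
        self.cache = {}
        self.listeners = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def get(self, key):
        # Read-through: a miss is loaded from the data source and kept in memory.
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

    def put(self, key, value):
        # Write-through: the data source is updated before the call returns.
        self.store[key] = value
        self.cache[key] = value
        # Events let applications respond to the change.
        for listener in self.listeners:
            listener(key, value)


source = {"sku-1": 10}
events = []
service = DataService(source)
service.add_listener(lambda key, value: events.append((key, value)))

loaded = service.get("sku-1")  # loaded from the data source on first access
service.put("sku-1", 9)        # written to the data source and announced
```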

Data Service Clients
56
Summary of Data Grid Integration Points

Messaging
Data Grid can be used for internal application
messaging
Application Server
Scale data availability reliably along with
processing power
Database
Offload transactions and analytics to Data Grid
for higher throughput

57
The Spectrum
Integration
Messaging: Topics, Queues
Data Consistency
Application Servers: Requests (JavaEE, Jini,
Compute Grid); Conversational (HTTP Sessions,
Stateful EJBs, JavaSpaces)
Scalable Performance
Data Grids: Data Grid, In-Memory Database
Database
58

Thank You!
