Title: Wide area object data management solutions for LHC

1 Wide area object data management solutions for LHC
(Report from a meeting with Jim Gray)
Koen Holtman, Caltech/CMS
Database workshop, CERN, 12 July 2001
2 Introduction
- Meeting at Microsoft Research, SFO, on 29 April 2001
  - Jim Gray, Koen Holtman, Heinz Stockinger, Kurt Stockinger
- Discussed (among other things) LHC wide area data management alternatives
  - All-Objectivity
  - All-Oracle
  - Hybrid solution (object streaming library + RDBMS)
- This talk reports on the discussion and adds some more controversial points
3 Data handling baseline 1
- Requirements (Hoffmann review, year 2007)
  - Object data model, typical objects 1 KB-1 MB
  - 3 PB of storage space
  - 10,000 CPUs
  - 31 sites: 1 tier 0 + 5 tier 1 + 25 tier 2, all over the world
  - I/O rates disk → CPU: 10,000 MB/s, average 1 MB/s/CPU
    - RAW → ESD generation: 0.2 MB/s I/O per CPU
    - ESD → AOD generation: 5 MB/s I/O per CPU
    - AOD analysis into histograms: 0.2 MB/s I/O per CPU
    - DPD generation from AOD and ESD: 10 MB/s I/O per CPU
  - Wide-area I/O capacity: order of 700 MB/s aggregate over all payload intercontinental TCP/IP streams
- This implies a system with heavy reliance on access to site-local (cached) data
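The arithmetic behind these baseline numbers can be checked directly. A minimal sketch (the variable names are mine, the figures are the slide's):

```python
# Baseline figures from the slide (Hoffmann review requirements for 2007).
n_cpus = 10_000                    # total CPUs across all 31 sites
avg_io_per_cpu = 1.0               # average disk -> CPU rate, MB/s per CPU

aggregate_io = n_cpus * avg_io_per_cpu
print(aggregate_io)                # 10000.0 MB/s, the quoted disk -> CPU rate

# The wide-area capacity (~700 MB/s) is a small fraction of that,
# which is why the system must rely on site-local (cached) data.
wan_capacity = 700.0               # MB/s aggregate over intercontinental streams
print(wan_capacity / aggregate_io) # 0.07 -> only 7% of local I/O capacity
```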
4 (Figure-only slide; no text transcript)
5 Abstract away Objectivity
- If we abstract away from the current Objy usage, we are left with the following model:
  - Many (grid) sites, each with a size of O(100) CPUs
  - Local-area data access in terms of objects, with an object persistency layer
  - Wide-area data transport in terms of files
    - Might use deep copy to transport a selected subset
6 The big assumption
- The big assumption is that this model remains invariant irrespective of the persistency layer used
- The reasoning behind this is that you always need to scale sites to the product scalability limit, which is <10,000
7 Mapping to persistency layers
- Objectivity/DB
  - File = Objectivity/DB file
  - Use 100 separate federations, each serving data to 100 CPUs
  - 100 federations × 100 CPUs = 10,000 CPUs
- Oracle
  - File = table (represented as one or more physical files)
  - Use O(100) Oracle 'sites' which are loosely federated
- Hybrid
  - File = file written by object streaming library
  - All files are tied together with metadata maintained using an RDBMS (+ Grid catalogs)
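The hybrid mapping can be sketched with an ordinary SQL table holding the metadata that ties the streamed files together. A minimal illustration using SQLite; the table layout, site names, and paths are invented for the example, not taken from any actual Grid catalog schema:

```python
import sqlite3

# Bulk objects live in files written by the object streaming library;
# the RDBMS only holds the metadata tying those files together.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE replica_catalog (
                  file_id INTEGER,   -- logical file ID (shared by all replicas)
                  site    TEXT,      -- grid site holding a replica
                  path    TEXT,      -- site-local physical path
                  PRIMARY KEY (file_id, site)
              )""")

# Register two replicas of the same logical file at different sites.
db.executemany("INSERT INTO replica_catalog VALUES (?, ?, ?)",
               [(42, "tier0", "/data/aod/f42.objs"),
                (42, "tier1-fnal", "/cache/aod/f42.objs")])

# A job running at tier1-fnal asks: where is my local copy of file 42?
row = db.execute("SELECT path FROM replica_catalog "
                 "WHERE file_id = ? AND site = ?",
                 (42, "tier1-fnal")).fetchone()
print(row[0])   # /cache/aod/f42.objs
```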
8 More about Oracle
- According to Jim Gray, Oracle can be used as a persistency layer in this model
  - He could not think of a technical obstacle inside Oracle which prevents us from doing this
- Furthermore, this use of Oracle is not completely crazy
  - Looks a bit like its use in some large companies
- We did not directly discuss other ORDBMS systems
9 Utility of some RDBMS features
- What about the SQL query optimiser?
  - Does not really speed up anything we do often
  - We see little utility for fast server-side query processing
- What about the replication features of Oracle and other RDBMSes?
  - These do not address our wide area data distribution use case
  - These features are for re-synchronising a remote table with a local one by applying a log of the transactions
    - Not very performant, but well-liked
  - These features might be interesting for us when maintaining replicated metadata with lazy synchronisation
- What about materialised views?
  - In theory our use case maps to them
  - In practice, materialised-view-related optimisers do not address our optimisation problem
  - In particular, Oracle has no built-in concept of 'distance' to another site
10 RDBMS pricing
- Standard commercial usage pricing for Oracle and DB/2: order of $20-50K per DB server CPU per year
- Assume for 2007 that 1 server CPU can serve a 50 MB/s stream
  - 10,000 MB/s / 50 MB/s = 200 CPUs
  - 200 CPUs × $20-50K = $4-10M = 8-20 MCHF/year
- Compare: CMS expected disk hardware expenditure in 2007 is 2 MCHF
  - Take 10% DB software cost overhead → 0.2 MCHF
  - 0.2 MCHF / 8-20 MCHF = 2.5-1%
- So a 97.5-99% discount over standard market prices is minimally needed
  - And we would need to keep this discount till 2030
11 More about the hybrid solution
- Hybrid
  - Use an object streaming library for 'bulk data'
  - Use RDBMS technology to maintain metadata
- How complex does the object streaming library have to be?
  - Compared to 1997, we expect the persistency layer to solve a smaller part of the problem, so build-your-own or get-from-open-source has become more feasible
  - Do not need a clone of Objectivity DRO/FTO
  - Not even a clone of Objectivity
  - No transactions or other consistency management
    - We or the Grid projects have to do this ourselves anyway in the wide area
- Minimal functionality:
  - OID = (file ID (>32 bit!), file-internal ID)
  - Read objects from files in a portable way
  - Write new objects to a new file or to the end of an existing file
  - No object update! Versioning support at the metadata level
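This minimal functionality can be sketched as an append-only object container. The class and method names below are hypothetical (a real library would stream C++ objects to disk in a platform-portable encoding); the sketch only shows the OID structure and the write-once discipline:

```python
import struct

class ObjectFile:
    """Append-only object container: objects are written once, never updated.

    An OID is the pair (file_id, file_internal_id). The file ID is assumed
    to be assigned by the metadata level (the RDBMS / Grid catalogs) and
    must be wider than 32 bits to span the whole experiment.
    """
    def __init__(self, file_id):
        self.file_id = file_id       # >32-bit logical file ID
        self.records = []            # in-memory stand-in for the file body

    def append(self, payload: bytes):
        """Write a new object at the end of the file; return its OID."""
        internal_id = len(self.records)
        # Length-prefixed, fixed-endianness record: portable across platforms.
        self.records.append(struct.pack("<I", len(payload)) + payload)
        return (self.file_id, internal_id)

    def read(self, oid):
        """Read an object back by OID. There is no update or delete."""
        file_id, internal_id = oid
        assert file_id == self.file_id
        record = self.records[internal_id]
        (length,) = struct.unpack_from("<I", record)
        return record[4:4 + length]

f = ObjectFile(file_id=2**40 + 7)    # file IDs wider than 32 bits
oid = f.append(b"hit data")
print(f.read(oid))                   # b'hit data'
```

Versioning would then live entirely at the metadata level: a new version of an object is simply a new OID recorded in the catalog, matching the "no object update" rule above.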
12 Comparison
- Objectivity
  + We know how to make it work, existing knowledge base
  + Our code already works with it
  - Long-term availability uncertain (2006-2030?)
- Oracle (+ other RDBMSes?)
  + Long-term availability more certain
  - Need to get experience using RDBMS style and Oracle object handling features
  - Cost of code conversion
  - Can we afford it? In the long term?
- Hybrid solution (object streaming library + RDBMS)
  + Safest for long-term stability of code and accessibility of data
  + Need fewer RDBMS licenses, could even use free RDBMSes?
  - Need to write or choose an object streaming library
  - Still need to get experience with RDBMS style to code the handling of metadata
13 Conclusions
- Resign ourselves to a multi-site model
  - Compared to the 1997 model, this means we get less from the database
  - We have to do more ourselves, or get the Grid projects to do it
- The multi-site model does give more options for database (persistency layer) selection
  - We are not looking for the holy grail anymore
  - The object streaming library is becoming more attractive
- For the 2002 database selection we have N>3 options
  - Make the decision based on a tradeoff between risks, manpower, money, etc.