LCG POOL, Distributed Database Deployment and Oracle Services at CERN
Dirk Düllmann, CERN
HEPiX Fall 2004, BNL
Outline
- POOL Persistency Framework and its use in the LHC Data Challenges
- LCG 3D Project: scope and first ideas for an LCG database service
- CERN Database Services for Physics: plans
2. POOL Objectives
- To allow the multi-PB of experiment data and associated meta data to be stored in a distributed and Grid-enabled fashion
  - various types of data with different volumes and access patterns: event data, physics and detector simulation, detector data and bookkeeping data
- Hybrid technology approach, combining
  - C++ object streaming technology (ROOT I/O) for the bulk data
  - transactionally safe relational database (RDBMS) services (MySQL, Oracle) for catalogs, collections and other meta data
- In particular, POOL provides
  - persistency for C++ transient objects
  - transparent navigation from one object to another across file and technology boundaries
  - integration with an external file catalog to keep track of the physical file locations, allowing files to be moved or replicated
3. POOL Storage Hierarchy
- A simple and generic model is exposed to the application
  - an application may access databases (e.g. ROOT files) from a set of catalogs
  - each database/file has containers of one specific technology (e.g. ROOT trees)
- Smart pointers are used
  - to transparently load objects into a client-side cache
  - to define object associations across file or technology boundaries (see the sketch below)
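A minimal sketch of the smart-pointer idea, assuming illustrative class names (Token, Ref) rather than the actual POOL interfaces: the reference carries a token identifying the file, container and record, and the object is only loaded into the client-side cache on first dereference.

```cpp
// Illustrative sketch only: the names below are NOT the real POOL API.
#include <iostream>
#include <memory>
#include <string>
#include <utility>

// A token identifies an object by database (file), container and record.
struct Token {
    std::string fileID;     // logical file identifier, resolved via the file catalog
    std::string container;  // e.g. a ROOT tree or an RDBMS table
    long entry;             // record identifier inside the container
};

struct Track { double pt = 0.0; };  // example user class

template <typename T>
class Ref {
public:
    explicit Ref(Token t) : token_(std::move(t)) {}

    // Transparent navigation: the object is loaded into a client-side
    // cache on first dereference, wherever the token points to.
    const T& operator*() {
        if (!cached_) {
            // A real storage manager would stream the object here,
            // via ROOT I/O or an RDBMS back-end, using token_.
            cached_ = std::make_shared<T>();
        }
        return *cached_;
    }

private:
    Token token_;
    std::shared_ptr<T> cached_;
};

int main() {
    Ref<Track> track(Token{"FID-1234", "Tracks", 42});
    std::cout << "pt = " << (*track).pt << '\n';  // triggers the (stub) load
}
```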
4. Mapping to Technologies
- Identify commonalities and differences between technologies
- The model adapts to (almost) any storage technology with direct access
  - record identifiers need to be known before flushing to disk
- Use of the RDBMS back-ends is rather conventional
  - no special object support in SQL is required
5. POOL Component Breakdown
- POOL is (mainly) a client-side package
  - coupling to standard file, database and grid services
  - no specialized POOL servers!
- Storage Manager
  - streams transient C++ objects to/from disk
  - resolves a logical object reference to a physical object
  - I/O via ROOT (rfio/dCache) or a database (Oracle/MySQL/SQLite)
- File Catalog
  - maintains consistent lists of accessible files together with their unique identifiers (FileIDs)
  - used to resolve the logical file reference (from a POOL object ID) to a physical file (see the sketch below)
- Collections
  - define (large) containers for objects (e.g. event collections) stored via POOL
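A minimal sketch of the file-catalog role described above, with invented class and method names (FileCatalog, registerReplica, lookup) rather than the real POOL or EDG-RLS interfaces: a unique FileID is resolved to one of possibly several physical file names.

```cpp
// Illustrative sketch only: invented names, not the POOL/EDG-RLS interfaces.
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

class FileCatalog {
public:
    void registerReplica(const std::string& fileID, const std::string& pfn) {
        replicas_[fileID].push_back(pfn);
    }

    // Resolve the FileID carried inside a POOL object reference to a
    // physical file name; a real catalog would also apply site-local
    // preferences when several replicas exist.
    std::string lookup(const std::string& fileID) const {
        auto it = replicas_.find(fileID);
        if (it == replicas_.end() || it->second.empty())
            throw std::runtime_error("no replica registered for " + fileID);
        return it->second.front();
    }

private:
    std::map<std::string, std::vector<std::string>> replicas_;
};

int main() {
    FileCatalog catalog;
    catalog.registerReplica("FID-1234", "rfio://castor.cern.ch/data/run1.root");
    std::cout << catalog.lookup("FID-1234") << '\n';
}
```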
6. POOL Grid Connected
7. POOL Standalone
8. Why a Relational Abstraction Layer (RAL)?
- Goal: vendor independence for the relational components of POOL, the ConditionsDB and user code
  - continuation of the component architecture as defined in the LCG Blueprint
  - file catalog, collections and object storage run against all available RDBMS plug-ins
- To reduce the code maintenance effort
  - all RDBMS client components can use all supported back-ends
  - bug fixes can be applied once, centrally
- To minimise the risk of vendor binding
  - new RDBMS flavours can be added later, or used in parallel, and are picked up by all RDBMS clients
  - the RDBMS market is still in flux
- To address the problem of distributing data across RDBMS of different flavours
  - a common mapping of application code to tables simplifies distribution of RDBMS data in a generic, application-independent way (see the sketch below)
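A minimal sketch of the vendor-independence idea behind RAL, using invented interface and plug-in names (IRelationalSession, connect) rather than the actual RAL API: client code speaks to an abstract session, and the concrete back-end is selected at run time from the connection string.

```cpp
// Illustrative sketch only: invented interface and plug-in names, not the RAL API.
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

class IRelationalSession {
public:
    virtual ~IRelationalSession() = default;
    virtual void execute(const std::string& sql) = 0;  // vendor-neutral SQL only
};

class OracleSession : public IRelationalSession {
public:
    void execute(const std::string& sql) override {
        std::cout << "[oracle] " << sql << '\n';  // a real plug-in would call the Oracle client here
    }
};

class MySQLSession : public IRelationalSession {
public:
    void execute(const std::string& sql) override {
        std::cout << "[mysql]  " << sql << '\n';  // a real plug-in would call the MySQL client here
    }
};

// Plug-in selection from a connection string: the same application code
// runs unchanged against any supported back-end.
std::unique_ptr<IRelationalSession> connect(const std::string& contact) {
    if (contact.rfind("oracle://", 0) == 0) return std::make_unique<OracleSession>();
    if (contact.rfind("mysql://", 0) == 0)  return std::make_unique<MySQLSession>();
    throw std::runtime_error("unsupported back-end: " + contact);
}

int main() {
    auto session = connect("oracle://lcg_cond/reader");
    session->execute("SELECT payload FROM conditions WHERE since <= 1000");
}
```

Because every RDBMS client component goes through the same abstract interface, a fix or a newly added back-end flavour is picked up by all of them without touching the application code.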
9. Software Interfaces and Plugins
10. POOL in 2004 Data Challenges
- Experience with the POOL framework gained in the Data Challenges is positive!
  - no major POOL-related problems
  - close collaboration between POOL developers and experiments invaluable!
- EDG-RLS as POOL back-end catalog
  - deployment based on Oracle services provided by the CERN database group
  - stable service throughout the 2004 Data Challenges!
  - input concerning performance and required functionality for future Grid file catalogs
- Successful integration and use in the LHC Data Challenges!
  - data volume stored in POOL: 400 TB!
  - similar to the volume stored in / migrated from Objectivity/DB
11. Why an LCG Database Deployment Project?
- LCG today provides an infrastructure for distributed access to file-based data and for file replication
- Physics applications (and grid services) require similar services for data stored in relational databases
  - several applications and services already use an RDBMS
  - several sites already have experience in providing RDBMS services
- Goals for a common project (LCG 3D)
  - increase the availability and scalability of LCG and experiment components
  - allow applications to access data in a consistent, location-independent way
  - allow existing database services to be connected via data replication mechanisms
  - simplify shared deployment and administration of this infrastructure during 24x7 operation
- Need to bring service providers (site technology experts) closer to database users/developers to define an LCG database service
- Time frame: first deployment in the 2005 data challenges (autumn 2005)
12. Project Non-Goals
- Store all database data
  - experiments are free to deploy databases and distribute data under their own responsibility
- Set up a single monolithic distributed database system
  - given constraints such as WAN connections, one cannot assume that a single synchronously updated database would work and provide sufficient availability
- Set up a single-vendor system
  - technology independence and a multi-vendor implementation will be required to minimise the long-term risks and to adapt to the different requirements/constraints on different tiers
- Impose a CERN-centric infrastructure on participating sites
  - CERN is an equal partner of the other LCG sites on each tier
- Decide on an architecture, implementation, new services or policies
  - instead, produce a technical proposal for all of these to the LCG PEB/GDB
13. Database Services at LCG Sites Today
- In contact with the database teams at
  - Tier 1: ASCC, BNL, CNAF, GridKa, FNAL, IN2P3 and RAL
  - potential Tier 2: ANL and U Chicago
- Several sites provide Oracle production services for HEP and non-HEP applications
  - significant deployment experience and procedures exist, but cannot be changed easily without affecting other site activities
- MySQL is very popular in the developer community
  - initial choice often made by s/w developers, not always with the full deployment picture in mind
  - MySQL is used for production purposes in LHC, though not at very large scale
  - expected to be deployable with limited database administration resources
  - several applications are bound to MySQL
- Expect a significant role for both database flavours in implementing different parts of the LCG infrastructure
14. Situation on the Application Side
- Databases are used by many applications in the LHC physics production chains
  - project members from ATLAS, CMS and LHCb
  - ALICE interested in replication for online / T0
- Currently many of these applications run centralised
  - several of them expect to move to a distributed model for scalability and availability reasons
  - the move to a distributed model can be simplified by a generic LCG database distribution infrastructure
  - still, this will require some development work
- Need to make key applications vendor neutral
  - database abstraction layers are becoming available in many foundation libraries
  - applications that are only available for one database vendor limit deployment
  - distribution would profit from a policy in this area
15. Distributed Databases vs. Distributed Caches
- FNAL experiments deploy a combination of http-based database access with web proxy caches close to the client (see the sketch below)
- Performance gains
  - reduced real database access for largely read-only data
  - reduced transfer overhead compared to low-level SOAP RPC based approaches
- Deployment gains
  - web caches (e.g. squid) are much simpler to deploy than databases and could remove the need for a local database deployment on some tiers
  - no vendor-specific database libraries on the client side
  - firewall-friendly tunnelling of requests through a single port
- Expect cache technology to play a significant role towards the higher tiers, which may not have the resources to run a reliable database service
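A minimal sketch of http-based database access through a web proxy cache, as described above; the front-end URL, query encoding and host names are invented for illustration, only the libcurl calls are real.

```cpp
// Illustrative sketch: the front-end URL, query encoding and host names are
// invented; only the libcurl calls are real. Build with: g++ ... -lcurl
#include <curl/curl.h>
#include <iostream>
#include <string>

static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;
    std::string result;

    // The query travels as a plain http GET, so an intermediate squid cache
    // can serve repeated, largely read-only requests without contacting the
    // database server again.
    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://dbfrontend.example.org/query?table=conditions&run=1234");
    curl_easy_setopt(curl, CURLOPT_PROXY, "http://squid.example.org:3128");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &result);

    if (curl_easy_perform(curl) == CURLE_OK)
        std::cout << result << '\n';

    curl_easy_cleanup(curl);
    curl_global_cleanup();
}
```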
16. Application s/w Stack and Distribution Options
[Diagram: the client application (APP) sits on top of the relational abstraction layer (RAL); below, alternative back-ends are reached locally or over the network - an SQLite file, or web caches in front of Oracle and MySQL database/cache servers backed by the database file storage.]
17. Tiers, Resources and Level of Service
- Different requirements and service capabilities at different tiers
- Tier 1: database backbone
  - high volume, often complete replication of RDBMS data
  - can expect good network connections to other T1 sites
  - asynchronous, possibly multi-master replication
  - large-scale central database service, local DBA team
- Tier 2
  - medium volume, often only sliced extraction of data
  - asymmetric, possibly only uni-directional replication
  - part-time administration (shared with fabric administration)
- Higher tiers and laptop extraction
  - support fully disconnected operation
  - low volume, sliced extraction from T1/T2
- Need to deploy more than one replication/distribution technology
  - each addressing specific parts of the distribution problem
  - but all together forming a consistent distribution model
18. Starting Point for a Service Architecture?
[Diagram of a candidate topology with Oracle and MySQL nodes across the tiers:]
- T0: autonomous, reliable service
- T1: database backbone - all data replicated - reliable service
- T2: local database cache - subset of the data - only local service
- T3/4
19. 3D Data Inventory WG
- Collect and maintain a catalog of the main RDBMS data types
- Select from a catalog of well-defined replication options which can be supported as part of the service
  - conditions and collection/bookkeeping data are likely candidates
- Experiments and grid s/w providers fill in a table for each data type that is a candidate for storage and replication via the 3D service (see the sketch below)
- Basic storage properties
  - data description, expected volume at T0/1/2 in 2005 (and evolution)
  - ownership model: read-only, single-user update, single-site update, concurrent update
- Replication/caching properties
  - replication model: site local, all T1, sliced T1, all T2, sliced T2
  - consistency/latency: how quickly do changes need to reach other sites/tiers?
  - application constraints: DB vendor and DB version constraints
- Reliability and availability requirements
  - essential for whole-grid operation, for site operation, for experiment production, ...
- Backup and recovery policy
  - acceptable time to recover, location of backup(s)
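An illustrative sketch of the per-data-type record such a table might contain, following the bullet points above; the field and enum names are invented, not a defined 3D interface.

```cpp
// Illustrative sketch: invented field and enum names, not a defined 3D interface.
#include <string>

enum class OwnershipModel { ReadOnly, SingleUserUpdate, SingleSiteUpdate, ConcurrentUpdate };
enum class ReplicationModel { SiteLocal, AllTier1, SlicedTier1, AllTier2, SlicedTier2 };

struct DataTypeEntry {
    std::string description;        // what the data is (e.g. conditions, bookkeeping)
    double volumeT0_GB = 0;         // expected 2005 volume at Tier-0 (and its evolution)
    double volumeT1_GB = 0;         //   ... at each Tier-1
    double volumeT2_GB = 0;         //   ... at each Tier-2
    OwnershipModel ownership = OwnershipModel::ReadOnly;
    ReplicationModel replication = ReplicationModel::SiteLocal;
    double maxLatencyHours = 0;     // how quickly changes must reach other sites/tiers
    std::string vendorConstraints;  // DB vendor/version the application is bound to
    std::string backupPolicy;       // acceptable time to recover, location of backup(s)
};
```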
20. 3D Service Definition WG
- DB service discovery
  - how does a job find a close-by replica of the database it needs?
  - need transparent (re)location of services, e.g. via a database replica catalog (see the sketch below)
  - connectivity, firewalls and connection constraints
- Access control: authentication and authorization
  - integration between DB vendor and LCG security models
- Installation and configuration
  - database server and client installation kits
  - which database client bindings are required (C, C++, Java (JDBC), Perl, ...)?
  - server and client version upgrades (e.g. security patches)
  - are transparent upgrades required for critical services?
- Server administration procedures and tools
  - need basic agreements to simplify shared administration
  - monitoring and statistics gathering
- Backup and recovery
  - backup policy templates, responsible site(s) for a particular data type
  - acceptable latency for recovery
- Bottom line: the service effort should not be underestimated!
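A minimal sketch of the database service discovery idea, assuming a hypothetical replica catalog interface (DbReplicaCatalog, findNearby): a job asks for a nearby replica of the logical database it needs and receives a contact string it can pass to the local RAL plug-in.

```cpp
// Illustrative sketch: hypothetical replica catalog, no such interface is defined yet.
#include <iostream>
#include <map>
#include <string>

class DbReplicaCatalog {
public:
    void add(const std::string& logicalDb, const std::string& site,
             const std::string& contact) {
        replicas_[logicalDb][site] = contact;
    }

    // Prefer a replica at the job's own site; fall back to the Tier-0 copy.
    std::string findNearby(const std::string& logicalDb, const std::string& site) const {
        const auto& bySite = replicas_.at(logicalDb);
        auto it = bySite.find(site);
        return it != bySite.end() ? it->second : bySite.at("CERN");
    }

private:
    std::map<std::string, std::map<std::string, std::string>> replicas_;
};

int main() {
    DbReplicaCatalog catalog;
    catalog.add("atlas_conditions", "CERN",   "oracle://atlas_cond_t0/reader");
    catalog.add("atlas_conditions", "GridKa", "oracle://atlas_cond_gridka/reader");
    std::cout << catalog.findNearby("atlas_conditions", "GridKa") << '\n';
}
```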
21. Replacement CERN Physics Services
- Current systems are not scalable to the initial exploitation phase of LHC
  - disk servers are a poor match for database needs; the Sun cluster is under-configured
- Tests of Oracle 10g RAC on Linux, as proposed to the PEB, are promising
- Main goals of the replacement service
  - Isolation: 10g services and/or physical separation
  - Scalability: in both database processing power and storage
  - Reliability: automatic failover in case of problems
  - Manageability: significantly easier to administer than now
- Timeline: price enquiries for front-end PCs, SAN infrastructure and SAN storage completed
  - orders made, delivery in November, pre-production early 2005?
22SAN-based DB infrastructure
ATLAS CMS LHCb ALICE COMPASS HARP
spare spare
Mid-range Linux PCsdual power supply,mirrored
systemdisk, with dual HBAsmultiple GbitE (3)
F/C switches2 x 64 ports
Storage 16 x 400GB disks
23. Oracle Contract / Distribution
- Client run-time
  - Oracle 10g Instant Client, currently for Linux and Windows
  - Mac still pre-production; needs a PEB decision regarding any support (incl. a local copy)
  - 10g Early Adopters Release 2 (10.1.0.3)
- Client developer kit (Software Developer's Kit, SDK)
  - interim solution based on CERN re-packaging
  - official support expected with 10.1.0.3, around December (OracleWorld, when 10g R2 will be announced)
- Server distributions
  - support issues should not be underestimated - see the Oracle security alert, data volume, 3D scenarios, etc.
  - Oracle Standard Edition One will be evaluated
- Client bundling with ATLAS s/w
  - Oracle have agreed to treat Tier-0 to Tier-n sites as equal parts of LCG
  - no issues regarding use of the Oracle client for LCG at sites used for LCG
  - access to distributions managed by IT-DB still needs to be controlled
  - Oracle partner status would not provide any added value
24. Registration for Access to Kits
- You must be registered in the CERN HR database
  - if you have a CERN computer account you already are
- In all cases, you must fill in the Computer Account form
  - give your preferred existing account and a valid e-mail address, and sign
  - or, if you have no account, also complete the other fields, i.e. date of birth etc.
- ...and the Contract Addendum
  - name, institute, signature
- Much easier than booking a flight / hotel on the Web!
- Incomplete / illegible forms cannot be processed
25. Account Registration Form
- The English version is at
  - http://it-div.web.cern.ch/it-div/documents/ComputerUsage/CompAccountRegistrationForm-English.pdf
- Contains 3 sections
  - to be completed by the user
  - to be completed by the group administrator
  - to be read and signed by the user
- Complete all MANDATORY fields in sections 1 and 3
  - if you have an existing computer account, please indicate it under section 2
  - please also give a valid e-mail address in section 2
  - we do not want to generate and maintain a separate form just for these last two points!
26. Collaborator Agreement
- Please provide all of the following information
- Your name, signature and date
- The institute that you work for
- The CERN experiment on which you collaborate
27. Oracle Security Alert
- Severity 1 security alert issued August 31
  - this is the highest security level: the vulnerability is high risk and requires little specialised knowledge to exploit
  - apply the patch and/or workaround to the affected products with the highest priority
- IT-DB began immediately to prepare a plan for patching servers
  - major effort, spanning 4 weeks
- Lessons: need to limit the versions of Oracle s/w in use
  - several versions (8, 8i, 9i, 10g) in use, often dictated by constraints, e.g. 3rd-party s/w (non-Physics)
- Need a plan in place so we can react quickly to future alerts
- Propose
  - a) a time slot when routine interventions can be performed
  - b) a procedure for performing critical interventions when needed
28. Summary
- The POOL persistency framework (http://pool.cern.ch) has been integrated in three experiment s/w frameworks and has been successfully used in the 2004 Data Challenges
  - some 400 TB of data stored in POOL
  - POOL has been extended to support RDBMS data in multiple relational databases
- The LCG 3D project (http://lcg3d.cern.ch), together with the experiments and sites, has started to define and implement a distributed database service for Tiers 0-2
  - several potential experiment applications and grid services exist but need to be coupled to the upcoming services
  - differences in available T0/1/2 manpower resources will result in different levels of service
  - currently defining the key requirements and verifying/adapting a proposed service architecture
  - need to start pragmatic and simple to allow a first deployment in 2005; a 2005 service infrastructure can only draw on already existing resources
  - requirements in some areas will only become clear during first deployment, when the computing models in this area firm up
- Consolidation activities for Physics DB services at CERN
  - Oracle RAC on Linux investigated as the main building block for LCG services
  - pre-production service planned for mid-2005