LCG POOL, Distributed Database Deployment and Oracle Services at CERN
Dirk Düllmann, CERN
HEPiX Fall 2004, BNL
Outline
- POOL Persistency Framework and its use in the LHC Data Challenges
- LCG 3D Project: scope and first ideas for an LCG database service
- CERN Database Services for Physics: plans
2. POOL Objectives
- To allow the multi-PB of experiment data and associated meta data to be stored in a distributed and Grid-enabled fashion
  - various types of data with different volumes and access patterns: event data, physics and detector simulation, detector data and bookkeeping data
- Hybrid technology approach, combining
  - C++ object streaming technology (ROOT I/O) for the bulk data
  - transactionally safe relational database (RDBMS) services (MySQL, Oracle) for catalogs, collections and other meta data
- In particular, POOL provides
  - persistency for C++ transient objects
  - transparent navigation from one object to another across file and technology boundaries
  - integration with an external file catalog to keep track of the physical file locations, allowing files to be moved or replicated
3. POOL Storage Hierarchy
- A simple and generic model is exposed to the application
  - an application may access databases (e.g. ROOT files) from a set of catalogs
  - each database/file has containers of one specific technology (e.g. ROOT trees)
- Smart pointers are used
  - to transparently load objects into a client-side cache
  - to define object associations across file or technology boundaries (see the sketch below)
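A minimal sketch of the smart-pointer idea, assuming illustrative class names (Token, Ref) rather than the actual POOL interfaces: the reference carries a token identifying the file, container and record, and the object is only loaded into the client-side cache on first dereference.

```cpp
// Illustrative sketch only: the names below are NOT the real POOL API.
#include <iostream>
#include <memory>
#include <string>
#include <utility>

// A token identifies an object by database (file), container and record.
struct Token {
    std::string fileID;     // logical file identifier, resolved via the file catalog
    std::string container;  // e.g. a ROOT tree or an RDBMS table
    long entry;             // record identifier inside the container
};

struct Track { double pt = 0.0; };  // example user class

template <typename T>
class Ref {
public:
    explicit Ref(Token t) : token_(std::move(t)) {}

    // Transparent navigation: the object is loaded into a client-side
    // cache on first dereference, wherever the token points to.
    const T& operator*() {
        if (!cached_) {
            // A real storage manager would stream the object here,
            // via ROOT I/O or an RDBMS back-end, using token_.
            cached_ = std::make_shared<T>();
        }
        return *cached_;
    }

private:
    Token token_;
    std::shared_ptr<T> cached_;
};

int main() {
    Ref<Track> track(Token{"FID-1234", "Tracks", 42});
    std::cout << "pt = " << (*track).pt << '\n';  // triggers the (stub) load
}
```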
4. Mapping to Technologies
- Identify commonalities and differences between technologies
- The model adapts to (almost) any storage technology with direct access
  - record identifiers need to be known before flushing to disk
- Use of the RDBMS back-ends is rather conventional
  - no special object support in SQL is required
5. POOL Component Breakdown
- POOL is (mainly) a client-side package
  - coupling to standard file, database and grid services
  - no specialized POOL servers!
- Storage Manager
  - streams transient C++ objects to/from disk
  - resolves a logical object reference to a physical object
  - I/O via ROOT (rfio/dCache) or a database (Oracle/MySQL/SQLite)
- File Catalog
  - maintains consistent lists of accessible files together with their unique identifiers (FileIDs)
  - used to resolve the logical file reference (from a POOL object ID) to a physical file (see the sketch below)
- Collections
  - define (large) containers for objects (e.g. event collections) stored via POOL
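A minimal sketch of the file-catalog role described above, with invented class and method names (FileCatalog, registerReplica, lookup) rather than the real POOL or EDG-RLS interfaces: a unique FileID is resolved to one of possibly several physical file names.

```cpp
// Illustrative sketch only: invented names, not the POOL/EDG-RLS interfaces.
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

class FileCatalog {
public:
    void registerReplica(const std::string& fileID, const std::string& pfn) {
        replicas_[fileID].push_back(pfn);
    }

    // Resolve the FileID carried inside a POOL object reference to a
    // physical file name; a real catalog would also apply site-local
    // preferences when several replicas exist.
    std::string lookup(const std::string& fileID) const {
        auto it = replicas_.find(fileID);
        if (it == replicas_.end() || it->second.empty())
            throw std::runtime_error("no replica registered for " + fileID);
        return it->second.front();
    }

private:
    std::map<std::string, std::vector<std::string>> replicas_;
};

int main() {
    FileCatalog catalog;
    catalog.registerReplica("FID-1234", "rfio://castor.cern.ch/data/run1.root");
    std::cout << catalog.lookup("FID-1234") << '\n';
}
```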
6. POOL Grid Connected
7. POOL Standalone
8. Why a Relational Abstraction Layer (RAL)?
- Goal: vendor independence for the relational components of POOL, the ConditionsDB and user code
  - continuation of the component architecture as defined in the LCG Blueprint
  - file catalog, collections and object storage run against all available RDBMS plug-ins
- To reduce the code maintenance effort
  - all RDBMS client components can use all supported back-ends
  - bug fixes can be applied once, centrally
- To minimise the risk of vendor binding
  - new RDBMS flavours can be added later, or used in parallel, and are picked up by all RDBMS clients
  - the RDBMS market is still in flux
- To address the problem of distributing data across RDBMS of different flavours
  - a common mapping of application code to tables simplifies distribution of RDBMS data in a generic, application-independent way (see the sketch below)
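A minimal sketch of the vendor-independence idea behind RAL, using invented interface and plug-in names (IRelationalSession, connect) rather than the actual RAL API: client code speaks to an abstract session, and the concrete back-end is selected at run time from the connection string.

```cpp
// Illustrative sketch only: invented interface and plug-in names, not the RAL API.
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

class IRelationalSession {
public:
    virtual ~IRelationalSession() = default;
    virtual void execute(const std::string& sql) = 0;  // vendor-neutral SQL only
};

class OracleSession : public IRelationalSession {
public:
    void execute(const std::string& sql) override {
        std::cout << "[oracle] " << sql << '\n';  // a real plug-in would call the Oracle client here
    }
};

class MySQLSession : public IRelationalSession {
public:
    void execute(const std::string& sql) override {
        std::cout << "[mysql]  " << sql << '\n';  // a real plug-in would call the MySQL client here
    }
};

// Plug-in selection from a connection string: the same application code
// runs unchanged against any supported back-end.
std::unique_ptr<IRelationalSession> connect(const std::string& contact) {
    if (contact.rfind("oracle://", 0) == 0) return std::make_unique<OracleSession>();
    if (contact.rfind("mysql://", 0) == 0)  return std::make_unique<MySQLSession>();
    throw std::runtime_error("unsupported back-end: " + contact);
}

int main() {
    auto session = connect("oracle://lcg_cond/reader");
    session->execute("SELECT payload FROM conditions WHERE since <= 1000");
}
```

Because every RDBMS client component goes through the same abstract interface, a fix or a newly added back-end flavour is picked up by all of them without touching the application code.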
9. Software Interfaces and Plugins
10. POOL in 2004 Data Challenges
- Experience with the POOL framework gained in the Data Challenges is positive!
  - no major POOL-related problems
  - close collaboration between POOL developers and experiments invaluable!
- EDG-RLS as POOL back-end catalog
  - deployment based on Oracle services provided by the CERN database group
  - stable service throughout the 2004 Data Challenges!
  - input concerning performance and required functionality for future Grid file catalogs
- Successful integration and use in the LHC Data Challenges!
  - data volume stored in POOL: 400 TB!
  - similar to the volume stored in / migrated from Objectivity/DB
11. Why an LCG Database Deployment Project?
- LCG today provides an infrastructure for distributed access to file-based data and for file replication
- Physics applications (and grid services) require similar services for data stored in relational databases
  - several applications and services already use an RDBMS
  - several sites already have experience in providing RDBMS services
- Goals for a common project (LCG 3D)
  - increase the availability and scalability of LCG and experiment components
  - allow applications to access data in a consistent, location-independent way
  - allow existing database services to be connected via data replication mechanisms
  - simplify shared deployment and administration of this infrastructure during 24x7 operation
- Need to bring service providers (site technology experts) closer to database users/developers to define an LCG database service
- Time frame: first deployment in the 2005 data challenges (autumn 2005)
12. Project Non-Goals
- Store all database data
  - experiments are free to deploy databases and distribute data under their own responsibility
- Set up a single monolithic distributed database system
  - given constraints such as WAN connections, one cannot assume that a single synchronously updated database would work and provide sufficient availability
- Set up a single-vendor system
  - technology independence and a multi-vendor implementation will be required to minimise the long-term risks and to adapt to the different requirements/constraints on different tiers
- Impose a CERN-centric infrastructure on participating sites
  - CERN is an equal partner of the other LCG sites on each tier
- Decide on an architecture, implementation, new services or policies
  - instead, produce a technical proposal for all of these to the LCG PEB/GDB
13. Database Services at LCG Sites Today
- In contact with the database teams at
  - Tier 1: ASCC, BNL, CNAF, GridKa, FNAL, IN2P3 and RAL
  - potential Tier 2: ANL and U Chicago
- Several sites provide Oracle production services for HEP and non-HEP applications
  - significant deployment experience and procedures exist, but cannot be changed easily without affecting other site activities
- MySQL is very popular in the developer community
  - initial choice often made by s/w developers, not always with the full deployment picture in mind
  - MySQL is used for production purposes in LHC, though not at very large scale
  - expected to be deployable with limited database administration resources
  - several applications are bound to MySQL
- Expect a significant role for both database flavours in implementing different parts of the LCG infrastructure
14. Situation on the Application Side
- Databases are used by many applications in the LHC physics production chains
  - project members from ATLAS, CMS and LHCb
  - ALICE interested in replication for online / T0
- Currently many of these applications run centralised
  - several of them expect to move to a distributed model for scalability and availability reasons
  - the move to a distributed model can be simplified by a generic LCG database distribution infrastructure
  - still, this will require some development work
- Need to make key applications vendor neutral
  - database abstraction layers are becoming available in many foundation libraries
  - applications that are only available for one database vendor limit deployment
  - distribution would profit from a policy in this area
15. Distributed Databases vs. Distributed Caches
- FNAL experiments deploy a combination of http-based database access with web proxy caches close to the client (see the sketch below)
- Performance gains
  - reduced real database access for largely read-only data
  - reduced transfer overhead compared to low-level SOAP RPC based approaches
- Deployment gains
  - web caches (e.g. squid) are much simpler to deploy than databases and could remove the need for a local database deployment on some tiers
  - no vendor-specific database libraries on the client side
  - firewall-friendly tunnelling of requests through a single port
- Expect cache technology to play a significant role towards the higher tiers, which may not have the resources to run a reliable database service
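A minimal sketch of http-based database access through a web proxy cache, as described above; the front-end URL, query encoding and host names are invented for illustration, only the libcurl calls are real.

```cpp
// Illustrative sketch: the front-end URL, query encoding and host names are
// invented; only the libcurl calls are real. Build with: g++ ... -lcurl
#include <curl/curl.h>
#include <iostream>
#include <string>

static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;
    std::string result;

    // The query travels as a plain http GET, so an intermediate squid cache
    // can serve repeated, largely read-only requests without contacting the
    // database server again.
    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://dbfrontend.example.org/query?table=conditions&run=1234");
    curl_easy_setopt(curl, CURLOPT_PROXY, "http://squid.example.org:3128");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &result);

    if (curl_easy_perform(curl) == CURLE_OK)
        std::cout << result << '\n';

    curl_easy_cleanup(curl);
    curl_global_cleanup();
}
```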
16. Application s/w Stack and Distribution Options
[Diagram: the client application (APP) sits on top of the relational abstraction layer (RAL); below, alternative back-ends are reached locally or over the network - an SQLite file, or web caches in front of Oracle and MySQL database/cache servers backed by the database file storage.]
17. Tiers, Resources and Level of Service
- Different requirements and service capabilities at different tiers
- Tier 1: database backbone
  - high volume, often complete replication of RDBMS data
  - can expect good network connections to other T1 sites
  - asynchronous, possibly multi-master replication
  - large-scale central database service, local DBA team
- Tier 2
  - medium volume, often only sliced extraction of data
  - asymmetric, possibly only uni-directional replication
  - part-time administration (shared with fabric administration)
- Higher tiers and laptop extraction
  - support fully disconnected operation
  - low volume, sliced extraction from T1/T2
- Need to deploy more than one replication/distribution technology
  - each addressing specific parts of the distribution problem
  - but all together forming a consistent distribution model
18. Starting Point for a Service Architecture?
[Diagram of a candidate topology with Oracle and MySQL nodes across the tiers:]
- T0: autonomous, reliable service
- T1: database backbone - all data replicated - reliable service
- T2: local database cache - subset of the data - only local service
- T3/4
19. 3D Data Inventory WG
- Collect and maintain a catalog of the main RDBMS data types
- Select from a catalog of well-defined replication options which can be supported as part of the service
  - conditions and collection/bookkeeping data are likely candidates
- Experiments and grid s/w providers fill in a table for each data type that is a candidate for storage and replication via the 3D service (see the sketch below)
- Basic storage properties
  - data description, expected volume at T0/1/2 in 2005 (and evolution)
  - ownership model: read-only, single-user update, single-site update, concurrent update
- Replication/caching properties
  - replication model: site local, all T1, sliced T1, all T2, sliced T2
  - consistency/latency: how quickly do changes need to reach other sites/tiers?
  - application constraints: DB vendor and DB version constraints
- Reliability and availability requirements
  - essential for whole-grid operation, for site operation, for experiment production, ...
- Backup and recovery policy
  - acceptable time to recover, location of backup(s)
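An illustrative sketch of the per-data-type record such a table might contain, following the bullet points above; the field and enum names are invented, not a defined 3D interface.

```cpp
// Illustrative sketch: invented field and enum names, not a defined 3D interface.
#include <string>

enum class OwnershipModel { ReadOnly, SingleUserUpdate, SingleSiteUpdate, ConcurrentUpdate };
enum class ReplicationModel { SiteLocal, AllTier1, SlicedTier1, AllTier2, SlicedTier2 };

struct DataTypeEntry {
    std::string description;        // what the data is (e.g. conditions, bookkeeping)
    double volumeT0_GB = 0;         // expected 2005 volume at Tier-0 (and its evolution)
    double volumeT1_GB = 0;         //   ... at each Tier-1
    double volumeT2_GB = 0;         //   ... at each Tier-2
    OwnershipModel ownership = OwnershipModel::ReadOnly;
    ReplicationModel replication = ReplicationModel::SiteLocal;
    double maxLatencyHours = 0;     // how quickly changes must reach other sites/tiers
    std::string vendorConstraints;  // DB vendor/version the application is bound to
    std::string backupPolicy;       // acceptable time to recover, location of backup(s)
};
```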
20. 3D Service Definition WG
- DB service discovery
  - how does a job find a close-by replica of the database it needs?
  - need transparent (re)location of services, e.g. via a database replica catalog (see the sketch below)
  - connectivity, firewalls and connection constraints
- Access control: authentication and authorization
  - integration between DB vendor and LCG security models
- Installation and configuration
  - database server and client installation kits
  - which database client bindings are required (C, C++, Java (JDBC), Perl, ...)?
  - server and client version upgrades (e.g. security patches)
  - are transparent upgrades required for critical services?
- Server administration procedures and tools
  - need basic agreements to simplify shared administration
  - monitoring and statistics gathering
- Backup and recovery
  - backup policy templates, responsible site(s) for a particular data type
  - acceptable latency for recovery
- Bottom line: the service effort should not be underestimated!
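A minimal sketch of the database service discovery idea, assuming a hypothetical replica catalog interface (DbReplicaCatalog, findNearby): a job asks for a nearby replica of the logical database it needs and receives a contact string it can pass to the local RAL plug-in.

```cpp
// Illustrative sketch: hypothetical replica catalog, no such interface is defined yet.
#include <iostream>
#include <map>
#include <string>

class DbReplicaCatalog {
public:
    void add(const std::string& logicalDb, const std::string& site,
             const std::string& contact) {
        replicas_[logicalDb][site] = contact;
    }

    // Prefer a replica at the job's own site; fall back to the Tier-0 copy.
    std::string findNearby(const std::string& logicalDb, const std::string& site) const {
        const auto& bySite = replicas_.at(logicalDb);
        auto it = bySite.find(site);
        return it != bySite.end() ? it->second : bySite.at("CERN");
    }

private:
    std::map<std::string, std::map<std::string, std::string>> replicas_;
};

int main() {
    DbReplicaCatalog catalog;
    catalog.add("atlas_conditions", "CERN",   "oracle://atlas_cond_t0/reader");
    catalog.add("atlas_conditions", "GridKa", "oracle://atlas_cond_gridka/reader");
    std::cout << catalog.findNearby("atlas_conditions", "GridKa") << '\n';
}
```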
21. Replacement CERN Physics Services
- Current systems are not scalable to the initial exploitation phase of LHC
  - disk servers are a poor match for database needs; the Sun cluster is under-configured
- Tests of Oracle 10g RAC on Linux, as proposed to the PEB, are promising
- Main goals of the replacement service
  - Isolation: 10g services and/or physical separation
  - Scalability: in both database processing power and storage
  - Reliability: automatic failover in case of problems
  - Manageability: significantly easier to administer than now
- Timeline: price enquiries for front-end PCs, SAN infrastructure and SAN storage completed
  - orders made, delivery in November, pre-production early 2005?
22SAN-based DB infrastructure
ATLAS CMS LHCb ALICE COMPASS HARP
spare spare
Mid-range Linux PCsdual power supply,mirrored
systemdisk, with dual HBAsmultiple GbitE (3)
F/C switches2 x 64 ports
Storage 16 x 400GB disks
23. Oracle Contract / Distribution
- Client run-time
  - Oracle 10g Instant Client, currently for Linux and Windows
  - Mac still pre-production; needs a PEB decision regarding any support (incl. a local copy)
  - 10g Early Adopters Release 2 (10.1.0.3)
- Client developer kit (Software Developer's Kit, SDK)
  - interim solution based on CERN re-packaging
  - official support expected with 10.1.0.3, around December (OracleWorld, when 10g R2 will be announced)
- Server distributions
  - support issues should not be underestimated - see the Oracle security alert, data volume, 3D scenarios, etc.
  - Oracle Standard Edition One will be evaluated
- Client bundling with ATLAS s/w
  - Oracle have agreed to treat Tier-0 to Tier-n sites as equal parts of LCG
  - no issues regarding use of the Oracle client for LCG at sites used for LCG
  - access to distributions managed by IT-DB still needs to be controlled
  - Oracle partner status would not provide any added value
24. Registration for Access to Kits
- You must be registered in the CERN HR database
  - if you have a CERN computer account you already are
- In all cases, you must fill in the Computer Account form
  - give your preferred existing account and a valid e-mail address, and sign
  - or, if you have no account, also complete the other fields, i.e. date of birth etc.
- ...and the Contract Addendum
  - name, institute, signature
- Much easier than booking a flight / hotel on the Web!
- Incomplete / illegible forms cannot be processed
25. Account Registration Form
- The English version is at
  - http://it-div.web.cern.ch/it-div/documents/ComputerUsage/CompAccountRegistrationForm-English.pdf
- Contains 3 sections
  - to be completed by the user
  - to be completed by the group administrator
  - to be read and signed by the user
- Complete all MANDATORY fields in sections 1 and 3
  - if you have an existing computer account, please indicate it under section 2
  - please also give a valid e-mail address in section 2
  - we do not want to generate and maintain a separate form just for these last two points!
26. Collaborator Agreement
- Please provide all of the following information
- Your name, signature and date
- The institute that you work for
- The CERN experiment on which you collaborate
27. Oracle Security Alert
- Severity 1 security alert issued August 31
  - this is the highest security level: the vulnerability is high risk and requires little specialised knowledge to exploit
  - apply the patch and/or workaround to the affected products with the highest priority
- IT-DB began immediately to prepare a plan for patching servers
  - major effort, spanning 4 weeks
- Lessons: need to limit the versions of Oracle s/w in use
  - several versions (8, 8i, 9i, 10g) in use, often dictated by constraints, e.g. 3rd-party s/w (non-Physics)
- Need a plan in place so we can react quickly to future alerts
- Propose
  - a) a time slot when routine interventions can be performed
  - b) a procedure for performing critical interventions when needed
28. Summary
- The POOL persistency framework (http://pool.cern.ch) has been integrated in three experiment s/w frameworks and has been successfully used in the 2004 Data Challenges
  - some 400 TB of data stored in POOL
  - POOL has been extended to support RDBMS data in multiple relational databases
- The LCG 3D project (http://lcg3d.cern.ch), together with the experiments and sites, has started to define and implement a distributed database service for Tiers 0-2
  - several potential experiment applications and grid services exist but need to be coupled to the upcoming services
  - differences in available T0/1/2 manpower resources will result in different levels of service
  - currently defining the key requirements and verifying/adapting a proposed service architecture
  - need to start pragmatic and simple to allow a first deployment in 2005; a 2005 service infrastructure can only draw on already existing resources
  - requirements in some areas will only become clear during first deployment, when the computing models in this area firm up
- Consolidation activities for Physics DB services at CERN
  - Oracle RAC on Linux investigated as the main building block for LCG services
  - pre-production service planned for mid-2005