1
LCG POOL, Distributed Database Deployment and
Oracle Services@CERN
Dirk Düllmann, CERN
HEPiX Fall 04, BNL
  • Outline
  • POOL Persistency Framework and its use in LHC
    Data Challenges
  • LCG 3D Project scope and first ideas for a LCG
    Database service
  • CERN Database Services for Physics: plans

2
POOL Objectives
  • To allow the multi-PB of experiment data and
    associated meta data to be stored in a
    distributed and Grid-enabled fashion
  • various types of data with different volumes and
    access patterns
  • event data, physics and detector simulation,
    detector data and bookkeeping data
  • Hybrid technology approach, combining
  • C++ object streaming technology (ROOT I/O) for
    the bulk data
  • transactionally safe Relational Database (RDBMS)
    services (MySQL, Oracle) for catalogs,
    collections and other meta data
  • In particular, POOL provides
  • Persistency for C++ transient objects
  • Transparent navigation from one object to another
    across file and technology boundaries
  • Integration with an external File Catalog to keep
    track of the physical location of files, allowing
    files to be moved or replicated

3
POOL Storage Hierarchy
  • A simple and generic model is exposed to the
    application
  • An application may access databases (e.g. ROOT
    files) from a set of catalogs
  • Each database/file has containers of one
    specific technology (e.g. ROOT trees)
  • Smart pointers are used
  • to transparently load objects into a client-side
    cache
  • to define object associations across file or
    technology boundaries (a minimal sketch follows)
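
The smart-pointer behaviour can be pictured with a minimal, self-contained
C++ sketch; the Ref template, the token format and the in-memory store are
illustrative stand-ins, not the actual POOL interfaces:

```cpp
// Minimal illustration of a lazy-loading smart pointer, in the spirit of
// POOL references. All names here are hypothetical stand-ins.
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct Event { int id; };

// A mock object store: token string -> object. In POOL the token would
// name a database (file), a container and a record inside it.
std::map<std::string, std::shared_ptr<void>>& store() {
  static std::map<std::string, std::shared_ptr<void>> s;
  return s;
}

template <typename T>
class Ref {
  std::string token_;                 // persistent address of the object
  mutable std::shared_ptr<T> cache_;  // client-side cache, filled on demand
public:
  explicit Ref(std::string token) : token_(std::move(token)) {}
  T* operator->() const {
    if (!cache_) {                    // transparent load on first access
      std::cout << "loading " << token_ << "\n";
      cache_ = std::static_pointer_cast<T>(store()[token_]);
    }
    return cache_.get();
  }
};

int main() {
  store()["fileA/events/42"] = std::make_shared<Event>(Event{42});
  Ref<Event> ref("fileA/events/42"); // association stored as a token
  std::cout << ref->id << "\n";      // object loaded only when dereferenced
  std::cout << ref->id << "\n";      // second access served from the cache
}
```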

4
Mapping to Technologies
  • Identify commonalities and differences between
    technologies
  • Model adapts to (almost) any storage technology
    with direct access
  • The record identifier needs to be known before
    flushing to disk (see the sketch after this list)
  • Use of the RDBMS is rather conventional
  • No special object support in SQL required
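
A schematic, technology-neutral record address, loosely modelled on the
idea behind POOL tokens; all field names and values are illustrative, not
the real token layout:

```cpp
// Schematic record address: enough to locate one object in any storage
// technology with direct access. Names are illustrative only.
#include <cstdint>
#include <string>

struct RecordAddress {
  std::string  fileGuid;    // which database/file (resolved via the catalog)
  std::string  container;   // e.g. a ROOT tree or an RDBMS table
  int          technology;  // storage back-end identifier
  std::int64_t recordId;    // row number / tree entry; must be assigned
                            // before the object is flushed to disk, so that
                            // other objects can already refer to it
};

int main() {
  RecordAddress a{"A1B2-C3D4", "Events", /*ROOT*/ 1, 42};
  return a.recordId == 42 ? 0 : 1;
}
```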

5
POOL Component Breakdown
  • POOL is (mainly) a client-side package
  • Coupling to standard file, database and grid
    services
  • No specialized POOL servers!
  • Storage Manager
  • Streams transient C++ objects to/from disk
  • Resolves a logical object reference to a physical
    object
  • I/O via ROOT (rfio/dCache) or a database
    (Oracle/MySQL/SQLite)
  • File Catalog
  • Maintains consistent lists of accessible files
    together with their unique identifiers (FileID)
  • Used to resolve a logical file reference (from a
    POOL object ID) to a physical file (see the
    sketch after this list)
  • Collections
  • Defines (large) containers for objects (e.g.
    event collections) stored via POOL
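
A minimal sketch of the catalog lookup: a logical FileID maps to one or
more physical file names, so files can move or be replicated without
touching stored object references. The catalog contents, GUID value and
resolve helper are invented for the example:

```cpp
// Sketch of a file-catalog lookup: FileID (a GUID) -> physical replicas.
#include <iostream>
#include <map>
#include <string>
#include <vector>

using FileID = std::string;  // unique, location-independent identifier

std::map<FileID, std::vector<std::string>> catalog = {
    {"A1B2-C3D4", {"rfio://cern.ch/data/run1.root",
                   "dcap://fnal.gov/data/run1.root"}},
};

// Resolve a FileID (as found inside a POOL object reference) to a
// physical replica; here we simply take the first registered entry.
std::string resolve(const FileID& id) {
  auto it = catalog.find(id);
  return it != catalog.end() && !it->second.empty() ? it->second.front()
                                                    : std::string{};
}

int main() { std::cout << resolve("A1B2-C3D4") << "\n"; }
```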

6
POOL Grid Connected
7
POOL Standalone
8
Why a Relational Abstraction Layer (RAL)?
  • Goal: vendor independence for the relational
    components of POOL, ConditionsDB and user code
  • Continuation of the component architecture as
    defined in the LCG Blueprint
  • File catalog, collections and object storage run
    against all available RDBMS plug-ins
  • To reduce the code maintenance effort
  • All RDBMS client components can use all supported
    back-ends
  • Bug fixes can be applied once centrally
  • To minimise the risk of vendor binding
  • New RDBMS flavours can be added later or used in
    parallel, and are picked up by all RDBMS clients
  • The RDBMS market is still in flux
  • To address the problem of distributing data in
    RDBMS of different flavours
  • A common mapping of application code to tables
    simplifies distribution of RDBMS data in a
    generic, application-independent way (a plug-in
    sketch follows)
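
A minimal sketch of the plug-in idea, assuming a hypothetical session
interface; this does not reproduce the actual RAL API:

```cpp
// Vendor independence via run-time plug-in selection: client code talks
// to one interface, a back-end is chosen by name. All names hypothetical.
#include <iostream>
#include <memory>
#include <string>

struct IRelationalSession {
  virtual ~IRelationalSession() = default;
  virtual void execute(const std::string& sql) = 0;
};

struct OracleSession : IRelationalSession {
  void execute(const std::string& sql) override {
    std::cout << "[oracle] " << sql << "\n";  // real plug-in would use OCI
  }
};

struct MySQLSession : IRelationalSession {
  void execute(const std::string& sql) override {
    std::cout << "[mysql]  " << sql << "\n";  // real plug-in: MySQL C API
  }
};

// Plug-in selection by flavour, e.g. from a connection-string prefix.
std::unique_ptr<IRelationalSession> connect(const std::string& flavour) {
  if (flavour == "oracle") return std::make_unique<OracleSession>();
  if (flavour == "mysql")  return std::make_unique<MySQLSession>();
  return nullptr;
}

int main() {
  // The same client code runs unchanged against either back-end.
  for (const char* f : {"oracle", "mysql"})
    connect(f)->execute("SELECT guid, pfn FROM file_catalog");
}
```

Bug fixes then live once, in the shared client code, while each back-end
plug-in stays small.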

9
Software Interfaces and Plugins
10
POOL in 2004 Data Challenges
  • Experience with POOL framework gained in Data
    Challenges is positive!
  • No major POOL-related problems
  • Close collaboration between POOL developers and
    experiments invaluable!
  • EDG-RLS as POOL back-end catalog
  • Deployment based on Oracle services provided by
    CERN Database group
  • Stable service throughout the 2004 Data
    Challenges!
  • Input concerning performance and required
    functionality for future Grid File Catalogs
  • Successful integration and use in LHC Data
    Challenges!
  • Data volume stored in POOL: some 400 TB!
  • Similar to the volume stored in / migrated from
    Objectivity/DB

11
Why a LCG Database Deployment Project?
  • LCG today provides an infrastructure for
    distributed access to file based data and file
    replication
  • Physics applications (and grid services) require
    a similar service for data stored in relational
    databases
  • Several applications and services already use
    RDBMS
  • Several sites already have experience in
    providing RDBMS services
  • Goals for common project (LCG 3D)
  • increase the availability and scalability of LCG
    and experiment components
  • allow applications to access data in a
    consistent, location independent way
  • allow existing db services to be connected via
    data replication mechanisms
  • simplify shared deployment and administration of
    this infrastructure during 24x7 operation
  • Need to bring service providers (site technology
    experts) closer to database users/developers to
    define a LCG database service
  • Time frame: first deployment in the 2005 data
    challenges (autumn 05)

12
Project Non-Goals
  • Store all database data
  • Experiments are free to deploy databases and
    distribute data under their responsibility
  • Setup a single monolithic distributed database
    system
  • Given constraints like WAN connections, one
    cannot assume that a single synchronously updated
    database would work and provide sufficient
    availability.
  • Setup a single vendor system
  • Technology independence and a multi-vendor
    implementation will be required to minimize the
    long term risks and to adapt to the different
    requirements/constraints on different tiers.
  • Impose a CERN centric infrastructure to
    participating sites
  • CERN is an equal partner of the other LCG sites
    on each tier
  • Decide on an architecture, implementation, new
    services, policies
  • Instead, produce a technical proposal for all of
    these to the LCG PEB/GDB

13
Database Services at LCG Sites Today
  • In contact with database teams at
  • Tier 1: ASCC, BNL, CNAF, GridKa, FNAL, IN2P3
    and RAL
  • Potential Tier 2: ANL and U Chicago
  • Several sites provide Oracle production services
    for HEP and non-HEP applications
  • Significant deployment experience and procedures
    exist but cannot be changed easily without
    affecting other site activities
  • MySQL is very popular in the developer community
  • Initial choice often made by s/w developers
  • Not always with full deployment picture in mind
  • MySQL is used for production purposes in LHC
  • Not at very large scale, though
  • Expected to be deployable with limited db
    administration resources
  • Several applications are bound to MySQL
  • Expect a significant role for both database
    flavors
  • To implement different parts of the LCG
    infrastructure

14
Situation on the Application Side
  • Databases are used by many applications in the
    LHC physics production chains
  • Project members from ATLAS, CMS and LHCb
  • ALICE is interested in replication for online / T0
  • Currently, many of these applications are run
    centrally
  • Several of these applications expect to move to a
    distributed model for scalability and
    availability reasons
  • The move to a distributed model can be simplified
    by a generic LCG database distribution
    infrastructure
  • Still, this will require some development work
  • Need to make key applications vendor-neutral
  • DB abstraction layers are becoming available in
    many foundation libraries
  • Applications which are only available for one DB
    vendor limit deployment
  • Distribution would profit from a policy in this
    area

15
Distributed Databases vs. Distributed Caches
  • FNAL experiments deploy a combination of
    http-based database access with web proxy caches
    close to the client
  • Performance gains
  • reduced real database access for largely
    read-only data
  • reduced transfer overhead compared to low level
    SOAP RPC based approaches
  • Deployment gains
  • Web caches (e.g. squid) are much simpler to
    deploy than databases and could remove the need
    for a local database deployment on some tiers
  • No vendor specific database libraries on the
    client side
  • Firewall friendly tunneling of requests through
    a single port
  • Expect cache technology to play a significant
    role towards the higher tiers, which may not have
    the resources to run a reliable database service
    (a minimal access sketch follows)
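
A minimal sketch of the access pattern, using libcurl to fetch (largely
read-only) results over HTTP through a squid proxy; the host names and URL
are hypothetical:

```cpp
// HTTP-based database access through a web proxy cache: the client never
// opens a vendor-specific database connection. Compile with -lcurl.
#include <curl/curl.h>

int main() {
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURL* curl = curl_easy_init();
  if (!curl) return 1;

  // Query encoded in the URL, so identical requests are cacheable.
  curl_easy_setopt(curl, CURLOPT_URL,
                   "http://dbfrontend.example.org/conditions?run=1234");
  // Route through a local squid cache: repeated requests are served from
  // the cache and never reach the real database server.
  curl_easy_setopt(curl, CURLOPT_PROXY, "http://squid.example.org:3128");

  CURLcode rc = curl_easy_perform(curl);  // response body goes to stdout
  curl_easy_cleanup(curl);
  curl_global_cleanup();
  return rc == CURLE_OK ? 0 : 1;
}
```

Everything tunnels through one HTTP port, which is also what makes the
approach firewall-friendly.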

16
Application s/w stack and Distribution Options
[Diagram: application s/w stack and distribution options. The client s/w
(APP on top of the RAL relational abstraction layer) reaches, across the
network, either a web cache in front of db cache servers, Oracle or MySQL
db file storage, or a local SQLite file.]
17
Tiers, Resources and Level of Service
  • Different requirements and service capabilities
    for different tiers
  • Tier 1: database backbone
  • High volume, often complete replication of RDBMS
    data
  • Can expect good network connection to other T1
    sites
  • Asynchronous, possibly multi-master replication
  • Large scale central database service, local dba
    team
  • Tier 2
  • Medium volume, often only sliced extraction of
    data
  • Asymmetric, possibly only uni-directional
    replication
  • Part time administration (shared with fabric
    administration)
  • Higher Tiers and Laptop extraction
  • Support fully disconnected operation
  • Low volume, sliced extraction from T1/T2
  • Need to deploy more than one
    replication/distribution technology
  • Each addressing specific parts of the
    distribution problem
  • But all together forming a consistent
    distribution model (a minimal slicing sketch
    follows)
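
A minimal illustration of the "sliced extraction" idea: a higher tier
copies only the subset of records it needs. The record type and the
slicing predicate are invented for the example:

```cpp
// Sliced extraction: a T2 site takes only the subset of the full T1
// dataset relevant to it, rather than a complete replica.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

struct ConditionsRecord {
  std::string detector;  // e.g. "tracker", "calorimeter"
  int run;
};

int main() {
  std::vector<ConditionsRecord> t1 = {
      {"tracker", 1}, {"calorimeter", 1}, {"tracker", 2}};

  // Slice for a T2 that only analyses tracker data.
  std::vector<ConditionsRecord> t2;
  std::copy_if(t1.begin(), t1.end(), std::back_inserter(t2),
               [](const ConditionsRecord& r) {
                 return r.detector == "tracker";
               });

  std::cout << t2.size() << " records extracted\n";  // prints 2
}
```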

18
Starting Point for a Service Architecture?
[Diagram: T0: autonomous, reliable service. T1: db backbone, all data
replicated, reliable service. T2: local db cache, subset of the data,
only local service. T3/4. Markers O and M in the figure denote Oracle
and MySQL instances.]
19
3D Data Inventory WG
  • Collect and maintain a catalog of main RDBMS data
    types
  • Select from a catalog of well-defined replication
    options
  • which can be supported as part of the service
  • Conditions and Collection/Bookkeeping data are
    likely candidates
  • Experiments and grid s/w providers fill in a
    table for each data type which is a candidate for
    storage and replication via the 3D service (a
    data-structure sketch follows this list)
  • Basic storage properties
  • Data description, expected volume on T0/1/2 in
    2005 (and evolution)
  • Ownership model: read-only, single-user update,
    single-site update, concurrent update
  • Replication/Caching properties
  • Replication model: site-local, all T1, sliced T1,
    all T2, sliced T2
  • Consistency/latency: how quickly changes need to
    reach other sites/tiers
  • Application constraints: DB vendor and DB version
    constraints
  • Reliability and Availability requirements
  • Essential for whole-grid operation, for site
    operation or for experiment production
  • Backup and Recovery policy
  • acceptable time to recover, location of backup(s)
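
The inventory table can be pictured as a data structure filled in once per
candidate data type; the field and enum names, and all the values below,
are illustrative placeholders, not real requirements:

```cpp
// One inventory entry per data type that is a candidate for the 3D
// service. All names and numbers are illustrative only.
#include <string>

enum class Ownership   { ReadOnly, SingleUserUpdate,
                         SingleSiteUpdate, ConcurrentUpdate };
enum class Replication { SiteLocal, AllT1, SlicedT1, AllT2, SlicedT2 };

struct InventoryEntry {
  std::string dataType;         // e.g. "conditions", "event collections"
  double      volumeT0_GB;      // expected 2005 volume per tier
  double      volumeT1_GB;
  double      volumeT2_GB;
  Ownership   ownership;
  Replication replication;
  int         maxLatencySec;    // how quickly changes must reach other tiers
  std::string vendorConstraint; // e.g. "Oracle >= 9i" or "any"
  int         maxRecoveryHours; // acceptable time to recover from backup
};

int main() {
  // Illustrative placeholder values, not real experiment requirements.
  InventoryEntry conditions{
      "conditions", 500, 500, 50,
      Ownership::SingleSiteUpdate, Replication::SlicedT2,
      600, "any", 24};
  return conditions.dataType.empty() ? 1 : 0;
}
```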

20
3D Service Definition WG
  • DB Service Discovery
  • How does a job find a close-by replica of the
    database it needs?
  • Need transparent (re)location of services, e.g.
    via a database replica catalog (sketched after
    this list)
  • Connectivity, firewalls and connection
    constraints
  • Access Control - authentication and authorization
  • Integration between DB vendor and LCG security
    models
  • Installation and Configuration
  • Database server and client installation kits
  • Which database client bindings are required (C,
    C++, Java (JDBC), Perl, ...)?
  • Server and client version upgrades (e.g. security
    patches)
  • Are transparent upgrades required for critical
    services?
  • Server administration procedures and tools
  • Need basic agreements to simplify shared
    administration
  • Monitoring and statistics gathering
  • Backup and Recovery
  • Backup policy templates, responsible site(s) for
    a particular data type
  • Acceptable latency for recovery
  • Bottom line: the service effort should not be
    underestimated!
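
A minimal sketch of the service-discovery step, assuming a replica catalog
that maps a (logical database, site) pair to a connection string; the
catalog contents and fallback rule are hypothetical:

```cpp
// DB service discovery: a job looks up a close-by replica of a logical
// database instead of hard-coding a connection string.
#include <iostream>
#include <map>
#include <string>
#include <utility>

// (logical database name, site) -> physical connection string
std::map<std::pair<std::string, std::string>, std::string> replicaCatalog = {
    {{"conditions", "CERN"}, "oracle://dbsrv.cern.ch/conditions"},
    {{"conditions", "FNAL"}, "mysql://dbsrv.fnal.gov/conditions"},
};

std::string locate(const std::string& db, const std::string& site) {
  auto it = replicaCatalog.find({db, site});
  if (it != replicaCatalog.end()) return it->second;  // local replica
  return replicaCatalog.at({db, "CERN"});             // fall back to T0
}

int main() {
  std::cout << locate("conditions", "FNAL") << "\n";
  std::cout << locate("conditions", "RAL") << "\n";  // no local replica
}
```

Relocating a service then only means updating the catalog, not the jobs.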

21
Replacement CERN Physics Services
  • Current systems not scalable to initial
    exploitation phase of LHC
  • Disk servers are a poor match for DB needs; the
    Sun Cluster is under-configured
  • Tests of Oracle 10g RAC on Linux, as proposed to
    PEB, promising
  • Main goals of replacement service
  • Isolation: 10g services and/or physical
    separation
  • Scalability: in both database processing power
    and storage
  • Reliability: automatic failover in case of
    problems
  • Manageability: significantly easier to administer
    than now
  • Timeline: price enquiries for front-end PCs, SAN
    infrastructure and SAN storage completed
  • Orders made, delivery November, pre-production
    early 2005?

22
SAN-based DB infrastructure
[Diagram: per-experiment database clusters for ATLAS, CMS, LHCb, ALICE,
COMPASS and HARP, plus spares. Mid-range Linux PCs (dual power supply,
mirrored system disk, dual HBAs, multiple GbitE (3)) connect via two
64-port F/C switches to storage of 16 x 400 GB disks.]
23
Oracle Contract / Distribution
  • Client run-time
  • Oracle 10g Instant Client
  • currently Linux and Windows
  • Mac still pre-production; needs a PEB decision re
    any support (incl. local copy)
  • 10g Early Adopters Release 2 (10.1.0.3)
  • Client developer (Software Development Kit, SDK)
  • Interim solution based on CERN re-packaging
  • Official support expected with 10.1.0.3
  • Expected around December (OracleWorld) (when 10g
    R2 will be announced)
  • Server distributions
  • Support issues should not be underestimated
  • See the Oracle Security Alert, data volume, 3D
    scenarios, etc.
  • Oracle Standard Edition One will be evaluated
  • Client bundling with ATLAS s/w
  • Oracle have agreed to treat Tier 0 to Tier n
    sites as equal parts of LCG
  • No issues regarding use of the Oracle client for
    LCG purposes at LCG sites
  • Access to distributions managed by IT-DB still
    needs to be controlled
  • Oracle Partner status would not provide any
    added value

24
Registration for Access to Kits
  • You must be registered in the CERN HR database
  • If you have a CERN computer account you already
    are
  • In all cases, you must fill in the Computer
    Account form
  • Give your preferred existing account and a valid
    e-mail address, and sign
  • or
  • If no account, must also complete other fields,
    i.e. date of birth etc.
  • and Contract Addendum
  • Name, Institute, Signature
  • Much easier than booking a flight / hotel on the
    Web!
  • Incomplete / illegible forms cannot be processed

25
Account Registration Form
  • The English version is at
  • http://it-div.web.cern.ch/it-div/documents/ComputerUsage/CompAccountRegistrationForm-English.pdf
  • Contains 3 sections
  • To be completed by the User
  • To be completed by the Group Administrator
  • To be Read and Signed by the User
  • Complete all MANDATORY fields in sections 1-3
  • If you have an existing computer account, please
    indicate it under section 2.
  • Please also give a valid e-mail address in
    section 2.
  • We do not want to generate and maintain a
    separate form just for these last two points!

26
Collaborator Agreement
  • Please provide all of the following information
  • Your name, signature and date
  • The institute that you work for
  • The CERN experiment on which you collaborate

27
Oracle Security Alert
  • Severity 1 Security Alert issued August 31
  • This is the highest security level
  • The vulnerability is high risk and requires
    little specialized knowledge to exploit.
  • Apply the patch and/or workaround to the affected
    products with the highest priority.
  • IT-DB immediately began to prepare a plan for
    patching servers
  • Major effort, spanning 4 weeks
  • Lessons: need to limit the versions of Oracle s/w
    in use
  • Several versions in use: (8), 8i, 9i, 10g
  • Often dictated by constraints e.g. 3rd party s/w
    (non-Physics)
  • Need a plan in place so that we can react quickly
    to future alerts
  • Propose
  • a) a time slot when routine interventions can be
    performed
  • b) a procedure for performing critical
    interventions when needed

28
Summary
  • The POOL persistency framework
    (http://pool.cern.ch) has been integrated in
    three experiment s/w frameworks and has been
    successfully used in the 2004 Data Challenges
  • Some 400 TB of data stored in POOL
  • POOL has been extended to support data stored in
    multiple relational database flavours
  • The LCG 3D project (http://lcg3d.cern.ch) has,
    together with the experiments and sites, started
    to define and implement a distributed database
    service for Tiers 0-2
  • Several potential experiment applications and
    grid services exist but need to be coupled to the
    upcoming services
  • Differences in available T0/1/2 manpower
    resources will result in different levels of
    service
  • Currently defining the key requirements and
    verifying/adapting a proposed service
    architecture
  • Need to start pragmatic and simple to allow for a
    first deployment in 2005
  • A 2005 service infrastructure can only draw from
    already existing resources
  • Requirements in some areas will only become clear
    during first deployment, as the computing models
    in this area firm up
  • Consolidation activities for Physics DB services
    at CERN
  • Oracle RAC on Linux is being investigated as the
    main building block for LCG services
  • Pre-production service planned for mid-2005