1
Object Persistency / Data Handling Session C
- Summary
  • Dirk Duellmann

4
Espresso Feasibility Study
  • We identified solutions for the most critical
    components of a scalable and performant ODBMS
  • The prototype implementation shows promising
    performance and scalability
  • A strict component approach allows the effort to be
    split into independently developed, replaceable
    modules (see the interface sketch below)
  • The development of an Open Source ODBMS seems
    possible within the HEP or general science
    community
  • A collaborative effort on the order of 15 person
    years seems sufficient to produce such a system
    at production quality

8
HERA
  • ZEUS
    • Objectivity based TagDB in production
    • significant performance gain for event selection
  • H1
    • H1 will move to an analysis and event display
      framework based on ROOT
    • DST and micro-DST (based on BOS/PAW) will be
      replaced by analysis objects stored in ROOT trees
  • HERA-B
    • Conditions database based on Berkeley DB
    • ROOT is currently being integrated

9
Files + Metadata Approach
  • RHIC
    • STAR
      • moved from Objectivity to ROOT I/O
      • ROOT files for event data
      • file catalogue implemented using mySQL (see the
        lookup sketch below)
    • PHENIX
      • ROOT files for event data
      • Objectivity/DB for conditions, configuration and
        file catalogue
  • Fermilab Run II
    • CDF
      • ROOT for event data
      • file catalogue stored in Oracle
    • D0
      • D0OM for event data
      • metadata based on Oracle
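
To make the files-plus-metadata pattern concrete, here is a minimal
sketch of a catalogue lookup, assuming a hypothetical mySQL table
file_catalogue(lfn, pfn, run); the table layout, query and connection
parameters are invented for illustration and not taken from STAR.

  // Hypothetical sketch: resolve the files for one run through a
  // mySQL-based file catalogue, using the MySQL C client API.
  #include <mysql.h>
  #include <cstdio>

  int main() {
    MYSQL* db = mysql_init(0);
    if (!mysql_real_connect(db, "dbhost", "reader", "secret",
                            "catalogue", 0, 0, 0)) {
      std::fprintf(stderr, "connect failed: %s\n", mysql_error(db));
      return 1;
    }
    // map a run number to the physical files holding its events
    if (mysql_query(db,
        "SELECT lfn, pfn FROM file_catalogue WHERE run = 1234") == 0) {
      MYSQL_RES* res = mysql_store_result(db);
      while (MYSQL_ROW row = mysql_fetch_row(res))
        std::printf("%s -> %s\n", row[0], row[1]);
      mysql_free_result(res);
    }
    mysql_close(db);
    return 0;
  }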

10
Sequential Access Model
  • Integrated information about
    • tape volumes
    • file catalogue
    • runs
    • event properties
    • trigger configuration
  • Uses Enstore as MSS
    • 1.5 TB on Mammoth tapes (1 TB/day peak)
  • Being used
    • to store Monte Carlo data
    • for D0 analysis tasks

14
Mass Storage Systems
  • The CASTOR project at CERN has moved into
    production
    • staging system backward compatible with SHIFT,
      with additional HSM functionality
    • main client will be COMPASS (at 35 MB/s)
    • planned ALICE Mock Data Challenge to prove
      feasibility of 100 MB/s over one week
  • EUROSTORE - Esprit project over the last 2 years
    • Parallel Filesystem (QSW) + HSM (DESY)
    • prototype installation and testing (CERN)
    • an operational system has been demonstrated
    • a follow-on proposal has been submitted with the
      aim to provide a fully tested product including a
      Linux port
    • deployment at DESY foreseen for end 2000

25
Language Binding Insulation
  • Language Support
    • at least for C++
    • and Java
    • or (in some cases) FORTRAN
  • Trade-off between
    • Risk for Experiment Code - insulation against a
      change of persistency solution
    • Maintainability - additional manual work and many
      additional classes
    • Transparency for End Users - as simple to use as
      transient data
    • Flexibility - more than one storage solution at
      the same time; implement workable schema
      evolution
    • Performance - e.g. is I/O on demand needed? If
      yes, what is the right granularity: one object?
      one event?

31
Language Binding Insulation
  • Two main approaches are used
    • In place access of persistent objects
      • the framework is implemented using the language
        binding
      • only C++ pointers to persistent data are exposed
        to the user (CMS, STAR, PHENIX)
    • Access of transient copies
      • complete conversion into transient objects
        (BaBar)
      • on demand conversion into transient objects using
        smart pointers (LHCb) - see the sketch below
  • Experiment specific insulation layer
    • usually coupled to a specific application
      framework
  • In both cases split into two interfaces
    • the framework uses the more flexible, performant,
      exposed lower level
    • the end user uses the more insulated, transparent,
      customised higher level
  • Is the mapping layer in between really experiment
    specific?
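
Here is a minimal sketch of the on-demand conversion idea, assuming a
hypothetical SmartRef<T> that turns a persistent reference into a
transient object only at first dereference; the class and loader names
are invented and are not LHCb's actual interfaces.

  // Hypothetical sketch: a smart pointer that defers the
  // persistent-to-transient conversion until first use.
  #include <iostream>
  #include <string>

  struct Track { double pt; };              // transient class

  // Stand-in for the persistency service; a real one would read
  // the object from the store instead of fabricating it.
  Track* loadFromStore(const std::string& ref) {
    std::cout << "converting " << ref << " on demand\n";
    return new Track();
  }

  template <class T>
  class SmartRef {
    std::string ref_;      // persistent address/identifier
    mutable T* obj_;       // transient copy, created lazily
  public:
    explicit SmartRef(const std::string& ref) : ref_(ref), obj_(0) {}
    ~SmartRef() { delete obj_; }
    T* operator->() const {
      if (!obj_) obj_ = loadFromStore(ref_);  // convert on first use
      return obj_;
    }
  };

  int main() {
    SmartRef<Track> track("event42/track7");
    track->pt = 1.5;    // triggers the conversion exactly once
    track->pt += 0.5;   // second access reuses the transient copy
    return 0;
  }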

32
Schema Evolution & Object Conversion
  • BaBar - Objectivity/DB
    • presented a conversion scheme using their
      transient/persistent mapping
  • STAR - ROOT
    • implemented an additional conversion mechanism
      which replaces the user schema evolution provided
      by ROOT
  • CLEO III - Objectivity/DB
    • implements a system based on opaque data objects
      stored in Objectivity
  • Is an experiment-independent implementation of
    schema evolution possible? (see the version-tag
    sketch below)
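
As one illustration, here is a minimal sketch of version-tagged object
conversion, the mechanism common to such systems; the structs and
version numbers are invented for illustration.

  // Hypothetical sketch: convert stored objects to the current
  // in-memory schema based on a version tag stored with the data.
  #include <iostream>
  #include <stdexcept>

  struct TrackV1 { double pt; };              // old on-disk layout
  struct TrackV2 { double pt; int charge; };  // current schema

  // One conversion routine per historical schema version.
  TrackV2 convert(int schemaVersion, const void* raw) {
    switch (schemaVersion) {
      case 1: {                               // old data: default new field
        const TrackV1* v1 = static_cast<const TrackV1*>(raw);
        TrackV2 t; t.pt = v1->pt; t.charge = 0;
        return t;
      }
      case 2:                                 // current layout: plain copy
        return *static_cast<const TrackV2*>(raw);
      default:
        throw std::runtime_error("unknown schema version");
    }
  }

  int main() {
    TrackV1 old = { 2.5 };                    // pretend this was read back
    TrackV2 now = convert(1, &old);
    std::cout << now.pt << " " << now.charge << "\n";
    return 0;
  }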

33
From Data Storage to Data Management
  • Consistent management of a distributed data store
    needs knowledge about the semantics of the data
    • which files belong to one event collection, run or
      calibration period - they should be
      • discarded together
      • staged together
      • exported together
      (see the grouping sketch below)
  • Strong coupling to system details
    • application logic
    • batch system
    • mass storage system
  • Significantly larger functionality and complexity
    • significant development effort
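
A minimal sketch of that grouping as a data structure, so that
lifecycle operations act on whole datasets rather than single files;
the Dataset type and stage operation are hypothetical.

  // Hypothetical sketch: group the files of one run/collection so
  // that staging, export and deletion always act on the whole group.
  #include <cstddef>
  #include <iostream>
  #include <string>
  #include <vector>

  struct Dataset {
    std::string name;                   // e.g. one run or event collection
    std::vector<std::string> files;     // all files carrying its data
  };

  // Placeholder for a call into the mass storage system.
  void stageFile(const std::string& f) {
    std::cout << "staging " << f << "\n";
  }

  void stage(const Dataset& ds) {       // whole-dataset operation
    for (std::size_t i = 0; i < ds.files.size(); ++i)
      stageFile(ds.files[i]);
  }

  int main() {
    Dataset run;
    run.name = "run1234";
    run.files.push_back("run1234_events.root");
    run.files.push_back("run1234_tags.root");
    stage(run);                         // both files move together
    return 0;
  }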

34
Performance Optimisation of Complex Storage Systems
  • Successful system optimisation requires
    correlated diagnostics on all levels
    • Mass Storage System
      • number of mounted tapes, file lifetime in disk
        pool
    • Data Server
      • I/O per server, per filesystem, per network
        interface
    • Lock Server
      • number of locks, number of waiting processes,
        locked resources
    • Client Host
      • I/O per client, per filesystem, per machine,
        total CPU usage
      • number of running processes
    • Client Application
      • number of used objects, containers and databases,
        transaction timing (see the timer sketch below)
      • regular profiling runs
  • All system components need monitoring
    instrumentation
    • understanding chaotic areas like analysis
      servers is definitely non-trivial
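
At the client application level, here is a minimal sketch of the kind
of instrumentation meant above: a scoped timer around one operation;
the class is invented and no particular monitoring API is implied.

  // Hypothetical sketch: scoped timer for client-side diagnostics,
  // e.g. around a transaction, so its timing can be correlated with
  // server and mass storage level monitoring.
  #include <ctime>
  #include <iostream>
  #include <string>

  class ScopedTimer {
    std::string label_;
    std::clock_t start_;
  public:
    explicit ScopedTimer(const std::string& label)
      : label_(label), start_(std::clock()) {}
    ~ScopedTimer() {                    // report when the scope ends
      double s = double(std::clock() - start_) / CLOCKS_PER_SEC;
      std::cout << label_ << ": " << s << " s CPU\n";
    }
  };

  int main() {
    {
      ScopedTimer t("commit transaction");   // instrument one operation
      for (volatile long i = 0; i < 10000000; ++i) {}  // stand-in workload
    }
    return 0;
  }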

35
Transactions & Recovery
  • Are transactions needed to allow fail-safe
    concurrent access?
  • Is it cheaper/easier to work in the old (manual)
    way?
    • with sequential recovery, just throw away the
      last file or the last group of files and change
      some metadata
    • application level consistency checks?
  • The IT industry seems to have a different opinion
    • use transactions to enforce consistency between
      the different parts of the store (see the sketch
      below)
  • Is the recovery of our data and metadata really
    that much simpler?
  • How does one integrate transactions across multiple
    storage systems?
  • The production experience of the next generation
    of experiments will tell us more.
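
A minimal sketch of the transactional alternative, assuming a
hypothetical store with begin/commit/abort; it shows the all-or-nothing
update of event data and metadata that the manual approach would have
to reconstruct by hand.

  // Hypothetical sketch: keep event data and its metadata consistent
  // by committing both inside one transaction, or rolling both back.
  #include <iostream>
  #include <stdexcept>

  // Stand-in for a transactional store; the interface is invented.
  struct Store {
    void begin()  { std::cout << "begin\n"; }
    void commit() { std::cout << "commit\n"; }
    void abort()  { std::cout << "abort\n"; }
    void writeEvent(int id)    { std::cout << "event " << id << "\n"; }
    void writeMetadata(int id) { std::cout << "meta  " << id << "\n"; }
  };

  int main() {
    Store store;
    store.begin();
    try {
      store.writeEvent(42);
      store.writeMetadata(42);   // both updates become visible together
      store.commit();
    } catch (const std::exception& e) {
      store.abort();             // recovery is just the rollback
      std::cerr << "rolled back: " << e.what() << "\n";
    }
    return 0;
  }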

36
Summary of the Summary
  • Significant progress in providing object
    persistency for a real life experiment
    • BaBar successfully went into production with an
      ODBMS based store
    • management of complex systems is a significant
      effort
  • Solutions for schema evolution, insulation
    layers and data import/export have been developed
    for specific experiment frameworks
    • can some of those solutions be generalised?
  • Still open questions
    • direct use of persistent objects or converted
      copies?
    • single ODBMS system or files + metadata in an
      RDBMS?
  • More experience needed from running experiments
    • RHIC and Fermilab Run II experiments will soon be
      able to tell us more