ATLAS DQA: Data Quality Assessment

1
ATLAS DQA: Data Quality Assessment
  • Rob McPherson
  • UVic/IPP
  • Input from Alina Corso-Radu, Claude Guyot,
    Michael Hauschild, Haleh Hadavand, Richard
    Hawkings, Beate Heinemann, Andreas Hoecker, Steve
    Hillier, Bob Kehoe, Sergei Kolos, Rolf Seuster
    ...
  • (although the opinions are mine)
  • ATLAS Trigger Physics Week
  • 30 October 2006

2
Where we hope we're headed
  • By first collisions, one year from now, we want:
  • Well-defined set of conditions for declaring data
    good for specific uses
  • Detector experts: detector DQ status requirements
    and tests
  • Combined performance experts: combined DQ status
    tests
  • Physics experts: physics object DQ status
    validation
  • Fast online or near-real-time (up to a few hours
    based on express streams) error/warning flagging
  • Automatic notification of shift crew of problems
  • Histograms and other information immediately
    available for experts to evaluate problems (web
    and histogram access)
  • Fine monitoring from bulk reconstruction (24-48
    hours) and later reprocessing
  • Notification of production team of possible
    problems
  • Higher statistics histograms etc. available for
    expert analysis
  • Easy access to DQA status for TAG DB queries and
    from Athena

3
Where we are now
  • Significant and developing expertise in the
    subdetector communities for DQ checks
  • Developed/developing testbeam, integration,
    commissioning
  • Also offline for Monte Carlo and software
    validation
  • Detectors are already working actively without
    much central coordination; we must build on that
    effort
  • Design of an online DQ monitoring framework, DQMF
  • See https://edms.cern.ch/document/770411/1.0
  • Draft of requirements for overall DQA
  • See https://edms.cern.ch/document/789017/1
  • Ongoing technical discussions between online and
    offline, but no conclusion yet
  • Not obvious if DQMF usable for Tier0, Tier1 ...
    DQA

4
  • I should probably just stop here
  • (you should be so lucky)

5
Data Flow
  • [Figure: data flow diagram; only the box labels
    survive, grouped here approximately]
  • Online: Front-end, RODs, LVL1, LVL2, Event
    Builder, SFI(s), EF, SFOs, DCS, Online DQA
  • Tier 0: express and calib streams, fast reco and
    calibrate, verify, prompt calib, digested status,
    prompt reco (bulk), Offline DQA, TAG DB
  • Databases: DB from online (config, calib, DCS,
    monitor), Tier-1 Oracle replica, Tier-2 replica
  • Rates: RAW 200 Hz, 320 MB/s; ESD 100 MB/s;
    AOD 20 MB/s
  • Transfers: Tier-1 transfer, Tier-2 transfer

6
DQA use-cases
  • Data Quality
  • Starts from very basic DCS and DAQ status
  • Online histogram filling / merging / checking for
    initial feedback
  • Offline histogram filling / merging / checking
    for refined analysis
  • Online DQA based on DCS and early event
    monitoring
  • From ROD (including GNAM)
  • LVL1 and LVL2
  • At Event Filter? May not be resources for broader
    monitoring on EF
  • Remote monitoring farms??
  • Sampled monitoring from SFI or SFO?
  • Offline DQA
  • Tier 0 express / calib streams
  • Tier 0 bulk reprocessing streams
  • For validation and finding problems with higher
    statistics
  • Tier 1 reprocessing
  • For validation and finding problems with higher
    statistics
  • Tier 2?
  • Can imagine use-cases for validation production
    of specialized AOD
  • Will we validate Monte Carlo with this system?

7
Infrastructure: DQMF Status I
  • Data Quality Monitoring Framework (DQMF) designed
    primarily for online use
  • Active developer team
  • Haleh Hadavand, Michael Hauschild, Bob Kehoe,
    Sergei Kolos, Alina Corso-Radu ...
  • Two documents on DQMF
  • Requirements: https://edms.cern.ch/document/719917/1.0/
  • Architecture: https://edms.cern.ch/document/770411/1.0
  • DQMF design assumes strong dependencies on online
    software packages and services
  • Compatibility with offline use under discussion

8
DQMF Status II
  • DQMF design (Sergei Kolos et al.)
  • DQMF core
  • Depends on online system for configuration,
    control, access to online histo service,
    monitoring (histogram) archive, information
    service (including DCS info), and error reporting
    service

9
Central DQ infrastructure: where are we headed?
  • Have (at least) four ways to proceed
  • (1) Assume full installation of online software
    release everywhere DQMF will ever run (Tier 0,
    Tier 1, Tier 2, development desktops)
  • (2) Build interface layer for DQMF so that it
    performs all communication via interfaces and
    does not depend on online software, providing
    corresponding offline implementations
  • (3) Extract core histogram checking etc.
    algorithms and build another engine for driving
    them offline (at least keep same checking ability
    online and offline)
  • (4) Redevelop everything from scratch in offline
  • Options have different implications
  • (1) Very unlikely to happen
  • (2) Requires significant work from online and
    offline SW groups
  • (3) Need to re-invent the engine for driving
    checks, requiring significant offline effort
  • (4) Complete decoupling will lead to divergence
    and the most effort in the long term
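
The second option above (an interface layer with separate online and offline implementations) can be sketched roughly as follows. This is only an illustration in Python: every name here (`HistogramSource`, `InMemorySource`, `mean_in_range`) is hypothetical and is not part of DQMF or of the online software.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class HistogramSource(ABC):
    """Hypothetical interface: the only way the checking code sees histograms."""

    @abstractmethod
    def get(self, name: str) -> List[float]:
        """Return the bin contents of the named histogram."""


class InMemorySource(HistogramSource):
    """Stand-in for an offline backend, e.g. histograms already read from a file."""

    def __init__(self, histograms: Dict[str, List[float]]):
        self._histograms = histograms

    def get(self, name: str) -> List[float]:
        return self._histograms[name]


def mean_in_range(source: HistogramSource, name: str, low: float, high: float) -> bool:
    """Toy check: is the mean bin index of the histogram within [low, high]?"""
    bins = source.get(name)
    total = sum(bins)
    if total == 0:
        return False  # an empty histogram cannot pass the check
    mean = sum(index * content for index, content in enumerate(bins)) / total
    return low <= mean <= high


# The check depends only on the interface, so an online-backed
# implementation could be swapped in without touching mean_in_range.
source = InMemorySource({"cluster_energy": [0.0, 10.0, 0.0]})
passed = mean_in_range(source, "cluster_energy", 0.5, 1.5)
```

The point of the sketch is the dependency direction: the checking code imports the interface, never a concrete online or offline service.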

10
For use at Tier 0/1/2
  • Integration with production system needed
  • Inclusion of monitoring tools into the offline
    data reconstruction transformation.
  • Production of a monitoring histogram file or
    files in each reconstruction job.
  • Merging of monitoring histogram files.
  • Archiving and cataloguing histograms into the
    ATLAS distributed data management system.
  • Checking histograms against reference histograms
    (using the DQMF framework?)
  • Generate warnings or errors.
  • Report errors back to the production system for
    notification of experts.
  • Update data quality status in the conditions
    database for use by offline analysis.
  • None of this exists (yet) in a central system
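
The merge, check and flag steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production design: histograms are plain lists of bin contents, the comparison is a simple Pearson chi-square on shape, and the function names and the `warn`/`error` thresholds are invented for the example.

```python
def merge(histograms):
    """Sum per-job histograms bin by bin (all must share the same binning)."""
    return [sum(bins) for bins in zip(*histograms)]


def chi2_per_bin(observed, reference):
    """Shape comparison: scale the reference to the observed total,
    then compute Pearson chi-square divided by the number of bins used.
    Bins with zero expected content are skipped (a simplification)."""
    n_obs, n_ref = sum(observed), sum(reference)
    if n_obs == 0 or n_ref == 0:
        return float("inf")
    scale = n_obs / n_ref
    chi2, used = 0.0, 0
    for obs, ref in zip(observed, reference):
        expected = ref * scale
        if expected > 0:
            chi2 += (obs - expected) ** 2 / expected
            used += 1
    return chi2 / used if used else float("inf")


def assess(observed, reference, warn=2.0, error=5.0):
    """Map the comparison result to 'ok', 'warning' or 'error'.
    The thresholds are arbitrary placeholders."""
    value = chi2_per_bin(observed, reference)
    if value >= error:
        return "error"
    if value >= warn:
        return "warning"
    return "ok"


# Two reconstruction jobs each produce one monitoring histogram.
job_histograms = [[10, 20, 10], [10, 20, 10]]
reference = [100, 200, 100]
status = assess(merge(job_histograms), reference)
```

In a real system the returned status would be reported to the production system and written to the conditions database, as the list above describes; those steps are left out here.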

11
DQ data archive
  • Stores histograms being produced for monitoring
  • Might also be source of reference histograms (to
    be decided)
  • Online system design exists
  • Monitoring Data Archive, MDA, Federico Zema
  • Built around fast disk buffers and castor
  • See https//edms.cern.ch/document/713107/1.1
  • For offline use
  • Want a system built around standard conditions DB
    tools
  • Require offline and expert access, including
    adding new histograms
  • Histograms etc. need to be part of standard
    DDM/DQ2 so sites can subscribe to their desired
    data as needed
  • None of this exists yet in offline

12
DQ display/analysis environment
  • Many users/developers may not have access to
    Online Histogram Presenter
  • A web-based browsing system desirable for
    convenient expert checks, but may not be enough
    for detailed diagnosis and analysis
  • Want to develop a simple offline system for DQ
    tool development and expert analysis
  • File interface
  • Simple macros for selecting input files,
    reference files, browsing, plotting
  • Provides interactive features suitable for
    development environment for people who are not
    complete experts
  • Doesn't exist yet in any central way, although
    versions exist in the subdetector communities

13
Good/bad run/LB flagging (I)
  • DQ granularity at detector and physics level
  • Desirable to have DQ status available at detector
    and physics levels, e.g.
  • Good LAr, barrel pixel, SCT, ...
  • Good ETMiss, barrel b-tagging, ...
  • Imagine data traffic lights
  • green = good, yellow = questionable, red = bad
  • No more than coarse detector region granularity
    (barrel/endcap)
  • Implemented in separate CondDB folders depending
    on where it comes from
  • DCS, TDAQ
  • calibration/alignment
  • offline reconstruction
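
The traffic-light idea can be sketched as a small ordered enumeration. The slides do not say how detector-level flags would be combined into a physics-level flag; the "worst flag wins" rule below, and the names `Flag` and `combine`, are assumptions made for illustration only.

```python
from enum import IntEnum


class Flag(IntEnum):
    """Traffic-light status, ordered so that a larger value is worse."""
    GREEN = 0   # good
    YELLOW = 1  # questionable
    RED = 2     # bad


def combine(flags):
    """Assumed combination rule: the worst input flag wins.
    Requires at least one flag."""
    return max(flags)


# Hypothetical detector-level flags at coarse (barrel/endcap) granularity.
detector_status = {
    ("LAr", "barrel"): Flag.GREEN,
    ("pixel", "barrel"): Flag.YELLOW,
    ("SCT", "barrel"): Flag.GREEN,
}
barrel_status = combine(detector_status.values())
```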

14
Good/bad run/LB flagging (II)
  • DQ granularity at time level needs discussion.
    A possible proposal (Hawkings):
  • IoV in natural granularity of each CondDB folder
  • DCS time stamps
  • DAQ/reco/calibration event stamps
  • A CondDB folder indexed by Luminosity Block
    number would contain the combined status
    information contained in the above folders. This
    folder with LB granularity would be used for
    analysis requiring the integrated luminosity
    determination
  • These folders should have versioning capabilities
    to cope with the time varying DQ assessment of a
    given data set
  • NB: in the above scheme, individual events have a
    status beyond just their corresponding luminosity
    block. Could ask an event for:
  • Its own data quality status
  • The data quality status of its luminosity block

15
Good/bad run/LB flagging (III)
  • For event selection at TAG or AOD level
  • Users could still check event status in CondDB
  • Advantage: best known status available
  • Disadvantage: CondDB must be accessible for all
    analysis
  • Status could be copied into AOD when they are
    created
  • Advantage: very simple to use and always
    available
  • Disadvantage: status will change with time
  • Status could be copied into TAG DB as a new table
  • E.g. copy status from CondDB for each new CondDB
    tag
  • Allows reproducible queries (CondDB tag part of
    query)
  • Disadvantage: TAG queries must be redone with new
    CondDB tags
  • Best known status may not be available in TAG DB
    queries
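
The third option (status copied into the TAG DB once per CondDB tag) can be illustrated with an in-memory SQLite database. The table names, column names and tag names below are invented for the example and do not reflect the real TAG DB schema; the point is only that making the CondDB tag part of the query makes the result reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag_events (run INTEGER, event INTEGER, lb INTEGER);
CREATE TABLE dq_status (conddb_tag TEXT, run INTEGER, lb INTEGER, flag TEXT);
""")
conn.executemany(
    "INSERT INTO tag_events VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 1), (1, 3, 2)],
)
# One copy of the status per CondDB tag; the later tag downgrades LB 2.
conn.executemany(
    "INSERT INTO dq_status VALUES (?, ?, ?, ?)",
    [
        ("DQ-v1", 1, 1, "green"), ("DQ-v1", 1, 2, "green"),
        ("DQ-v2", 1, 1, "green"), ("DQ-v2", 1, 2, "red"),
    ],
)


def good_events(connection, conddb_tag):
    """Select events whose luminosity block is green under the given tag."""
    rows = connection.execute(
        "SELECT e.run, e.event FROM tag_events e "
        "JOIN dq_status s ON s.run = e.run AND s.lb = e.lb "
        "WHERE s.conddb_tag = ? AND s.flag = 'green' "
        "ORDER BY e.run, e.event",
        (conddb_tag,),
    )
    return rows.fetchall()


selection_v1 = good_events(conn, "DQ-v1")
selection_v2 = good_events(conn, "DQ-v2")
```

The same query with the same tag always returns the same events, while moving to a newer tag requires rerunning the query, which is the trade-off listed above.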

16
Summary/comments
  • Central Data Quality group starting now
  • Currently trying to understand infrastructure
    issues across the entire online, production and
    analysis chain
  • Real DQ group mandate is primarily to integrate
    subsystem, combined performance and physics
    efforts under a coordinated umbrella
  • Define checks required to assign data quality
    status
  • Develop and integrate tools for DQA
  • Next steps
  • Work with online and offline teams to fully
    define and implement missing infrastructure
  • Build team of DQ experts from systems and physics
    groups