Automatic Data Quality Monitoring in the BaBar Online and Offline Systems - PowerPoint PPT Presentation

Slides: 17
Provided by: chep200
Transcript and Presenter's Notes


1
Automatic Data Quality Monitoring in the BaBar
Online and Offline Systems
  • Scott D. Metzler
  • California Institute of Technology
  • For the BaBar Computing Group

2
Context
  • Online System
      • 32 Sun Ultra-5 workstations
      • 2000 Hz maximum input rate into Level 3
      • Level 3 reduces the rate to 100 Hz.
      • Real-time monitoring is a system requirement.
  • Diagnostic Data
      • Same set of diagnostic data is produced on all 32
        nodes.
      • Data summed over all nodes is available to GUIs
        and automatic monitor.
      • Multiple levels of monitoring are available.
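
The summing of per-node diagnostic data mentioned on this slide can be pictured with a minimal sketch. It is not BaBar code: the function name and the representation of a histogram as a plain list of bin contents are assumptions made for illustration.

```python
def sum_over_nodes(node_histograms):
    """Return the bin-wise sum of equally binned per-node histograms.

    Each histogram is a list of bin contents (an assumed representation).
    """
    if not node_histograms:
        raise ValueError("need at least one node histogram")
    n_bins = len(node_histograms[0])
    if any(len(h) != n_bins for h in node_histograms):
        raise ValueError("all nodes must use the same binning")
    # zip(*...) groups the contents of bin i from every node together.
    return [sum(column) for column in zip(*node_histograms)]
```

Because every node produces the same set of objects with the same binning, the sum is a plain element-wise addition.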

3
Need for Automation
  • Thousands of diagnostic objects are produced for
    each run.
  • These are organized by detector system.
  • Systems typically provide a high-level diagnostic
    page for use within JAS for shift monitoring.
  • Some plots are too subtle for shift crews to
    digest.
  • Inconsistent checking of data is common depending
    on staffing.
  • Automatic monitoring provides:
      • consistent checking
      • objective, system-defined tests
      • greater coverage of the detector

4
Diagnostic Data Types
  • Histograms provide time-integrated monitoring of
    system-defined quantities.
  • Three types of histograms are available:
      • 1D
      • 2D
      • 1D Profile
  • Histogram contents can be monitored as the total
    sum since the beginning of the run or as the sum
    since the last automatic comparison.
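
The two accumulation modes described above (total since the start of the run, or sum since the last automatic comparison) might be sketched as follows. This is an illustration under stated assumptions, not the BaBar implementation: the class `MonitoredHistogram` and its method names are hypothetical.

```python
class MonitoredHistogram:
    """1D histogram that can report totals or the change since the last comparison."""

    def __init__(self, n_bins):
        self.total = [0] * n_bins                  # contents since the start of the run
        self._at_last_comparison = [0] * n_bins    # snapshot taken at the last comparison

    def fill(self, bin_index, weight=1):
        self.total[bin_index] += weight

    def since_last_comparison(self):
        """Contents accumulated after the most recent mark_compared() call."""
        return [t - s for t, s in zip(self.total, self._at_last_comparison)]

    def mark_compared(self):
        """Record the current contents as the baseline for the next comparison."""
        self._at_last_comparison = list(self.total)
```

Keeping only a snapshot, rather than a second histogram that is reset, means the run total is never lost when a comparison is made.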

5
Histograms Displayed in JAS

6
Diagnostic Data Types (Cont.)
  • Scalers provide tracking of quantities over time.
  • Each scaler has a rotating buffer of time bins.
    The bins are synchronized over nodes.
  • Scaler Groups control the granularity of bins.
  • Four types of scalers are available:
      • Averaging (weighted average over time/nodes)
      • Integrating (summed over time/nodes)
      • Value (set over time; single node only)
      • Multi (a list of the above types)
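
A rotating buffer of time bins can be sketched as below. This is an illustrative assumption, not the BaBar design: the class `Scaler`, its parameters, and the idea of deriving the bin index from a shared clock (so that all nodes agree on bin boundaries) are hypothetical. Timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class Scaler:
    """Fixed number of time bins; the oldest bin is dropped when a new one opens."""

    def __init__(self, n_bins, bin_width_s):
        self.bin_width_s = bin_width_s
        # Each entry is [bin_index, weighted_sum, weight_sum].
        self._bins = deque(maxlen=n_bins)

    def record(self, timestamp_s, value, weight=1.0):
        # The bin index depends only on the timestamp, so nodes sharing a
        # clock place the same instant in the same bin.
        index = int(timestamp_s // self.bin_width_s)
        if not self._bins or self._bins[-1][0] != index:
            self._bins.append([index, 0.0, 0.0])
        current = self._bins[-1]
        current[1] += value * weight
        current[2] += weight

    def integrated(self):
        """Per-bin sums (meaningful when every weight is 1)."""
        return [(i, s) for i, s, _ in self._bins]

    def averaged(self):
        """Per-bin weighted averages."""
        return [(i, s / w) for i, s, w in self._bins if w > 0]
```

With `maxlen` set, `deque.append` discards the oldest bin automatically, which gives the rotating behaviour without explicit index arithmetic.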

7
Scalers Displayed in JAS
  [Figure annotation: New Run]

8
Conceptual Design
  [Diagram: hbook, Network, Data Retriever, Comparator, Fit, χ²,
  Comparison Record, Manager, Responses, GUIs; links labelled with
  multiplicities 1, 2, and N]

9
Comparison Techniques
  • Fixed Spectrum
      • Compare histograms against a reference histogram
        using Kolmogorov-Smirnov or chi-squared testing.
      • Compare individual bins against a reference,
        looking for hot or dead channels. A single bad
        bin causes an error.
      • Comparison against parameterized functions is
        available.
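
The fixed-spectrum comparisons listed above can be illustrated with a short sketch. It makes several assumptions that are not in the talk: histograms are plain lists with identical binning, the reference is rescaled to the observed total and treated as exact (a Pearson chi-squared), and the hot/dead thresholds `low` and `high` are arbitrary example values. All function names are hypothetical.

```python
def ks_distance(observed, reference):
    """Largest gap between the two normalised cumulative distributions."""
    obs_total, ref_total = sum(observed), sum(reference)
    if obs_total <= 0 or ref_total <= 0:
        raise ValueError("both histograms need a positive total")
    obs_cum = ref_cum = worst = 0.0
    for o, r in zip(observed, reference):
        obs_cum += o / obs_total
        ref_cum += r / ref_total
        worst = max(worst, abs(obs_cum - ref_cum))
    return worst


def chi_squared(observed, reference):
    """Pearson chi-squared against the reference scaled to the observed total."""
    scale = sum(observed) / sum(reference)
    chi2 = 0.0
    for o, r in zip(observed, reference):
        expected = r * scale
        if expected > 0:  # bins empty in the reference are skipped
            chi2 += (o - expected) ** 2 / expected
    return chi2


def bad_bins(observed, reference, low=0.5, high=2.0):
    """Indices of bins whose ratio to the scaled reference is outside [low, high]."""
    scale = sum(observed) / sum(reference)
    bad = []
    for i, (o, r) in enumerate(zip(observed, reference)):
        expected = r * scale
        if expected > 0 and not (low <= o / expected <= high):
            bad.append(i)  # dead if far below the reference, hot if far above
    return bad
```

A shape test such as `ks_distance` can miss a single dead channel in a histogram with many bins, which is one reason a separate bin-by-bin check is useful.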

10
Comparison Techniques (Cont.)
  • Fitting
      • Fitting is intended to handle histograms that are
        difficult to compare against fixed spectra
        because of changing conditions.
      • Detector systems define the function to which
        they wish to fit a histogram. They also define
        the allowed ranges of the fitted parameters.
      • It is possible to ignore certain parameters (e.g.
        background fraction) in the comparison.
      • Fitting is not fully available yet, but we
        anticipate that it will be soon.
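
The check on fitted parameters described above might look like the following sketch. The fit itself is assumed to have been done elsewhere; the function name `check_fit` and the parameter names in the usage example are hypothetical and chosen only for illustration.

```python
def check_fit(fitted, allowed, ignored=()):
    """Return {name: value} for every checked parameter outside its allowed range.

    fitted  -- fitted parameter values keyed by name
    allowed -- (low, high) range keyed by name; a checked parameter with no
               range raises KeyError, so missing configuration is not silent
    ignored -- names excluded from the comparison
    """
    failures = {}
    for name, value in fitted.items():
        if name in ignored:
            continue
        low, high = allowed[name]
        if not (low <= value <= high):
            failures[name] = value
    return failures


# Usage with made-up numbers: the width is out of range, and the
# background fraction is fitted but deliberately not checked.
fit_failures = check_fit(
    fitted={"mean": 0.497, "width": 0.009, "background_fraction": 0.8},
    allowed={"mean": (0.49, 0.505), "width": (0.002, 0.006)},
    ignored={"background_fraction"},
)
```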

11
Comparison Techniques (Cont.)
  • Monitoring Scalers
      • Comparison against a fixed range.
      • Comparison as a function of other scalers (e.g.
        luminosity).
      • Scaler comparisons are also not available yet,
        but are anticipated.
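
Since scaler comparisons were not yet available at the time of the talk, the following is only a guess at their shape. It assumes that "as a function of other scalers" means checking a ratio, for example a rate divided by luminosity; both function names are hypothetical.

```python
def in_fixed_range(value, low, high):
    """Fixed-range comparison: True when the scaler value is acceptable."""
    return low <= value <= high


def ratio_in_range(value, reference, low, high):
    """Check value / reference, e.g. a rate normalised to luminosity."""
    if reference <= 0:
        return False  # cannot normalise; treated here as a failed comparison
    return low <= value / reference <= high
```

Normalising to another scaler lets the allowed range stay fixed while running conditions change.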

12
Responding to Problems
  • The comparison techniques return a value which is
    passed to user-defined responses.
  • The responses are triggered if the comparison
    falls outside of allowed bounds.
  • Systems define the severity of the error based on
    the return value and determine how to respond to
    the error.
      • E-mail
      • Occurrence Logger
  • Multiple responses are possible for a single
    comparison.
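
The response mechanism on this slide can be sketched as a list of responses, each with its own allowed bounds and severity, all attached to one comparison. This is an assumption-laden illustration: `Response`, `dispatch`, and the two stand-in actions (lists playing the role of the Occurrence Logger and of e-mail) are hypothetical, not the real interfaces.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Response:
    name: str
    low: float      # allowed bounds on the comparison's return value
    high: float
    severity: str
    action: Callable[[str, float], None]


def dispatch(value, responses):
    """Run every response whose bounds the value violates; return their names."""
    triggered = []
    for response in responses:
        if not (response.low <= value <= response.high):
            response.action(response.severity, value)
            triggered.append(response.name)
    return triggered


# Stand-ins for the real sinks: a mild deviation is only logged,
# a large one is logged and also mailed.
occurrence_log = []
outbox = []
responses = [
    Response("log", 0.0, 5.0, "warning", lambda sev, v: occurrence_log.append((sev, v))),
    Response("mail", 0.0, 20.0, "error", lambda sev, v: outbox.append((sev, v))),
]
```

Giving each response its own bounds is one way to let a system grade severity from the same return value.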

13
Graphical Tools
  • The Occurrence Logger gives the shift crew a list
    of potential problems to investigate in
    real time. Feedback capabilities are being
    improved.
  • A custom GUI provides control of, and performance
    information about, the automatic monitoring system so
    that it can be tuned.
  • Command-line administration is available for use
    with Run Control.
  • Integration with JAS is a longer-term goal.

14
Error Browser

15
Automatic Monitoring Control

16
Lessons Learned and Conclusions
  • This system requires significant user
    configuration. We would have benefited from
    providing an early prototype to familiarize users
    with what was coming.
  • The system has been shown to be well abstracted
    and extensible.
  • The system is now in production use comparing
    histograms against fixed references.
  • More advanced comparisons are coming soon.