Nonbio necro sciences Jim Frew, Bob Mann - PowerPoint PPT Presentation

About This Presentation
Title:

Nonbio necro sciences Jim Frew, Bob Mann

Description:

Chicago Provenance Workshop. Non-bio (necro-?) sciences (Jim Frew, Bob Mann) ... Evolution in astronomical practice ...


Transcript and Presenter's Notes


1
Non-bio (necro-?) sciences (Jim Frew, Bob Mann)
  • Examples of current practice and issues
  • Astronomy: Bob Mann, Alex Szalay
  • Earth Sciences: Jim Frew, Dave Maier
  • Others
  • Draw up list of issues
  • Discussion

2
Some provenance and data derivation issues in
astronomy
  • Bob Mann
  • Institute for Astronomy, Edinburgh Univ.
  • National e-Science Centre

3
Outline
  • Trends in astronomy: implications for provenance
  • Two provenance issues
  • Recording provenance in the FITS data format
  • Provenance in database federation
  • Alex Szalay
  • Provenances in pipelines and databases
  • Annotations in astronomy databases

4
Evolution in astronomical practice
  • Collectivisation and the empowerment of the
    individual
  • Fewer individual observational programmes and
    more sky surveys
  • More people access the data, via archives
  • The specialist is dead, long live the
    generalist
  • Use multi-wavelength data
  • Expertise in classes of astronomical object, not
    observational techniques

5
Implications for provenance
  • More science being done with data that the
    individual scientist didn't take
  • and about which the scientist knows less
  • More reliance on pipeline processing
  • More science with catalogues of source attributes
    derived from primary data
  • More science being done through combining data
    from multiple sources (more later)

6
FITS: Flexible Image Transport System
  • Format of a FITS file (http://fits.gsfc.nasa.gov)
  • Primary Header: metadata describing instrument,
    observation and file contents
  • Primary Data Array: array of 0-999 dimensions,
    usually a 2D image
  • Zero or more Extensions
  • Array, ASCII Table or Binary Table, each with
    Header
  • (New FITS-inspired XML format: VOTable)
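
The header layout on this slide can be illustrated with a minimal sketch. It relies on two conventions from the FITS standard that the slide does not spell out: headers are stored in 2880-byte blocks of 80-character ASCII cards, and a header ends with an END card. The names `read_header_cards` and `make_header` and the in-memory demo header are illustrative, not taken from the presentation.

```python
import io

BLOCK_SIZE = 2880   # FITS files are organised in 2880-byte blocks
CARD_SIZE = 80      # each header card is 80 ASCII characters


def read_header_cards(stream):
    """Return the header cards (as strings) up to, but excluding, END."""
    cards = []
    while True:
        block = stream.read(BLOCK_SIZE)
        if len(block) < BLOCK_SIZE:
            raise ValueError("truncated header block")
        for i in range(0, BLOCK_SIZE, CARD_SIZE):
            card = block[i:i + CARD_SIZE].decode("ascii")
            if card.rstrip() == "END":
                return cards
            cards.append(card)


def make_header(lines):
    """Build a space-padded header in memory (hypothetical demo helper)."""
    text = "".join(line.ljust(CARD_SIZE) for line in lines + ["END"])
    padding = -len(text) % BLOCK_SIZE
    return (text + " " * padding).encode("ascii")


demo = make_header([
    "SIMPLE  =                    T / file does conform to FITS standard",
    "BITPIX  =                   16 / number of bits per data pixel",
    "NAXIS   =                    0 / number of data axes",
])
cards = read_header_cards(io.BytesIO(demo))
print(len(cards), cards[0][:8].strip())
```

In practice an established FITS library would normally be used; the sketch only shows how little structure the container itself imposes on the header.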

7
FITS header entries
  • Keyword-value pairs plus optional comment
  • e.g. PLTSCALE = '67.14 ' / arcsec/mm
    plate scale
  • Three types of header keyword
  • Mandatory, e.g. NAXIS
  • Optional, e.g. DATAMAX
  • Additional, i.e. user-defined, but not from
    restricted list (mandatory and optional)
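
As a rough illustration of the keyword, value and comment structure, the sketch below splits a single card into its three parts. It assumes the fixed-format convention of the FITS standard (keyword in columns 1-8, `= ` in columns 9-10, comment after a `/`), returns values as untyped strings, and ignores several corners of the standard. The function name `parse_card` is hypothetical.

```python
def parse_card(card):
    """Split one header card into (keyword, value, comment).

    Simplified sketch: values are returned as strings, not typed.
    """
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        # Commentary cards (HISTORY, COMMENT, ...) carry free text only.
        return keyword, None, card[8:].strip()
    rest = card[10:].strip()
    if rest.startswith("'"):
        # String value: find the closing quote, treating '' as an escaped quote.
        i = 1
        while i < len(rest):
            if rest[i] == "'":
                if rest[i + 1:i + 2] == "'":
                    i += 2
                    continue
                break
            i += 1
        value = rest[1:i].replace("''", "'").rstrip()
        comment = rest[i + 1:].lstrip().lstrip("/").strip()
    else:
        value, _, comment = rest.partition("/")
        value, comment = value.strip(), comment.strip()
    return keyword, value, comment


print(parse_card("PLTSCALE= '67.14   '           / arcsec/mm plate scale"))
print(parse_card("NAXIS   =                    2 / number of data axes"))
```

The string branch matters because a comment such as `arcsec/mm plate scale` itself contains a `/`, so splitting on the first slash alone would only be safe after the quoted value has been consumed.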

8
Provenance in FITS headers
  • Many optional keywords related to provenance
  • ORIGIN, DATE-OBS, TELESCOP, INSTRUME, OBSERVER,
    REFERENC
  • plus HISTORY: "The text should contain a history
    of steps and procedures associated with the
    processing of the associated data. Any number of
    HISTORY card images may appear in a header."
    (FITS Standard)
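
A minimal sketch of how the provenance keywords listed on this slide might be gathered from an already-parsed header. The input is assumed to be a list of `(keyword, text)` pairs; the function name `collect_provenance` and the example values are made up for illustration.

```python
PROVENANCE_KEYWORDS = ("ORIGIN", "DATE-OBS", "TELESCOP", "INSTRUME",
                       "OBSERVER", "REFERENC")


def collect_provenance(cards):
    """Gather provenance keywords and HISTORY text from (keyword, text) pairs."""
    record = {"keywords": {}, "history": []}
    for keyword, text in cards:
        if keyword in PROVENANCE_KEYWORDS:
            record["keywords"][keyword] = text
        elif keyword == "HISTORY":
            # HISTORY may repeat any number of times; keep header order.
            record["history"].append(text)
    return record


# Illustrative cards; the values are invented for the example.
cards = [
    ("TELESCOP", "EXAMPLE-SCOPE"),
    ("NAXIS", "2"),
    ("HISTORY", "bias subtracted"),
    ("HISTORY", "flat fielded"),
]
print(collect_provenance(cards))
```

The keyword values come out as structured fields, while the HISTORY entries remain an ordered list of free-text strings.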

9
Example FITS header extracts (1)
  • SIMPLE = T / file does
    conform to FITS standard
  • BITPIX = 32 / number of bits
    per data pixel
  • NAXIS = 2 / number of data
    axes
  • NAXIS1 = 648 / length of data
    axis 1
  • NAXIS2 = 648 / length of data
    axis 2
  • EXTEND = T / FITS dataset may
    contain extensions
  • BUNIT = 'Primary Array' / Units of the
    image
  • XPROC0 'evselect table''product/P0059750201PNU
    002PIEVLI0000.FITEVENTS'' w
  • CONTINUE 'ithfilteredsetno filteredset''filtere
    d.fits'' keepfilteroutputno
  • CONTINUE ' destructyes flagcolumn''EVFLAG''
    flagbit-1 filtertype''expres
  • CONTINUE 'sion'' expression''GTI(intermediate/Gl
    obalHK-all-1-Attitude_GTI-X0
  • CONTINUE '000000000.fits, TIME)
    GTI(intermediate/pnEvents-epn-1-EPIC_flare
  • CONTINUE '_GTI-U0020000000.fitsSTDGTI, TIME)
    (RAWY12) (PATTERN
  • CONTINUE ' (PI in (20012000) (PI500
    (PI
  • CONTINUE 'ATTERN0)) (FLAG 0x2fa0024)
    0'' dssblock'''' writedssyes
  • CONTINUE ' cleandssno updateexposureyes
    filterexposureyes blockstocopy'''
  • CONTINUE ''' attributestocopy''''
    energycolumn''PHA'' withzcolumnno zcolu

New Keyword
Multi-line entry
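
The multi-line XPROC0 entry appears to use the CONTINUE long-string convention, in which a string value ending in `&` is continued on the following CONTINUE card. The `&` markers are not visible in the extract above, so their presence is an assumption here, as are the function name `join_continued` and the shortened demo strings. A minimal sketch of reassembling such a value:

```python
def join_continued(cards):
    """Merge CONTINUE cards into the preceding string value.

    `cards` is a list of (keyword, string_value) pairs in header order.
    """
    merged = []
    for keyword, value in cards:
        if keyword == "CONTINUE" and merged and merged[-1][1].endswith("&"):
            prev_key, prev_value = merged[-1]
            # Drop the '&' continuation marker and append the next fragment.
            merged[-1] = (prev_key, prev_value[:-1] + value)
        else:
            merged.append((keyword, value))
    return merged


# Shortened, illustrative fragments rather than the full header value.
cards = [
    ("XPROC0", "evselect w&"),
    ("CONTINUE", "ithfilteredset&"),
    ("CONTINUE", " keepfilteroutput"),
    ("NAXIS", "2"),
]
print(join_continued(cards))
```

Even once reassembled, the value is a command line specific to one pipeline, so interpreting it still requires knowledge of that software.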
10
Example FITS header extracts (2)
End of header entries generated at telescope
  • XTENSION = 'IMAGE ' / Image extension
  • BITPIX = 16 / Bits per pixel
  • NAXIS = 2 / Number of axes
  • HISTORY This is the end of the header written by
    the ING observing-system.
  • WAT0_001 = 'system=image'
  • WAT1_001 = 'wtype=zpx axtype=ra projp1=1.0
    projp3=220.0'
  • WAT2_001 = 'wtype=zpx axtype=dec projp1=1.0
    projp3=220.0'
  • TRIM = 'Sep 2 16:14 Trim data section is
    [51:2098,1:4100]'
  • BP-FLAG = 'Sep 2 16:14 Bad pixel file is
    /home/jrl/wfcred/stds/A5506-4.bad'
  • BT-FLAG = 'Sep 2 16:14 Overscan section is
    [1:50,1:4128] with mean=1514.871'
  • BI-FLAG = 'Sep 2 16:14 Zero level correction
    image is /data/cass03a/was/mframe
  • FF-FLAG = 'Sep 2 16:14 Flat field image is
    /data/cass03d/was/mframes/r_9362689
  • ILLUMCOR = 'Sep 2 16:14 Illumination image is
    tmpill.pl with scale=0.9655418'

Keywords describing data reduction process
11
Example FITS header extracts (3)
House-keeping provenance metadata
  • SIMPLE = T / file does
    conform to FITS standard
  • BITPIX = 16 / number of bits
    per data pixel
  • NHKLINES = 146 / Number of lines
    from house-keeping file
  • HKLIN001 = 'JOB.JOBNO UKJ349' /
  • HKLIN002 = 'JOB.DATE-MES 19980929' /
  • HISTORY 'SuperCOSMOS image analysis and mapping
    mode (IAM and MM)' /
  • HISTORY 'data written by xydcomp_ss.' /
  • HISTORY 'Any questions/comments/suggestions/bug
    reports should be sent' /
  • HISTORY 'to N.Hambly_at_roe.ac.uk' /

12
FITS provenance - summary
  • Header keywords designed for recording provenance
    information, esp. HISTORY
  • HISTORY cards written in free text: not readily
    machine-interpretable
  • Project-specific provenance keywords: not readily
    interpretable at all outside project

13
Provenance in database federation
  • Sky survey databases in many wavebands
  • New science from federating them
  • Need to associate entries in different DBs
  • Unified Column Descriptors (UCDs)
  • Taxonomy based on collation of column names from
    hundreds of databases
  • Location on sky provides natural indexing

14
Matching by proximity not always adequate
Need to know more about astrophysical properties
of two populations to know which of the red
objects is the most likely counterpart to the
cyan source
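
The ambiguity described on this slide can be sketched with a simple positional cross-match. The example below computes great-circle separations with the haversine formula and returns every catalogue object within a search radius instead of just the nearest one. The positions, the 5 arcsec radius and the function names are invented for illustration.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def candidates_within(source, catalogue, radius_deg):
    """Return (separation, name) for every catalogue entry within the radius."""
    ra, dec = source
    found = []
    for name, cra, cdec in catalogue:
        sep = angular_separation(ra, dec, cra, cdec)
        if sep <= radius_deg:
            found.append((sep, name))
    return sorted(found)


# Made-up positions: two objects lie almost equally close to the source.
source = (150.0, 2.0)
catalogue = [("A", 150.0, 2.0010), ("B", 150.0, 1.9989), ("C", 151.0, 2.0)]
matches = candidates_within(source, catalogue, radius_deg=5 / 3600)
print([name for _, name in matches])
```

When two candidates fall inside the radius at nearly the same separation, position alone cannot decide between them, which is where knowledge of the two populations has to come in.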
15
Recording association provenance
  • Might want to record associations in DBs
  • Users want to know whether to trust them
  • Complex probabilistic association algorithms
  • Difficult to describe easily
  • Associations may change in light of new data
  • Can users challenge them via annotation?
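
One possible shape for a recorded association, sketched under the assumption that each link stores the algorithm that produced it, a version, a match probability and any user annotations. The schema, field names, identifiers and the "likelihood-ratio" label are hypothetical placeholders, not something the presentation specifies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Association:
    """A recorded link between entries in two databases (hypothetical schema)."""
    source_id: str
    counterpart_id: str
    algorithm: str          # name of the association method
    algorithm_version: str  # lets a later re-run be told apart
    probability: float      # the algorithm's own confidence in the match
    annotations: List[str] = field(default_factory=list)

    def challenge(self, user, comment):
        """Attach a user annotation without altering the original record."""
        self.annotations.append(f"{user}: {comment}")


link = Association("radio:123", "optical:456", "likelihood-ratio", "0.1", 0.72)
link.challenge("a_user", "brighter neighbour looks more plausible")
print(link.probability, len(link.annotations))
```

Keeping annotations separate from the algorithm's output means that a challenge never overwrites the original association, and a re-run with new data can be stored as a new record with a different version.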

16
Summary
  • Astronomers record lots of provenance info
  • Want machine-interpretability
  • Some astronomical provenance is complex
  • Want means of describing algorithms
  • Starting to get links between databases and
    online copies of scientific papers
  • No culture of annotation by users - yet
