Learn more at: http://www.us-vo.org
1
US NATIONAL VIRTUAL OBSERVATORY
Data Access Layer
Doug Tody (NRAO)
2
Data Access Layer
  • What does it do?
  • Provides access to data
  • data discovery
  • mediation to a standard model
  • data retrieval
  • on-demand data generation
  • server-side computation (subsetting, filtering)
  • What is it for?
  • Supports client data analysis
  • distributed, multiwavelength
  • How does it work?
  • Object (dataset) oriented
  • catalog, image, spectrum, time series, SED, etc.
  • Services
  • cone search (also SkyNode), SIA, SSA

3
Cone Search
4
Cone Search
  • Provides basic catalog access
  • Query by position and aperture (cone in space)
  • Query consists of base-URL (service endpoint)
    plus parameters
  • e.g., http://base-url?RA=12.0&DEC=0.0&SR=1.0
  • Catalog returned as a VOTable
  • Advantages
  • Simple but powerful, provides standard interface
  • Easy to implement and use
  • Limitations
  • Catalog metadata is not defined
  • No data model support
  • Future
  • Supplanted by basic SkyNode (Greene, Saturday)
  • Supports metadata discovery, SQL-like syntactical
    queries
  • We will continue to support the basic cone search
    query however!
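
A minimal sketch of assembling the cone search query described above, in Python. The endpoint `http://example.org/cone` is a placeholder and the helper name `cone_search_url` is hypothetical; only the RA, DEC, and SR parameter names come from the slide, and SR is assumed to be the cone radius in decimal degrees.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra, dec, sr):
    """Build a cone search URL from a service endpoint and a cone.

    ra, dec: cone center in decimal degrees; sr: search radius in degrees.
    """
    if not -90.0 <= dec <= 90.0:
        raise ValueError("DEC must lie in [-90, 90] degrees")
    if sr < 0:
        raise ValueError("SR must be non-negative")
    # Append with '&' if the endpoint already carries a query string.
    sep = "&" if "?" in base_url else "?"
    return base_url + sep + urlencode({"RA": ra, "DEC": dec, "SR": sr})


if __name__ == "__main__":
    print(cone_search_url("http://example.org/cone", 12.0, 0.0, 1.0))
```

The response to such a request would be a VOTable, which the client parses separately.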

5
Simple Image Access
6
Simple Image Access (SIA)
  • Basic Usage, Highest Level
  • Client queries Registry to find interesting
    services
  • Each service is queried (in turn or
    simultaneously) for data
  • Client collates and analyzes results
  • Selected datasets are retrieved
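
The "in turn or simultaneously" step could be sketched as below, assuming the client already has a list of service endpoints from the Registry. The function `query_services` and the `run_query` callback are hypothetical names; the callback stands in for whatever code issues one query and parses its response.

```python
from concurrent.futures import ThreadPoolExecutor


def query_services(endpoints, run_query, max_workers=4):
    """Query several services concurrently and collate the outcomes.

    Returns {endpoint: result}, where result is either whatever
    run_query(endpoint) returned or the exception it raised, so one
    failing service does not discard the others' answers.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {ep: pool.submit(run_query, ep) for ep in endpoints}
        for ep, fut in futures.items():
            try:
                results[ep] = fut.result()
            except Exception as exc:  # keep the error for later inspection
                results[ep] = exc
    return results
```

Setting `max_workers=1` gives the "in turn" behaviour with the same interface.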

7
Simple Image Access (SIA)
  • Basic Usage, Single Service
  • Query
  • find data of interest from a single service
  • http://base-url?POS=12.0,0.0&SIZE=0.2&FORMAT=image/fits
  • Query response
  • VOTable, one row per candidate dataset
  • "access reference" (a URL) points to data
  • Data selection
  • Performed by the client using query response
    metadata
  • Dataset retrieval
  • Retrieve actual datasets, if any
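
As an illustration of the single-service query, the sketch below builds the request URL from POS, SIZE, and FORMAT. The endpoint and the helper name `sia_query_url` are assumptions made for the example; the parameter names are those on the slide.

```python
from urllib.parse import urlencode


def sia_query_url(base_url, ra, dec, size, fmt="image/fits"):
    """Build an SIA query URL for a region of interest.

    size is either one number (width) or a (width, height) pair, in degrees.
    """
    if isinstance(size, (int, float)):
        size_value = str(size)
    else:
        size_value = ",".join(str(s) for s in size)
    params = {"POS": f"{ra},{dec}", "SIZE": size_value, "FORMAT": fmt}
    sep = "&" if "?" in base_url else "?"
    # Keep ',' and '/' readable instead of percent-encoding them.
    return base_url + sep + urlencode(params, safe=",/")


if __name__ == "__main__":
    print(sia_query_url("http://example.org/sia", 12.0, 0.0, 0.2))
```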

8
Service Capabilities
  • Types of Services
  • Atlas: Precomputed survey image (entire image)
  • Pointed: Image from pointed observation (entire image)
  • Cutout: Cutout of existing image (pixels unchanged)
  • Mosaic: Reprojected image (pixels resampled)
  • Virtual Data
  • Data model mediation
  • Subsetting, filtering, etc. on the fly
  • Possible to view same data in different ways
  • Interface
  • RESTful interface currently (HTTP GET)
  • Document oriented (VOTable, FITS, JPEG, etc.)

9
Data Model
  • SIA data model is the familiar "astronomical
    image"
  • Generally this means a 2D sky projection
  • Data array is logically a regular grid of pixels
  • Encoded as a FITS image, GIF/JPEG, etc.
  • Standardized dataset metadata
  • Provenance
  • Image geometry
  • Scale
  • Format
  • Position, WCS
  • Time of observation
  • Spectral bandpass
  • Access information

10
Input Parameters
  • Required parameters
  • POS: center of ROI (ra, dec in decimal degrees, ICRS)
  • SIZE: width, or width, height
  • FORMAT: ALL, GRAPHIC, image/fits, image/jpeg, text/html, ...
  • Optional parameters
  • INTERSECT: values are covers, enclosed, center, overlaps
  • VERB: table verbosity
  • Service-defined parameters
  • used to further refine queries, but not yet
    standardized
  • e.g., BAND, SURVEY, etc.
  • Image generation parameters
  • NAXIS, CFRAME, EQUINOX, CRPIX, CRVAL, CDELT,
    ROTANG, PROJ
  • used for cutout/mosaic services to specify image
    to be generated
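
The slide only names the INTERSECT values. The sketch below shows one way a service might evaluate them, under the reading that COVERS means the image contains the whole ROI, ENCLOSED means the image lies inside the ROI, CENTER means the image contains the ROI center, and OVERLAPS means the two share any part. It uses a flat-sky approximation with axis-aligned boxes (no RA wrap-around, no cos(dec) correction), and the `Rect` class and `intersects` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned box: center (ra, dec) and full width/height, in degrees."""
    ra: float
    dec: float
    width: float
    height: float

    @property
    def bounds(self):
        return (self.ra - self.width / 2, self.ra + self.width / 2,
                self.dec - self.height / 2, self.dec + self.height / 2)


def intersects(image, roi, mode="OVERLAPS"):
    """Return True if the candidate image satisfies the INTERSECT mode."""
    i_lo_ra, i_hi_ra, i_lo_dec, i_hi_dec = image.bounds
    r_lo_ra, r_hi_ra, r_lo_dec, r_hi_dec = roi.bounds
    mode = mode.upper()
    if mode == "COVERS":      # image contains the whole ROI
        return (i_lo_ra <= r_lo_ra and i_hi_ra >= r_hi_ra and
                i_lo_dec <= r_lo_dec and i_hi_dec >= r_hi_dec)
    if mode == "ENCLOSED":    # image lies entirely inside the ROI
        return (r_lo_ra <= i_lo_ra and r_hi_ra >= i_hi_ra and
                r_lo_dec <= i_lo_dec and r_hi_dec >= i_hi_dec)
    if mode == "CENTER":      # image contains the ROI center
        return (i_lo_ra <= roi.ra <= i_hi_ra and
                i_lo_dec <= roi.dec <= i_hi_dec)
    if mode == "OVERLAPS":    # boxes touch or share area
        return (i_lo_ra <= r_hi_ra and r_lo_ra <= i_hi_ra and
                i_lo_dec <= r_hi_dec and r_lo_dec <= i_hi_dec)
    raise ValueError(f"unknown INTERSECT value: {mode}")
```

A production service would need proper spherical geometry; the box test is only meant to make the four modes concrete.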

11
Query Response
  • Output is a VOTable
  • Must contain a RESOURCE element tagged
    type="results", containing the results of the
    query.
  • The results resource contains a single table
  • Each row of the table describes a single data
    object which can be retrieved.
  • The fields of the table describe the attributes
    of the dataset
  • These are the attributes of the SIA data model
  • In SIA 1.0, the UCD is used to identify the data
    model attribute
  • e.g., POS_EQ_RA_MAIN, VOX:Image_Scale, etc.
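
A rough sketch of reading such a query response with the Python standard library follows. The sample document is invented for illustration (titles and URLs are placeholders), and the UCD strings are written in their SIA 1.0 form with the `VOX:` prefix. The parser keys each cell by the UCD of its FIELD, mirroring the statement above that the UCD identifies the data model attribute.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0"?>
<VOTABLE version="1.0">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" datatype="char" arraysize="*" ucd="VOX:Image_Title"/>
      <FIELD name="ra" datatype="double" ucd="POS_EQ_RA_MAIN"/>
      <FIELD name="dec" datatype="double" ucd="POS_EQ_DEC_MAIN"/>
      <FIELD name="url" datatype="char" arraysize="*" ucd="VOX:Image_AccessReference"/>
      <DATA><TABLEDATA>
        <TR><TD>Survey tile A</TD><TD>12.0</TD><TD>0.0</TD><TD>http://example.org/a.fits</TD></TR>
        <TR><TD>Survey tile B</TD><TD>12.1</TD><TD>0.1</TD><TD>http://example.org/b.fits</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def local(tag):
    """Strip an XML namespace, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def parse_sia_response(xml_text):
    """Return the results table as a list of {ucd: cell text} dicts."""
    root = ET.fromstring(xml_text)
    for res in root.iter():
        if local(res.tag) == "RESOURCE" and res.get("type") == "results":
            break
    else:
        raise ValueError("no RESOURCE with type='results'")
    ucds = [f.get("ucd") for f in res.iter() if local(f.tag) == "FIELD"]
    rows = []
    for tr in res.iter():
        if local(tr.tag) == "TR":
            cells = [td.text for td in tr if local(td.tag) == "TD"]
            rows.append(dict(zip(ucds, cells)))
    return rows
```

Cell values are left as strings here; a real client would convert them using each FIELD's datatype.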

12
Query Response
  • Image metadata
  • Describes the image object (required)
  • Coordinate system metadata
  • Image WCS
  • Spectral bandpass metadata
  • Prototype data model describing spectral bandpass
    of image
  • Processing metadata
  • Tells whether the service modified the image data
  • Access metadata
  • Tells client how to access the dataset (required)
  • Resource-specific metadata
  • Additional optional service-defined metadata
    describing image

13
Image Metadata
  • VOX:Image_Title: Brief description of image
  • POS_EQ_RA_MAIN: RA (ICRS)
  • POS_EQ_DEC_MAIN: Dec (ICRS)
  • INST_ID: Instrument name
  • VOX:Image_MJDateObs: MJD of observation
  • VOX:Image_Naxes: Number of image axes
  • VOX:Image_Naxis: Length of each axis
  • VOX:Image_Scale: Image scale, deg/pix
  • VOX:Image_Format: Image file format

15
Image Retrieval
  • Completely optional
  • Typically only a fraction of the available images
    are retrieved
  • Query response
  • If an access reference is provided, the data can
    be retrieved
  • SIAP can also be used to describe data which is
    not online
  • The same data may be available in multiple
    formats
  • Image retrieval
  • Very simple: the access reference is a URL
  • Standard tools can be used to fetch the data
  • (browser, wget, curl, i/o library, etc.)
  • Data is often computed on-the-fly
  • All retrieval is synchronous (currently)
  • No provision for restricting access (currently)
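
Since the access reference is a plain URL, retrieval can be sketched with the standard library alone. In the example below the row keys `format` and `access_ref` are hypothetical stand-ins for the format and access reference columns of a parsed query response, and the preference order is just an example.

```python
import shutil
import urllib.request


def select_by_format(rows, preferred=("image/fits", "image/jpeg")):
    """Pick rows in the first preferred format that has an access reference.

    Rows without an access reference (data that is not online) are skipped.
    """
    for fmt in preferred:
        hits = [r for r in rows
                if r.get("format") == fmt and r.get("access_ref")]
        if hits:
            return hits
    return []


def fetch(access_ref, dest_path, timeout=60):
    """Synchronously download one dataset to dest_path."""
    with urllib.request.urlopen(access_ref, timeout=timeout) as resp, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

Because the data is often computed on the fly, a generous timeout is probably sensible for cutout and mosaic services.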

16
Service Registration

17
Future Development
  • SIA V1.1
  • Based on work done on SSA
  • Expanded query interface
  • no longer limited to positional queries
  • Much richer query response
  • generic dataset identification, characterization,
    etc.
  • metadata extension mechanism
  • Selected features
  • VOTable 1.1 with UCD1+, GROUP, UTYPE
  • query response can be ordered by "score"
  • logical groupings of related query records
  • compression support
  • Versioning
  • required to make protocol upgrades manageable

20
Future Development
  • Service verification
  • for testing at development time
  • when registered: level-of-compliance metric
  • Grid capabilities
  • Data staging
  • asynchronous image generation (long running jobs)
  • batch generation of images (multiple images)
  • Data management
  • support for single sign-on authentication,
    authorization
  • network data caching, third party delivery
    (VOStore etc.)
  • Web service interface
  • resource metadata
  • service availability (etc.)
  • ADQL integration
  • Capability to use query language for queries

21
Simple Spectral Access
22
Simple Spectral Access (SSA)
  • What is it?
  • Provides access to 1D spectra, time series, SEDs
  • Tabular spectrophotometric data (photometry
    points)
  • Represents second generation, data model-based
    DAL interfaces
  • Status
  • Draft V0.9 query interface reviewed in Kyoto (May
    05)
  • Revisions in progress; draft PR targeted for
    Madrid (Oct 05)
  • Much work done on data models; however, they are
    still being revised
  • Some initial prototypes already exist (services,
    client apps)
  • IVOA/Madrid discussions will be held immediately
    after the ADASS and are open to all

23
Basic Usage
  • SSA specification may be complex, but basic usage
    is simple
  • Simple query
  • POS, SIZE, FORMAT - like cone search, SIA
  • Possibly refined by spectral or time bandpass,
    etc.
  • Most metadata in query response is optional
  • Data retrieval
  • Simple retrieval is again URL-based
  • Get back a dataset "document" (VOTable, FITS,
    JPEG, etc.)
  • In simplest case could be wavelength, flux as
    text (for Spectrum)
  • Pass-through of external data is permitted
  • Data Analysis
  • Standard data model isolates application from
    quirks of external project data
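
For the simplest case mentioned above (wavelength, flux as text), a client-side reader might look like the sketch below. The layout is an assumption: two comma-separated columns, with blank lines and lines starting with `#` ignored. The function name `read_text_spectrum` is hypothetical.

```python
import csv
import io


def read_text_spectrum(text):
    """Parse 'wavelength, flux' text into two lists of floats."""
    wavelengths, fluxes = [], []
    # Skip blank lines and '#' comment lines before handing rows to csv.
    lines = (ln for ln in io.StringIO(text)
             if ln.strip() and not ln.lstrip().startswith("#"))
    for row in csv.reader(lines):
        wavelengths.append(float(row[0]))
        fluxes.append(float(row[1]))
    return wavelengths, fluxes
```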

24
Concepts - Dataset-oriented
  • Data object type
  • Spectrum, TimeSeries, SED
  • Dataset creation type
  • Atlas: Whole datasets, uniform survey data
  • Pointed: Whole datasets, variable
    instrumental data
  • Cutout: Subset, data samples are not
    modified
  • Resampled: Subset, data samples computed by
    service
  • Dataset derivation
  • Observed: An observation
  • Composite: Combination of several
    observations
  • Simulated: Simulated observation made from
    real data
  • Synthetic: Data from a theoretical model

25
Data Models
  • Data models used in SSA
  • Spectral data: Spectrum, TimeSeries, SED
  • Dataset: Generic dataset descriptor
  • Target: Astronomical target observed
  • Curation: Origin of data
  • Characterization: Physical characteristics of data
  • Provenance: Instrument which generated the data
  • User defined data models
  • Metadata extension mechanisms
  • additional data model attributes (table fields)
  • additional resources in VOTable, linked back to
    main table
  • Provide a mechanism to "subclass" dataset to
    tailor it for a given data collection

26
Spectral Data (SED)
Photometry point
spectrum segment
27
Spectral/SED Data Model
29
Query Interface
  • Mandatory query parameters
  • POS: RA, DEC (ICRS)
  • SIZE: diameter (decimal degrees)
  • TIME: date1,date2 (epoch in decimal
    years UTC)
  • BAND: wave1,wave2 (meters in vacuum; source or
    observer)
  • FORMAT: VOTable, fits, xml, text, graphics,
    html, external
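
TIME and BAND use units that clients rarely hold natively, so a small conversion layer helps. The sketch below turns calendar dates into decimal years and wavelengths in Angstroms into meters. The helper names are hypothetical, naive `datetime` values are assumed to be UTC, and the output precision is an arbitrary choice; the V0.9 draft may specify details differently.

```python
from datetime import datetime

ANGSTROM_IN_M = 1e-10  # one Angstrom expressed in meters


def decimal_year(dt):
    """Convert a (UTC) datetime to an epoch in decimal years."""
    start = datetime(dt.year, 1, 1)
    end = datetime(dt.year + 1, 1, 1)
    # Fraction of the year elapsed; leap years are handled by 'end - start'.
    return dt.year + (dt - start) / (end - start)


def time_param(start, stop):
    """Format a TIME=date1,date2 value from two datetimes."""
    return f"{decimal_year(start):.4f},{decimal_year(stop):.4f}"


def band_param(wave1_angstrom, wave2_angstrom):
    """Format a BAND=wave1,wave2 value in meters, low wavelength first."""
    lo, hi = sorted((wave1_angstrom * ANGSTROM_IN_M,
                     wave2_angstrom * ANGSTROM_IN_M))
    return f"{lo:.6g},{hi:.6g}"
```

For example, the optical range 4000 to 7000 Angstroms becomes a BAND value spanning 4e-07 to 7e-07 meters.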

30
Query Interface
  • Recommended query parameters
  • APERTURE: approx spatial resolution
    (decimal degrees)
  • SPECRES: spectral resolution (meters)
  • TOP: number of top-ranked records
    to return
  • OBJTYPE: mandatory if service returns
    multiple object types
  • COLLECTION: data collection identifier

31
Query Interface
  • Optional parameters
  • CREATORID: creator-assigned dataset
    identifier (at most 1)
  • PUBID: publisher-assigned dataset identifier (at
    most N)
  • COMPRESS: enable compression (for both data _and_
    queries?)
  • SNR: signal-to-noise ratio
  • REDSHIFT: redshift range (dlambda/lambda)
  • TARGETCLASS: star, galaxy, pulsar, PN, QSO, AGN,
    etc.

32
Query Response
  • Classes of query metadata
  • Query metadata: Describes the query itself
  • Dataset metadata: Describes data object;
    object-specific
  • Target metadata: Astronomical target
  • Curation metadata: External identification
    of dataset
  • Characterization: Coverage, Accuracy,
    Frame, etc.
  • Instrument metadata: Service-defined; hard to
    standardize
  • Access metadata: Describes how to access
    the dataset

33
Query Response
  • Query Metadata
  • Query.Score: How well object matches query
  • Query.LName: Logical name (identifier)
  • Query.LNameKey: Logical name key (id-ref)
  • Example: LName="MyObj123", LNameKey="server,format"

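One possible client-side use of the query metadata is sketched below: records sharing a logical name are grouped together and ranked by score. This assumes that a higher Query.Score means a better match and that records with the same Query.LName describe the same logical object (for instance on different servers or in different formats); neither detail is spelled out on the slide. The function name `group_by_lname` is hypothetical.

```python
from collections import defaultdict


def group_by_lname(records):
    """Group records by Query.LName, best-scoring records and groups first.

    Returns a list of (lname, records) pairs.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["Query.LName"]].append(rec)
    # Within each group, put the best match first.
    for recs in groups.values():
        recs.sort(key=lambda r: r["Query.Score"], reverse=True)
    # Order the groups by the score of their best member.
    return sorted(groups.items(),
                  key=lambda kv: kv[1][0]["Query.Score"], reverse=True)
```
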
34
Query Response
  • Dataset Metadata
  • Dataset.Type: Spectrum, TimeSeries,
    SED, etc.
  • Dataset.DataModel: DM name, e.g.,
    "SSA-V0.90"
  • Dataset.Title: Brief descriptive title
    of dataset
  • Dataset.SSA.NSamples: Total samples in dataset
  • Dataset.SSA.Aperture: Characteristic aperture
    diameter
  • Dataset.SSA.TimeAxis: TimeCoord axis (external
    data)
  • Dataset.SSA.SpectralAxis: SpectralCoord axis
    (external data)
  • Dataset.SSA.FluxAxis: Flux axis (external data)
  • Dataset.CreationType: atlas, pointed, cutout,
    resampled
  • Dataset.Derivation: observed, composite,
    simulated, synthetic

35
Query Response
  • Target Metadata
  • Target.Name: Name of astronomical object
  • Target.Class: Target class (star,
    galaxy, QSO, etc.)
  • Target.SpectralClass: Spectral class (e.g., 'O',
    'B', etc.)
  • Target.Redshift: Nominal redshift for
    object
  • Derived.VarAmpl: Variability amplitude
    (fraction 0-1)
  • Derived.SNR: Observed signal-to-noise
    ratio

36
Query Response
  • Curation Metadata
  • Curation.Collection: Data collection name
    (identifier)
  • Curation.Creator: Creator identity
    (identifier)
  • Curation.CreatorID: Creator-assigned dataset
    identifier
  • Curation.PublisherID: Publisher-assigned
    dataset identifier
  • Curation.Date: Dataset creation date
    (ISO date string)
  • Curation.Version: Dataset version (within
    same ID)

37
Query Response
  • Characterization 1 - Coverage
  • .Location.Spatial: Position (e.g., RA, DEC)
  • .Location.Time: Observation time
    characteristic value
  • .Location.Spectral: Spectral bandpass
    characteristic value
  • .Location.Spectral.BandID: Bandpass ID (band
    or filter name)
  • .Bounds.Spatial: Aperture footprint
    (polygon on sky)
  • .Bounds.Time: Low/High time values
  • .Bounds.Spectral: Low/High spectral values
  • .Bounds.Flux: Limiting flux,
    saturation limit (Jansky)
  • .Fill.Spatial: Spatial sampling filling
    factor (0-1)
  • .Fill.Time: Time sampling filling
    factor (0-1)
  • .Fill.Spectral: Spectral sampling
    filling factor (0-1)
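
The Low/High bounds lend themselves to simple client-side filtering. The sketch below keeps only records whose spectral and time bounds overlap the ranges of interest. The dictionary keys `bounds_spectral` and `bounds_time` are hypothetical stand-ins for the Bounds.Spectral and Bounds.Time attributes, each held as a (low, high) pair in whatever units the response uses.

```python
def overlaps(a, b):
    """True if closed intervals a=(lo, hi) and b=(lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def filter_by_coverage(records, band=None, time=None):
    """Keep records whose coverage bounds overlap the requested ranges.

    A range left as None is not checked.
    """
    keep = []
    for rec in records:
        if band is not None and not overlaps(rec["bounds_spectral"], band):
            continue
        if time is not None and not overlaps(rec["bounds_time"], time):
            continue
        keep.append(rec)
    return keep
```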

38
Query Response
  • Characterization 2 - Accuracy
  • Accuracy.*.Calibrated: uncalibrated, relative,
    absolute
  • Accuracy.*.Resolution: Resolution of measured
    signal
  • Accuracy.*.StatErr: Statistical error
    (measured)
  • Accuracy.*.SysErr: Systematic error
    (estimated)
  • ('*' = Spatial, Time, Spectral, Flux)

39
Query Response
  • Characterization 3 - Reference Frames
  • Frame.Spatial.Type: Coordinate frame (default
    ICRS)
  • Frame.Spatial.Equinox: Coordinate system equinox
    (J2000)
  • Frame.Time.System: Timescale (TT)
  • Frame.Time.SIDim: SI factor and dimension
  • Frame.Spectral.SIDim: SI factor and dimension
  • Frame.Flux.SIDim: SI factor and dimension
  • Frame.Flux.UCD: UCD of flux value (flux
    type)
  • (These apply only to the query response)
  • (SIDim metadata still under construction)

40
Query Response
  • Instrument Metadata
  • Instrument.Name: Instrument name (identifier)
  • Instrument.Exposure: Total exposure time
    (seconds)
  • Instrument.<other>: Service-defined
  • Notes
  • Optional; provided for instrumental data
    collections
  • In general, Collection, Bounds.Time, etc. are
    preferred
  • In general Instrument metadata is service-defined
  • Use Observation model as a starting point

41
Query Response
  • Access Metadata
  • Access.Reference: Data access URL
  • Access.Format: MIME type of returned
    dataset
  • Access.Size: Approximate dataset size
    (bytes)
  • Access.Server: Server endpoint URL
  • Staging support goes here in the future
  • e.g., will dataset access require asynchronous
    staging?
  • estimated cost to construct dataset

42
Service Metadata
  • Usage
  • Describe service type and capabilities
  • Characterize service (data resources served,
    coverage, etc.)
  • Describe interface (optional query parameters)
  • Interface
  • Requires new service metadata query method
  • Returns resource metadata descriptor (XML)
  • Format
  • Registry resource descriptor (XML)

43
Data Retrieval
  • Based on GET as with SIA
  • Variety of formats available
  • Compression supported
  • Data representation
  • Data model defines logical content of data
  • The same data object may be represented in
    various formats
  • Hence we need to specify both the data model, and
    the file format

44
Data Retrieval
  • Data models
  • SSA data model for fully-compliant data
  • Provider-defined data model for external data
  • Data formats
  • VOTable (a container), native XML (direct
    serialization)
  • FITS binary table (another container; uses FITS
    spectral WCS)
  • Text, e.g., CSV
  • Graphics (JPEG etc.)
  • text/html (rendered into browser page)