Data Access Layer Servers - PowerPoint PPT Presentation

Data Access Layer Servers

Learn more at: http://www.us-vo.org
Title: Data Access Layer Servers


1
US National Virtual Observatory
Data Access Layer Servers NVO Summer School,
Aspen Sept. 2006
Doug Tody (NRAO)
2
Data Access Layer (DAL) Services
  • Goals
  • Understand what the DAL services are
  • and what is involved to implement them
  • Agenda
  • Review current and planned DAL services
  • Introduce options/issues faced in implementing
    the DAL services

3
Current and Planned DAL Services
  • Dataset: Generic dataset, complex data
    aggregates and associations (proposed)
  • Cone (SCS): Catalog data (released)
  • SIAP V1.0: Image data (released)
  • SSAP: 1D spectra (near PR; 2nd gen DAL
    prototype)
  • SLAP: Spectral line lists (near PR)
  • STAP: Table/Catalog access (proposed)
  • SSAP follow-on: Spectral Energy Distributions
    (SEDs)
  • SSAP follow-on: Time series
  • SIAP V2.0: Major upgrade - cube data etc.
  • SNAP: Numerical Models / Theory data

4
Major elements of a DAL service
  • Discovery query (queryData)
  • Discover data matching query
  • Access metadata ("headers") for candidate
    datasets
  • Negotiate contract for virtual data generation
  • This is a web/database type operation
  • Data access (getData acref URL)
  • Retrieve selected datasets (URL-based)
  • May be archival data, or virtual data computed on
    the fly
  • In general dataset may be computed, like a CGI
    web page
  • This is numerical/scientific computing
    type operation
  • Interface
  • RESTful: only a parameter-based interface is
    currently available
  • Syntax-based query (ADQL/SQL) will be added as
    an option
  • SOAP will be added, but the RESTful interface
    will be retained

5
Simple Cone Search
  • Summary
  • Simplest possible access to astronomical catalogs
  • By far the most widely implemented VO data
    service
  • Prototypical DAL service
  • Query Parameters
  • RA, DEC: Position on the sky (J2000,
    decimal degrees)
  • SR: Search radius (decimal degrees)
  • VERB: Verbosity (levels 1-3, optional)
  • Query Response
  • VOTable; UCDs describe columns
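A minimal sketch of how a client might compose a Cone Search query from the three required parameters. The base URL is a hypothetical placeholder, and the range checks are an assumption about sensible client-side validation rather than something stated on the slide.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra, dec, sr, verb=None):
    """Build a Cone Search query URL; RA, DEC and SR are in decimal degrees."""
    if not 0.0 <= ra <= 360.0:
        raise ValueError("RA must be within [0, 360] degrees")
    if not -90.0 <= dec <= 90.0:
        raise ValueError("DEC must be within [-90, 90] degrees")
    if sr < 0.0:
        raise ValueError("SR must be non-negative")
    params = [("RA", ra), ("DEC", dec), ("SR", sr)]
    if verb is not None:
        if verb not in (1, 2, 3):
            raise ValueError("VERB must be 1, 2 or 3")
        params.append(("VERB", verb))
    # Append to an existing query string if the base URL already has one.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# Hypothetical service endpoint, used for illustration only.
print(cone_search_url("http://example.org/cone", 180.0, -30.5, 0.1, verb=2))
```

The response to such a request would be a VOTable whose columns are identified by UCDs.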

6
Simple Image Access (SIA V1.0)
  • Summary
  • Uniform access to 2 dimensional images
  • Basically 2-D, but data model and interface are
    more general
  • Same service profile as Cone, but adds getData
  • The query is now used for data discovery rather
    than data access as in Cone; data access is a
    separate operation
  • Prototype for 2nd generation DAL interfaces
  • Data models, multiple output formats, virtual
    data generation, etc.

7
SIA Concepts
  • Types of Services
  • Atlas: Precomputed survey image (entire image)
  • Pointed: Image from pointed observation (entire
    image)
  • Cutout: Cutout of existing image (pixels unchanged)
  • Mosaic: Reprojected image (pixels resampled)
  • Virtual Data
  • Data model mediation
  • Subsetting, filtering, transformation, etc. on
    the fly
  • Possible to view same data in different ways
  • SIA data model is the familiar "astronomical
    image"
  • Generally this means a 2D sky projection, but
    cubes too
  • Data array is logically a regular grid of pixels
  • Encoded as a FITS image, GIF/JPEG, etc.

8
SIA Input Parameters
  • Required parameters
  • POS: center of ROI (ra, dec; decimal degrees, ICRS)
  • SIZE: width, or width, height
  • FORMAT: ALL, GRAPHIC, image/fits, image/jpeg,
    text/html; FORMAT=metadata returns service
    metadata
  • Optional parameters
  • INTERSECT: values are covers, enclosed, center,
    overlaps
  • VERB: table verbosity
  • Service-defined parameters
  • used to further refine queries, but not yet
    standardized
  • e.g., BAND, SURVEY, etc.
  • Image generation parameters
  • NAXIS, CFRAME, EQUINOX, CRPIX, CRVAL, CDELT,
    ROTANG, PROJ
  • used for cutout/mosaic services to specify image
    to be generated
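As an illustration of the parameters above, here is a hedged sketch of a client-side helper that assembles an SIA V1.0 query. The endpoint is hypothetical, and the choice to upper-case INTERSECT and leave commas and slashes unescaped is an assumption made for readability, not a requirement taken from the slide.

```python
from urllib.parse import urlencode

INTERSECT_VALUES = {"COVERS", "ENCLOSED", "CENTER", "OVERLAPS"}


def sia_query_url(base_url, ra, dec, size, fmt="image/fits", intersect=None):
    """Build an SIA query URL from POS, SIZE, FORMAT and optional INTERSECT."""
    # SIZE is either a single width or a (width, height) pair, in degrees.
    if isinstance(size, (int, float)):
        size_value = str(size)
    else:
        width, height = size
        size_value = f"{width},{height}"
    params = [("POS", f"{ra},{dec}"), ("SIZE", size_value), ("FORMAT", fmt)]
    if intersect is not None:
        if intersect.upper() not in INTERSECT_VALUES:
            raise ValueError(f"unsupported INTERSECT value: {intersect}")
        params.append(("INTERSECT", intersect.upper()))
    separator = "&" if "?" in base_url else "?"
    # Keep ',' and '/' literal so POS and MIME types stay readable.
    return base_url + separator + urlencode(params, safe=",/")


# Hypothetical endpoint; FORMAT=metadata would ask for service metadata instead.
print(sia_query_url("http://example.org/sia", 180.0, -30.5, (0.2, 0.1),
                    intersect="overlaps"))
```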

9
SIA Query Response
  • Output is a VOTable
  • Must contain a RESOURCE element with the
    tag type="results", containing the results of
    the query.
  • The results resource contains a single table
  • Each row of the table describes a single data
    object which can be retrieved.
  • The fields of the table describe the attributes
    of the dataset
  • These are the attributes of the SIA data model
  • In SIA 1.0, the UCD is used to identify the data
    model attribute
  • e.g., POS_EQ_RA_MAIN, VOX:Image_Scale, etc.

10
SIA Query Response
  • Image metadata
  • Describes the image object (required)
  • Coordinate system metadata
  • Image WCS
  • Spectral bandpass metadata
  • Prototype data model describing spectral bandpass
    of image
  • Processing metadata
  • Tells whether the service modified the image
    data
  • Access metadata
  • Tells client how to access the dataset
    (required)
  • Resource-specific metadata
  • Additional optional service-defined metadata
    describing image

11
SIA Image Metadata (UCDs)
  • VOX:Image_Title: Brief description of image
  • POS_EQ_RA_MAIN: RA (ICRS)
  • POS_EQ_DEC_MAIN: Dec (ICRS)
  • INST_ID: Instrument name
  • VOX:Image_MJDateObs: MJD of observation
  • VOX:Image_Naxes: Number of image axes
  • VOX:Image_Naxis: Length of each axis
  • VOX:Image_Scale: Image scale, deg/pix
  • VOX:Image_Format: Image file format
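The following sketch shows one way a client could read an SIA V1.0 query response and index each row by UCD. The sample document is invented for illustration and omits the XML namespace that real VOTables usually declare; the VOX:Image_AccessReference column is not in the list above and is included here as the field that carries the access URL.

```python
import xml.etree.ElementTree as ET

# Invented, namespace-free sample response, for illustration only.
SAMPLE = """<VOTABLE version="1.1">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" ucd="VOX:Image_Title" datatype="char" arraysize="*"/>
      <FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
      <FIELD name="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
      <FIELD name="url" ucd="VOX:Image_AccessReference" datatype="char" arraysize="*"/>
      <DATA><TABLEDATA>
        <TR><TD>Example field</TD><TD>180.0</TD><TD>-30.5</TD><TD>http://example.org/img?id=1</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def rows_by_ucd(votable_xml):
    """Return each table row as a dict keyed by the UCD of its column."""
    root = ET.fromstring(votable_xml)
    # The query results live in the RESOURCE tagged type="results".
    resource = next(
        (r for r in root.iter("RESOURCE") if r.get("type") == "results"), None
    )
    if resource is None:
        raise ValueError("no RESOURCE with type='results' found")
    table = resource.find("TABLE")
    ucds = [field.get("ucd") for field in table.findall("FIELD")]
    rows = []
    for tr in table.iter("TR"):
        cells = [td.text or "" for td in tr.findall("TD")]
        rows.append(dict(zip(ucds, cells)))
    return rows


for row in rows_by_ucd(SAMPLE):
    print(row["VOX:Image_Title"], row["POS_EQ_RA_MAIN"], row["POS_EQ_DEC_MAIN"])
```

Cell values are returned as strings; converting them according to each FIELD's datatype is left out of this sketch.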

13
Image Retrieval
  • Retrieval is optional
  • Typically only a fraction of the available images
    are retrieved
  • Based on query response
  • If an access reference is provided, the data can
    be retrieved
  • SIAP can also be used to describe data which is
    not online
  • The same data may be available in multiple
    formats
  • Image retrieval
  • Very simple: the access reference is a URL
  • Standard tools can be used to fetch the data
  • (browser, wget, curl, i/o library, etc.)
  • Data is often computed on-the-fly
  • All retrieval is synchronous (currently)
  • No provision for restricting access (currently)

14
Simple Spectral Access (SSA)
  • Summary
  • Uniform access to 1-D spectra
  • Can also handle spectral aggregates via
    association
  • Support for SEDs and time series will be added
  • First of the 2nd generation DAL interfaces
  • Basic approach does not change (queryData,
    getData)
  • Query interface and metadata are generalized
  • SIA upgrade (etc.) will share the same basic
    interface
  • Includes a standard data model for spectral
    datasets
  • Needed, as there is no standard way to represent
    spectra
  • Standard serializations are defined (VOTable,
    FITS, etc.)
  • Returned data is typically generated on the fly
  • External stored spectra may be in any form

15
SSA Interface Overview
  • Service Operations
  • queryData: Discovery query
  • (getData): URL-based currently, as
    for SIA
  • (stageData): Reserved; used to
    asynchronously stage data
  • getCapabilities: Query service metadata and
    capabilities
  • Complexity
  • Basic usage is quite simple
  • queryData, then examine the VOTable
  • fetch data by access reference URL
  • Basic Spectrum object
  • general metadata ("header")
  • spectral coordinate vector
  • flux vector
  • optional error vector
  • Formats
  • VOTable, FITS, XML, etc.; user or service choice

16
SSA Query Interface
  • Mandatory query parameters
  • POS: X, Y, FRAME (ICRS)
  • SIZE: diameter (decimal degrees)
  • BAND: spectral region (1-2 numbers, or a name)
  • TIME: date1/date2 (ISO8601)
  • FORMAT: VOTable, FITS, XML, text, graphics,
    html, native

17
SSA Query Interface
  • Optional query parameters
  • specres: minimum spectral resolution (L/dL)
  • spatres: minimum spatial resolution (decimal degrees)
  • timeres: minimum time resolution (seconds)
  • SNR: minimum SNR
  • redshift: redshift interval (1-2 decimal values)
  • targetname: target name, e.g., "mars"
  • targetclass: target class, e.g., star, QSO, AGN,
    etc.

18
SSA Query Interface
  • Optional query parameters
  • pubDID: publisherID string
  • creatorDID: creatorID string
  • collection: collection ID (shortName,
    minimum match)
  • top: maximum number of top-ranked entries
    to be returned
  • token: continuation token for multipage
    queries
  • maxrec: maximum records in query
    response
  • mtime: create/modify time in given
    range (ISO8601)
  • runid: passed on to any other services
  • compress: enable compression

19
SSA Query Response
  • Classes of Query Metadata
  • Query: Describes the query itself
  • Association: Logical associations
    (aggregation)
  • Access: Access metadata for data
    retrieval
  • Dataset: General dataset metadata (type
    etc.)
  • DataID: Dataset identification - what is
    it
  • Curation: How data is published and made
    available
  • Target: Astronomical target observed, if
    any
  • Derived: Derived quantities (SNR,
    redshift, etc.)
  • Char.Coverage: Coverage of spatial, spectral,
    time axes
  • Char.Accuracy: Calibration, resolution,
    sampling, errors
  • CoordSys: Coordinate system reference
    frames (STC)

20
SSA Query Response
  • Query Metadata
  • Query.Score: Degree of match to query parameters
  • Query.Token: Step through large query
    response
  • Association Metadata
  • Association.Type: Type of association
  • Association.ID: Instance ID linking
    associated records
  • Association.Key: Unique key identifying each
    member
  • Access Metadata
  • Access.Reference: URL of data product to be
    retrieved
  • Access.ServiceDID: DataID of virtual data
    product
  • Access.Format: MIME type of dataset
  • Access.Size: approximate dataset size
    (bytes)

21
SSA Query Response
  • DataID - Dataset Identification Metadata
  • DataID.Title: One-line description of
    dataset (String)
  • DataID.Collection: Collection name
    (shortName)
  • DataID.Creator: Creator of dataset
    (String)
  • DataID.CreatorID: Identifier for VO Creator
    (URI)
  • DataID.CreatorDID: Dataset ID assigned by
    creator (URI)
  • DataID.CreatorLogo: URL for Creator logo (URI)
  • DataID.Contributor: Contributor (may be
    multiple instances)
  • DataID.Date: Date last modified (ISO
    Date string)
  • DataID.Version: Version of dataset
    instance (String)
  • DataID.Instrument: Instrument description
    (String)
  • DataID.Bandpass: Spectral bandpass, e.g.,
    filter (String)
  • DataID.DataSource: Original source of data
    (String)
  • DataID.CreationType: How was dataset created
    (String)

22
Some SSA Concepts
  • DataSource
  • survey, pointed, theory, artificial
  • CreationType
  • native, archival, cutout, filtered, mosaic,
    projection, spectral extraction, catalog
    extraction, etc.
  • Provenance
  • Where did this data come from?
  • especially important for virtual data generated
    by service
  • DataID (Collection, CreatorDID, etc.) refers to
    original data
  • Curation (PublisherDID etc.) refers to data from
    service
  • CreationType indicates how the data was derived

23
Some SSA Concepts
  • Associations
  • Use association metadata to link related records
    (datasets)
  • An association is a complex dataset
  • Data Models
  • Data models formalize the content of data or
    metadata
  • Container/component architecture
  • Component data models aggregated in a container
    and associated logically (similar to a relational
    database)
  • Dataset, Spectrum, Characterization, STC, etc.
  • Characterization
  • Physically characterize the data
  • Spatial, spectral, and temporal axes
  • Coverage, sampling, resolution, accuracy
  • Applies to any dataset (not specific to spectra)

25
SIA Upgrade Preview (SIA V2.0)
  • Main objectives
  • Upgrade metadata, query interface as for SSA
  • standard generic dataset metadata
  • more powerful query interface
  • more comprehensive output metadata
  • Precision image data access enhancements
  • e.g., cube data, image slicing, projection,
    filtering
  • (TBD whether this is folded into basic SIA or
    done as a separate service class)
  • Advanced service capabilities
  • versioning, metadata query
  • asynchronous data staging, authentication,
    VOStore integration

26
Cube Data
  • Overview
  • Motivated primarily by radio data surveys (CGPS,
    Arecibo)
  • Many O/IR integral field unit (IFU) instruments
    coming online as well
  • Challenge: datasets can be both large and
    complex
  • Large datasets
  • Current data cubes are several hundred MB up to
    several GB
  • Future wide-field, wide-band: 2048x2048x8192x4 =
    128 GB
  • With polarization, multiple bands, could have 1/2
    TB datasets!
  • Complex datasets
  • e.g., CGPS HI cube, CO cube, continuum, IQUV,
    IRAS same field
  • Multiple ways to view the same data
  • Multi-band surveys are a simpler example of this
    trend
  • Use-Cases for recent study
  • CGPS, SGPS, GALFA (Arecibo), SINFONI (ESO IFU)
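The size estimate quoted for future cubes can be checked with a few lines of arithmetic. This sketch assumes the trailing "x4" in 2048x2048x8192x4 means 4 bytes per pixel and that "GB" is counted in units of 2^30 bytes; both are interpretations, not statements from the slide. Multiplying by four polarization products (again an assumption) reproduces the half-terabyte figure.

```python
import math


def cube_size_bytes(shape, bytes_per_pixel):
    """Storage needed for a regular pixel grid with the given axis lengths."""
    return math.prod(shape) * bytes_per_pixel


GIB = 2 ** 30

single = cube_size_bytes((2048, 2048, 8192), bytes_per_pixel=4)
print(single / GIB)       # size of one cube in units of 2**30 bytes

# Assumed: four polarization products (e.g. I, Q, U, V) of the same shape.
with_polarization = 4 * single
print(with_polarization / GIB)
```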

28
Cube Data
  • Data access considerations
  • Network download of large cubes can be
    impractical
  • VO-style virtual data access to remote data is
    required
  • subsetting, filtering (spectral or time regions),
    transformations (projections, spectrum
    extraction)
  • Strategy: iteratively download data subset,
    visualize locally
  • Typical access modes
  • Whole image
  • Spectrum extraction
  • Cutout 2D planes
  • Cutout 3D sub-cube (permits local full 3D
    analysis)
  • 2D projection along one axis
  • 3D projection (general 3D transformation)
  • 2D slice through 3D cube at arbitrary 3D
    position and orientation

29
Cube Data
  • Typical access scenario
  • Discovery query to discover data, get access
    metadata
  • Access query to set up virtual data access (WCS
    based)
  • Data access, dynamically generating virtual data
  • Repeat for a different region or view
  • Example: Compute 2D projection with spectral
    filtering
  • View 2D preview or projection, e.g., continuum
  • Extract 1D spectra in sky regions (SSA with
    synthetic aperture)
  • Analyze sky spectrum to determine night sky lines
    (SLAP)
  • Compute 2D projection of cube excluding sky
    emission, absorption
  • Other examples
  • Extract 3D sub-cube for full 3D analysis locally
  • 2D slice at arbitrary position and orientation

30
Cube Examples
  • Extract 2-D plane from cube, same orientation
  • queryData
  • PubDID=<desired cube dataset>
  • POS=<center of 2-D plane>
  • SIZE=<spatial extent of 2-D plane>
  • (cutout of smaller region also possible here)
  • BAND=<spectral-coord of desired plane>
  • NAXES=2
  • FORMAT=FITS

31
Cube Examples
  • 2-D Projection with spectral filtering
  • queryData
  • PubDID=<desired cube dataset>
  • POS=<center of 2-D plane>
  • SIZE=<spatial extent of 2-D plane>
  • (cutout of smaller region also possible here)
  • BAND=<range-list of good spectral regions>
  • NAXES=2
  • FORMAT=FITS
  • (in SINFONI case original cube is in Euro-3D
    format)

32
Cube Examples
  • Extract 3-D Sub-Cube
  • queryData
  • PubDID=<desired cube dataset>
  • POS=<spatial center of region>
  • SIZE=<spatial extent of sub-cube>
  • BAND=3.45E-7/8.76E-6
  • NAXES=3
  • FORMAT=FITS
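On the service side, a BAND value such as 3.45E-7/8.76E-6 has to be turned into numeric limits before it can be used to select planes of the cube. A minimal sketch, assuming "/" separates the lower and upper limits as in the example above; named bands and range lists, which the interface also allows, are not handled here.

```python
def parse_band(value):
    """Parse a numeric BAND value: a single number or a 'low/high' range."""
    parts = value.split("/")
    if len(parts) == 1:
        low = high = float(parts[0])
    elif len(parts) == 2:
        low, high = float(parts[0]), float(parts[1])
    else:
        raise ValueError(f"unsupported BAND value: {value!r}")
    if low > high:
        raise ValueError("BAND lower limit exceeds upper limit")
    return low, high


low, high = parse_band("3.45E-7/8.76E-6")
print(low, high)
```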

33
Implementing DAL Services
  • Overall Process
  • Determine what subclass of service to implement
  • do we return whole files, cutouts, extract
    spectra, etc.?
  • Select service technology
  • Java, dotNet/Mono, Ruby, etc.
  • Implement
  • Reference code or a template would be useful here
  • Test
  • Service verification tools
  • Register
  • As soon as you do this you are online!

34
Cone Search
  • queryData operation
  • SQL select operation on a RDBMS
  • Transform output into VOTable format
  • a VOTable package can be useful here
  • Issues
  • May need to assign UCDs to your catalog fields

35
Simple Image Access
  • queryData operation
  • Select operation on a RDBMS
  • Compute SIA query response metadata
  • Transform output into VOTable format
  • Issues
  • Computing the SIA query response metadata can be
    nontrivial
  • e.g., for a cutout or mosaic
  • don't forget you should return WCS information
  • Metadata generation
  • This is much easier if image metadata is cached
    in DBMS
  • For virtual data must compose access reference
    command

36
Simple Image Access (contd)
  • getData operation
  • Atlas, Pointed
  • only input is an access URL pointing to the file
  • return FITS file
  • Cutout, Mosaic
  • access URL is the command which generates the
    virtual data
  • may require significant, complex computation!
  • getCapabilities
  • For SIA V1.0 this is FORMAT=metadata
  • Tells client service capabilities and any
    optional parameters

37
Implementing DAL Services
  • Web Service Frameworks
  • LAMP - Linux, Apache, MySQL, Python/Perl/PHP etc.
  • Apache Web server, Tomcat, Java servlets
  • dotNET/Mono
  • Microsoft approach: SQL Server, C#
  • Ruby on Rails
  • Trendy new alternative
  • Virtual Data Generation
  • Backend may require significant computation
  • Re-use some science package (IRAF, IDL, AIPS,
    CASA, etc.)
  • Or at least CFITSIO, WCSTOOLS, and other
    libraries
