Data analysis by querying

Provided by: astro72

1
Data analysis by querying
  • and more
  • Lorentz center, Wednesday, 17 November 2005

2
Python queries
  • Seamless integration of database and data
    reduction
  • DB backend independent
  • Search for raw and reduced calibration and
    science objects
  • Attributes satisfying certain criteria
  • Dependencies satisfying certain criteria
  • Sometimes SQL is better suited
  • Grouping
  • Finding objects on which others depend

3
examples
  • f = (Filter.name == '841')[0]
  • m = (MasterFlatFrame.filter == f)
  • ReducedScienceFrame.flat == m[41]
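
The query style on this slide, where comparing a class attribute yields a selection that can be indexed or counted, can be illustrated with operator overloading. This is a minimal in-memory sketch, not the Astro-WISE implementation: the Query and Attribute helpers, the stand-in Filter and MasterFlatFrame classes, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


class Query:
    """Lazily evaluated selection over a list of objects (hypothetical helper)."""

    def __init__(self, objects, predicate):
        self._objects = objects
        self._predicate = predicate

    def _matches(self):
        return [o for o in self._objects if self._predicate(o)]

    def __len__(self):
        return len(self._matches())

    def __getitem__(self, index):
        return self._matches()[index]


class Attribute:
    """Comparing an Attribute builds a Query instead of returning a bool."""

    def __init__(self, objects, name):
        self._objects = objects
        self._name = name

    def __eq__(self, value):
        return Query(self._objects, lambda o: getattr(o, self._name) == value)


@dataclass
class Filter:  # stand-in class, not the real Astro-WISE Filter
    name: str


@dataclass
class MasterFlatFrame:  # stand-in class
    filter: Filter


# Hypothetical sample data standing in for database contents.
filters = [Filter("841"), Filter("843")]
flats = [MasterFlatFrame(filters[0]), MasterFlatFrame(filters[1]),
         MasterFlatFrame(filters[0])]

f = (Attribute(filters, "name") == "841")[0]   # first matching filter
m = Attribute(flats, "filter") == f            # all flats taken with it
```

Nothing is evaluated until the selection is indexed or passed to len(), which is what lets a real backend translate the expression into a database query first.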

4
SQL
  • Oracle 10g
  • Syntax
  • Standards compliant
  • Oracle-specific extras
  • PL/SQL
  • Functions and procedures
  • Methods

5
Astro-Wise
  • Persistent Python objects
  • User-defined TYPEs and REFerences
  • Object TABLEs and VIEWs
  • Python class → User-defined TYPE
  • Python object → Object TABLE
  • Python: len(RawScienceFrame.imstat.stdev < 10.0)
  • SQL: SELECT COUNT(*) FROM AWOPER."RawScienceFrame" T
    WHERE T."imstat"."stdev" < 10.0

6
examples
  • Python: len(RawScienceFrame.imstat.stdev < 10.0)
  • Oracle SQL: SELECT COUNT(*) FROM
    AWOPER."RawScienceFrame" T WHERE
    T."imstat"."stdev" < 10.0
  • SELECT T."imstat"."stdev" x,
    T."flat"."imstat"."stdev" y FROM
    AWOPER."ReducedScienceFrame" T WHERE
    T."chip"."name" = 'ccd53' AND T."filter"."name" =
    '843'
  • Start with dbview.astro-wise.org
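
The correspondence between the Python expression and the Oracle SQL on this slide can be sketched as a small translator that records attribute access and renders a quoted column path. This is an illustration only, assuming hypothetical Path and Condition helpers; it is not how Astro-WISE generates its SQL. The AWOPER schema name is taken from the slide, and the sketch uses a bind parameter (:p0) in place of the literal value.

```python
class Condition:
    """A WHERE clause plus its bind parameters (hypothetical helper)."""

    def __init__(self, table, where, params):
        self.table = table
        self.where = where
        self.params = params

    def count_sql(self, schema="AWOPER"):
        # Render the counting query for this condition.
        return 'SELECT COUNT(*) FROM %s."%s" T WHERE %s' % (
            schema, self.table, self.where)


class Path:
    """Records attribute access, e.g. Path('RawScienceFrame').imstat.stdev."""

    def __init__(self, table, parts=()):
        self._table = table
        self._parts = tuple(parts)

    def __getattr__(self, name):
        # Only called for names not found normally; extend the column path.
        if name.startswith("_"):
            raise AttributeError(name)
        return Path(self._table, self._parts + (name,))

    def _column(self):
        # Quote every path element, as in T."imstat"."stdev".
        return "T." + ".".join('"%s"' % part for part in self._parts)

    def __lt__(self, value):
        return Condition(self._table, "%s < :p0" % self._column(), [value])


RawScienceFrame = Path("RawScienceFrame")
condition = RawScienceFrame.imstat.stdev < 10.0
sql = condition.count_sql()
```

Keeping the value in condition.params rather than in the SQL text is a design choice of this sketch; the slide shows the literal inline.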

7
Context
  • for projects and MyDB
  • Kapteyn Astronomical Institute, Wednesday, 19
    April 2006

8
Context
  • Handling large amounts of data
  • Work together
  • Work in a consistent way
  • Experiment with source code
  • Know what data and code were used
  • Keep data private
  • Make data public
  • Throw away data

9
Context
  • Organize in Projects, which have:
  • A name and description
  • Users
  • Default privileges
  • Responsible - privileged users
  • New objects are automatically part of the
    currently active context
  • Tools to safely delete or publish data
  • Be aware of dependencies for history tracking
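
The idea that a project carries a name, description, users and default privileges, and that new objects automatically join the active context, can be sketched as follows. The Project and Context classes here are hypothetical stand-ins, not the astro.database.Context implementation, and the numeric privilege values and demo data are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    description: str = ""
    users: set = field(default_factory=set)
    default_privileges: int = 1  # assumed: 1 means private to the creator


class Context:
    """Hypothetical stand-in for an active-project context."""

    def __init__(self):
        self._projects = {}
        self._current = None

    def add_project(self, project):
        self._projects[project.name] = project

    def set_project(self, name):
        # Raises KeyError for unknown projects instead of creating them.
        self._current = self._projects[name]

    def get_current_project(self):
        return self._current

    def new_object(self, payload):
        # New objects inherit project and default privileges from the context.
        if self._current is None:
            raise RuntimeError("no active project")
        return {
            "payload": payload,
            "project": self._current.name,
            "privileges": self._current.default_privileges,
        }


# Demo values are made up for illustration.
context = Context()
context.add_project(Project("KIDS", "demo project", {"AWMJACKSON"}, 2))
context.set_project("KIDS")
frame = context.new_object("reduced frame")
```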

10
Context
  • Confusing
  • Setting a project can make more data visible to a
    user
  • Setting an instrument can make less data visible
    to a user
  • Queries are more suited for some uses of context
  • Make it easy to query on user and project from awe

11
Setting and getting projects
  • from astro.database.Context import context
  • context.set_project('WORKSHOP2005')
  • context.set_project('OCAM ILT')
  • context.get_projects()
  • context.get_current_project().name
  • CALL AWOPER.AWSECURITY.SET_PROJECT('OCAM ILT')

12
Migrating MyDB to other Project
  • Everything
  • Toolbox/dbmigrateall.py AWMJACKSON KIDS
  • Single object, including dependencies
  • sl = SourceList.name.like('TEST')[0]
  • context.migrate(sl, 'VESUVIO')
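
Migrating a single object "including dependencies" amounts to walking the dependency graph and reassigning every object reached. The sketch below shows one way to do that with a depth-first traversal. The function names, the dictionary-based graph and the object labels are assumptions for illustration; the slides do not describe how context.migrate works internally.

```python
def collect_with_dependencies(obj, dependencies):
    """Return obj and everything it depends on, each once, dependencies first."""
    ordered = []
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in dependencies.get(node, ()):
            visit(dep)
        ordered.append(node)

    visit(obj)
    return ordered


def migrate(obj, target_project, dependencies, project_of):
    """Reassign obj and all of its dependencies to target_project."""
    for node in collect_with_dependencies(obj, dependencies):
        project_of[node] = target_project


# Hypothetical dependency graph: each key depends on the listed objects.
dependencies = {
    "sourcelist": ["reduced"],
    "reduced": ["flat", "bias"],
    "flat": ["bias"],
}
project_of = {name: "AWMJACKSON"
              for name in ("sourcelist", "reduced", "flat", "bias", "other")}

order = collect_with_dependencies("sourcelist", dependencies)
migrate("sourcelist", "VESUVIO", dependencies, project_of)
```

Objects that the migrated object does not depend on ("other" in this example) keep their original project.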

13
Deleting MyDB
  • Delete ALL objects that belong to MyDB project
  • Toolbox/dbdeleteall.py AWMJACKSON
  • Includes files, which take up most space
  • No dependency check required

14
Projects
  • Like MyDB, but for more users.
  • Delete all data that is private to KIDS
  • Toolbox/dbdeleteall.py KIDS
  • Making data public, i.e. update privileges
  • context.update_privileges(obj, 3)
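
How raising a privilege level changes who can see an object can be sketched with an ordered set of levels. The slides only show that level 3 is used when making data public; the level names, their exact meaning and the visibility rule below are assumptions for illustration, and update_privileges here is a stand-in function, not the method on the real context object.

```python
from enum import IntEnum


class Privilege(IntEnum):
    # Assumed meanings; the slides do not define the numeric levels.
    MYDB = 1      # visible to the creator only
    PROJECT = 2   # visible to members of the object's project
    PUBLIC = 3    # visible to every user


class StoredObject:
    def __init__(self, creator, project, privileges=Privilege.MYDB):
        self.creator = creator
        self.project = project
        self.privileges = Privilege(privileges)


def is_visible(obj, user, project_members):
    """Decide visibility from the object's privilege level."""
    if obj.privileges >= Privilege.PUBLIC:
        return True
    if obj.privileges == Privilege.PROJECT:
        return user in project_members.get(obj.project, set())
    return user == obj.creator


def update_privileges(obj, level):
    """Stand-in for publishing: set a new privilege level on the object."""
    obj.privileges = Privilege(level)


# Hypothetical users and membership.
project_members = {"KIDS": {"AWMJACKSON", "AWUSER2"}}
obj = StoredObject("AWMJACKSON", "KIDS")

before = is_visible(obj, "AWOUTSIDER", project_members)
update_privileges(obj, 3)
after = is_visible(obj, "AWOUTSIDER", project_members)
```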

15
Federation
  • Goal
  • Distribute data
  • Distribute queries
  • Avoid copying data
  • Available database technology
  • Non-existent
  • Oracle technology
  • Streams
  • Advanced Replication
  • Datapump

16
Federation
  • Short time-scale: this year
  • Datapump, low frequency, 1/day
  • Medium time-scale: Q2 next year
  • Advanced Replication
  • Streams
  • Long time-scale: later than Q2 next year
  • Streams
  • Distributed data and querying

17
Code versions
  • Tag code and objects
  • Difficult to know exactly which code was used to
    calculate a number
  • Set version tags strategically
  • Objects get version from the module in which
    their class is defined
  • Might be sufficient in many cases

18
Finally
  • be creative