Title: New Software for Ensemble Creation in the Spitzer-Space-Telescope Operations Database
1New Software for Ensemble Creation in the
Spitzer-Space-Telescope Operations Database
- Russ Laher and John Rector
- 2004 ADASS XIV Conference
- October 24 - 27, 2004
2Preface
- About one third of the 230 Spitzer
data-processing pipelines require multiple input
images (e.g., calibrations, image co-adds
mosaics) - Motivation is data noise reduction and/or
statistical characterization of the data - Input images are grouped for particular pipeline
processing into what we call ensembles in the
operations database
3Outline
- Powerpoint Presentation
- Introduction
- Background
- Purpose of Talk
- Database storage of ensembles
- Ensemble-creation rules
- Ensemble-creation software
- Conclusions
- Future Work
- URL of long version of paper
- http//spider.ipac.caltech.edu/staff/laher/sirtf/N
ewEnsembleCreation.pdf - Appendices
- A. On-line software tutorial
- B. Spitzer ensemble-creation rules
- C. S/W output, test mode
- D. S/W output, normal mode
4Background
- Spitzer rules for ensemble creation are well
documented and under version control. - Spitzer pipeline-operator Ron Beck created the
first version of a script for executing the
ensemble-creation rules - Rules are hard coded (and therefore hard to
change) - Direct SQL is used for DB access (open/close DB
connection for each access) - New database-design improvements and software
have been developed for increased speed and
flexibility
5Purpose of this Talk
- To acquaint you with SSC methodologies for
creating/storing ensembles, including - Database design
- Ensemble-creation rules
- Debut our new ensemble-creation software
- New database tables and schema changes
- New database stored functions
- Identify general concepts used in
creating/storing ensembles (for application to
other astronomical missions)
6Hierarchy of Spitzer Observations
In cluster mode, there may be multiple
exposures per cluster of observations (clusterPosN
um)
At scheduling time, the pipeline picker assigns
to each DCE a pipeline for initial processing
(initPlScriptId)
7Miscellaneous Considerations
- Ensembles can be created in the database after
the observations are scheduled (it is not
necessary to have received the actual DCEs from
the spacecraft) - Wouldnt it be nice to store with each ensemble
in the database information about the rule
applied in creating it?
8Database Storage of Ensembles
- There are three database tables for storing
information about how (instances of) ensembles
are defined (which DCEs are included and how they
are to be processed) - DCEs are grouped explicitly into DCE sets (via
association of dceIds with an dceSetId) - The type of pipeline ensemble processing to be
done is stored with the ensemble (plScriptId is
assocated with ensId)
9Database Storage of Ensembles (cont.)
- A DCE set is stored with one or more ensembles
(dceSetId is associated with ensId) - An ensemble is characterized in the database by
dceSetId and plScriptId - Two or more ensembles can be associated together
for processing a set of ensembles by creating a
new ensemble with NULL dceSetId and two or more
associations in the ensembleSets database table
10DB Storage of Ensemble Rules
- There are two database tables for storing
ensemble-creation rules - The ensRules database table specifies how DCEs
are to be grouped - The ensPlScripts database table specifies how a
set of DCEs is to be processed (by one or more
different pipelines)
11Database Schema for Ensemble Creation
12Database Stored Functions for Ensemble Creation
Database stored function Return value(s)
getEnsRules() All records
getEnsPlScripts() All records
getReqMode(reqKey) Corresponding reqMode (decoded for instrument name)
deleteAllEnsTempLists() None
getEnsGroupsFrom EnsTempList(ruleId) All records for given ruleId
getEnsSetsFromEnsOf EnsTempList3(ruleId) All records for given ruleId
createEnsembles (ruleId, test) Basic info for all ensembles created or to be created for given ruleId
createEnsembleSets (ruleId, test) Basic info for all ensembles and ensembleSets created or to be created for given ruleId
13Features of ensembleCreation.pl
- Much faster performance is expected because
pre-compiled database stored functions are called - Efficient architecture only a single database
connection is needed - Software complexity is encapsulated in the
database stored functions - Database-table-driven specification of
ensemble-creation rules makes it flexible - On-line tutorial (lists options, switches, sample
command lines) - Useful, thoughtfully-organized diagnostic outputs
- Test mode to verify effect of ensemble-creation
rule, without actually having to create ensembles
in the database - Post-mortem debugging capability via direct SQL
querying of database temporary tables
14 Flow Chart for createEnsembles.pl
15Conclusions
- Increased speed in creating database records for
ensembles is achieved by using database stored
functions - Flexibility in adding/changing ensemble-creation
rules is achieved by storing the rules in the
database - Several small improvements were implemented, as
well (e.g., storing the minimum number of DCEs
with the ensemble-creation rule, storing the
corresponding ruleId with each ensemble in the
database)
16Future Upgrades
- Add new option to execute selected
ensemble-creation rules - Specify comma-separated list of ruleIds
- Application is augmenting existing set of
ensembles - Add new option to create ensembleSets from
existing ensembles - Specify ruleId and ensPlScriptId
- Application is linking together existing
ensembles (e.g., process the data for all reqKeys
in a given 12-hour PAO to flag pixels with latent
images)