Title: Computing
1 Computing
2 Outline
- Context
- Management
- PDR results
- Software Overview / Architecture
- Science testing plans
- AIPS++
- Pipeline
3 ALMA Context
- Timeline: Interim operations 2007.5, Regular operations 2012.0
- Computing is one of 9 Integrated Product Teams
- 35M / 552M (Computing/Total) = 6.3%
- FTE-y ratio 230/1515 = 15%
4 Scope
- Computing
  - Software development
  - Necessary operational computer equipment
  - System/network administration (operations phases)
- Subsystems: proposal preparation, monitoring, dynamic scheduling, equipment control and calibration, correlator control and processing, archiving, automated pipeline processing, offline processing
- Not: embedded software (hardware groups), algorithm development (Science)
- Activities: management, requirements, analysis, common software, software engineering, integration and test
5 Scope (2)
- Scope: observing preparation and support, through archival access to automatically produced (pipeline) images
6 Development Groups
- 50/50 Europe/North America
- 10 institutes, 13 sites!
- Range: HIA/Victoria 0.5 FTE to NRAO/AOC 15 FTE
- Communications difficult, so more formal processes
- Access to world expertise
7 Buy vs. Build
- Want to be parsimonious with development funds
- Want a system sufficiently modern that it will suffice for the construction period and some time thereafter (CORBA, XML, Java, C++, Python)
- Many open tools/frameworks (ACE/TAO, omniORB, GNU tools, etc.)
- After a search, the ALMA Common Software (ACS) code base was adopted from the accelerator community
  - Simplified CORBA framework, useful services (see the sketch after this list)
- Astronomy-domain adopted code bases
  - Calibration, imaging: AIPS++
  - Archive: NGAST
  - Atmosphere: ATM
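To make the "simplified CORBA framework" concrete, here is a minimal sketch of ACS-style component access from Python. PySimpleClient is the ACS Python client class, but the component name and the setPointing() call are hypothetical, invented purely for illustration.

```python
# Minimal sketch of the ACS working model from Python.
# PySimpleClient comes with the ACS Python bindings; the component
# name "CONTROL/Mount1" and setPointing() are hypothetical.
from Acspy.Clients.SimpleClient import PySimpleClient

client = PySimpleClient()
try:
    # ACS hides CORBA naming and activation behind its manager:
    # ask for a component by name, get back a CORBA object reference.
    mount = client.getComponent("CONTROL/Mount1")
    mount.setPointing(180.0, 45.0)  # hypothetical method on the component
finally:
    client.releaseComponent("CONTROL/Mount1")
    client.disconnect()
```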
8 Strategies
- ACS: provide a common technical way of working
- Continuous scientific input through Subsystem Scientists
- Synchronized 6-month releases
  - A common pace for the project
  - Check requirements completion
- Yearly design/planning reviews
  - React to surprises
- Retain some construction budget to allow for discoveries made during the interim operations period
9 Management Planning
- Computing Plan (framework) and 15 subsystem agreements prepared
  - Computing Management Plan approved by JAO
  - Subsystem agreements approved by Computing IPT
- Management model
  - Agreement contracts followed by subsystem scientists (scope) and software management (schedule, cost)
  - Initial PDR, yearly CDRs and release planning
  - Synchronized 6-month releases across all subsystems
  - SSR requirements mapped to releases for progress tracking
10 Management Planning (2)
- Readiness reviews, acceptance tests, and delivery to interim operations in 2007
- Data flow items get a second development period during interim operations
  - Review requirements after first experience with operations
  - Not subject to current agreements; institutional responsibility could shift
11 Preliminary Design Review
- March 18-20, Tucson
- Panel
  - R. Doxsey (Chair) (STScI, Head, HST Mission Office)
  - P. Quinn (ESO, Head of Data Management and Operations Division)
  - N. Radziwill (NRAO, Head of GBT Software Development)
  - J. Richer (Cambridge, ALMA/UK Project Scientist, Science Advisory Committee)
  - D. Sramek (NRAO, ALMA Systems Engineering IPT Lead)
  - S. Wampler (NSO, Software Architect)
  - A. Wootten (NRAO, ALMA/North America Project Scientist)
12 Preliminary Design Review (2)
- Well-prepared PDR, at or ahead of where similar projects have been at this stage
- The ALMA Project needs to develop an understanding of operations, and the Computing IPT needs to fold this into its planning and priorities. This might result in reprioritization of SSR requirements.
- The ALMA project needs to define clearly the steps necessary for getting to, and implementing, Interim Operations, and the consequent requirements for computing support
13 Preliminary Design Review (3)
- The interaction with AIPS++ requires careful management. Operational (Pipeline) requirements on AIPS++ need further development.
- Testing effort seems satisfactory; management will need to follow up to ensure subsystem unit tests are in fact carried out
14 Software Scope
- From the cradle
  - Proposal Preparation
  - Proposal Review
  - Program Preparation
  - Dynamic Scheduling of Programs
  - Observation
  - Calibration & Imaging
  - Data Delivery & Archiving
- Afterlife
  - Archival Research & VO Compliance
15 And it has to look easy
- 1.0-R1: The ALMA software shall offer an easy-to-use interface to any user and should not assume detailed knowledge of millimeter astronomy and of the ALMA hardware.
- 1.0-R4: The general user shall be offered fully supported, standard observing modes to achieve the project goals, expressed in terms of science parameters rather than technical quantities. Observing modes shall allow automatic fine tuning of observing parameters to adapt to small changes in observing conditions.
- Which means that what is simple for the user will be complex for the software developer.
- Architecture should relieve the developer of unnecessary complexity
  - Separation of functional from technical concerns (see the sketch after this list)
- But the expert must be able to exercise full control
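Requirement 1.0-R4's split between science parameters and technical quantities is exactly the functional/technical separation meant here. The sketch below is illustrative only: the class names and the toy conversion rule are assumptions, not the Observing Tool's actual design.

```python
# Illustrative split of science parameters (user-facing) from
# technical quantities (system-facing); names and the toy scaling
# are assumptions, not the actual ALMA observing-mode logic.
from dataclasses import dataclass

@dataclass
class ScienceGoal:                # what the general user expresses
    frequency_ghz: float
    sensitivity_mjy: float        # target rms
    resolution_arcsec: float

@dataclass
class TechnicalSetup:             # what the observing mode derives
    integration_time_s: float
    n_channels: int
    array_config: str

def derive_setup(goal: ScienceGoal) -> TechnicalSetup:
    # A real mode would also fine-tune for current observing
    # conditions; this placeholder shows only the direction of the
    # mapping (science in, technical out).
    return TechnicalSetup(
        integration_time_s=60.0 / goal.sensitivity_mjy**2,  # toy radiometer scaling
        n_channels=5000,
        array_config="compact" if goal.resolution_arcsec > 1.0 else "extended",
    )
```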
16 Observatory tasks
- Administration of projects
- Monitoring and quality control
- Scheduling of maintenance
- Scheduling of personnel
- Security and access control
17 The numbers
- Average/peak data rates of 6/60 Mbyte/s
- Raw (uv) data: ?, image data: ? of the total
- Assumes
  - Baseline correlator, 10 s integration time, 5000 channels
  - Can trade off integration time vs. channels
- Implies 180 Tbyte/y to archive (see the check below)
- Archive access rates could be 5x higher (cf. HST)
- Feedback from calibration to operations
  - 0.5 s from observation to result (pointing, focus, phase noise)
- Science data processing must keep pace (on average) with data acquisition
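A quick back-of-envelope check connects the average rate to the annual archive volume; the roughly 95% duty cycle needed to land exactly on 180 Tbyte/y is an inference, not a figure from the slide.

```python
# Check: average rate -> annual archive volume (slide's numbers).
avg_rate_mbyte_s = 6                    # average science data rate
seconds_per_year = 365.25 * 24 * 3600   # ~3.16e7 s

tbyte_per_year = avg_rate_mbyte_s * seconds_per_year / 1e6
print(f"{tbyte_per_year:.0f} Tbyte/y at 100% duty cycle")   # ~189
print(f"{0.95 * tbyte_per_year:.0f} Tbyte/y at ~95% duty")  # ~180
```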
18 Meta-Requirements
- Standard Observing Modes won't be standard for a long time
  - e.g., OTF mosaics, phase calibration at submm?
- Instrument likely to change & grow
  - Atacama Compact Array (ACA)
  - Second-generation correlator
  - Could drastically increase data rate (possible even with baseline correlator, might be demanded for OTF mosaics)
- Computer hardware will continue to evolve
- Development spread across 2 continents & cultures
19 What do these requirements imply for the architecture of the software?
- Must facilitate development of new observing modes (learning by doing)
- Must allow scaling to new hardware, higher data rates
- Must enable distributed development
  - Modular
  - Standard
  - Encourage doing the same thing either a) in the same way everywhere or b) only once
21 Functional Aspects
- Executive, Archive, and ACS are global; all other subsystems interact with them (a sketch follows this list)
  - ACS common software: a foundational role
  - Executive: start, stop, monitor; an oversight role
  - Archive: object persistence, configuration data, long-term science data; a structural support role
- Instrument operations
  - Control, correlator, quick-look pipeline, calibration pipeline (real-time)
  - Scheduling (near real-time)
- External subsystems
  - Observation preparation and planning
  - Science data reduction pipeline (not necessarily online)
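A minimal sketch of the Executive's oversight role over subsystems sharing one lifecycle interface; the Subsystem protocol and its method names are assumptions for illustration, not the actual ALMA interfaces.

```python
# Sketch of the Executive's start/stop/monitor oversight role.
# The Subsystem protocol and its methods are illustrative only.
from typing import Protocol

class Subsystem(Protocol):
    name: str
    def start(self) -> None: ...
    def stop(self) -> None: ...
    def status(self) -> str: ...

class Executive:
    """Starts, stops, and monitors every subsystem uniformly."""

    def __init__(self, subsystems: list[Subsystem]):
        self.subsystems = subsystems

    def start_all(self) -> None:
        for s in self.subsystems:
            s.start()

    def monitor(self) -> dict[str, str]:
        # One oversight view across Control, Scheduling, Pipeline, ...
        return {s.name: s.status() for s in self.subsystems}
```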
23 Scheduling Block Execution
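The original slide is a figure; the sketch below is one plausible reading of the flow it depicts, based on the dynamic-scheduling and quick-look roles described elsewhere in this deck. Every name here is hypothetical.

```python
# Hypothetical rendering of scheduling-block execution: the dynamic
# scheduler picks the block best suited to current conditions,
# Control executes it, and the quick-look pipeline closes the loop.
from dataclasses import dataclass

@dataclass
class SchedulingBlock:
    sb_id: str
    priority: float
    max_phase_noise: float   # loosest conditions the science tolerates

def pick_block(queue: list[SchedulingBlock], phase_noise: float) -> SchedulingBlock:
    # Dynamic scheduling: highest-priority block whose weather
    # constraints are met right now.
    eligible = [sb for sb in queue if phase_noise <= sb.max_phase_noise]
    return max(eligible, key=lambda sb: sb.priority)

def execute(sb: SchedulingBlock) -> None:
    # Control drives the array; the quick-look pipeline assesses the
    # data promptly so results feed back into the next selection.
    print(f"executing {sb.sb_id}; quick-look verdict feeds the scheduler")
```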
24 ALMA Software User Test Plan Status
- Software test plans being developed by SSR subsystem scientists and subsystem leads.
- Test plan components (a sketch of a test definition follows this list)
  - Use Cases: descriptions of operational modes and what external dependencies exist. Designed to exercise subsystem interfaces, functionality, user interfaces.
  - Test Cases: Use Case subsets designed to test specific functions.
  - Testing timeline (when tests run in relation to Releases, CDRs).
  - Test Definitions: specify which test case will be run, what the test focus is, and whether the test is automated or involves users.
  - Test Reports (e.g., user reports, audit updates, summary).
- Test Plan drafts for all subsystems to be completed by Oct 1.
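To make the component list concrete, here is a sketch of what a test-definition record could carry; the field names are assumptions drawn from the description above, not the actual ALMA test-plan schema.

```python
# Sketch of a test-definition record; field names are assumptions
# based on the components listed above.
from dataclasses import dataclass

@dataclass
class TestDefinition:
    test_case: str      # which test case will be run
    focus: str          # what the test focuses on
    automated: bool     # automated, or involving users
    timeline: str       # when it runs relative to releases/CDRs

example = TestDefinition(
    test_case="OT.UC.SingleFieldSetup",   # from the use-case list later in the deck
    focus="single-field, single-line setup",
    automated=False,                      # first user test, Nov 2003
    timeline="hypothetical: ahead of the next release",
)
```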
25 ALMA Software User Test Plan Status (2)
- Software test plan guidelines
  - June 5, 2003: Test Plan approved by Computing management leads.
  - June 11, 2003: Test Plan presented to SSR (ALMA sitescape, SSR draft documents).
- Use Case development
  - July 9, 2003: Use Case guidelines, HTML templates, and examples put on the web (www.aoc.nrao.edu/dshepher/alma/usecases) and presented to SSR.
  - Aug 2003: Detailed Use Cases being written for all subsystems.
- Test Plan development
  - Sept 2003: Draft test plans to be completed.
  - Oct 1, 2003: Test plan problems identified/reconciled, resources identified.
  - Nov 2003: First user test scheduled (for the Observing Tool subsystem).
26 ALMA Software User Test Plan Status (3)
- Use Cases written to date
  - Observing Preparation Subsystem
    - OT.UC.SingleFieldSetup.html: Single Field, Single Line setup
    - OT.UC.MultiFieldMosaicSetup.html: Multi-Field Mosaic setup
    - OT.UC.SurveyFewRegionsSeveralObjects.html: set up to do a Survey of a Few Regions with Several Objects
    - OT.UC.SpectralSurvey.html: set up to do a Spectral Line Survey
  - Control Subsystem
    - Control.UC.automatic.html: Automatic Operations Use Case
  - Offline Subsystem
    - Offline.UC.SnglFldReduce.html: Reduce & Image Single Field Data
    - Offline.UC.MosaicReduce.html: Reduce & Image Multi-Field Mosaic
    - Offline.UC.TotPower.html: Reduce & Image Auto-Correlation Data
  - Pipeline Subsystem
    - Pipeline.UC.ProcSciData.html: Process Science Data
    - Pipeline.UC.SnglFld.html: Science Pipeline Process Single Field Data
    - Pipeline.UC.Mosaic.html: Science Pipeline Process Mosaic, no short spacings
    - Pipeline.QLDataProc.html: Quick-Look Pipeline Data Processing
    - Pipeline.QLCalMon.html: Quick-Look Pipeline Monitor Calibration Data
    - Pipeline.QLArrayMon.html: Quick-Look Pipeline Monitor Array Data
29 AIPS++ Evaluation
- AIPS++ (along with the ESO Next Generation Archive System) is a major package used by ALMA
  - Both to ensure at least one complete data reduction package is available to users, and in implementing ALMA systems (notably the Pipeline)
- AIPS++ is a very controversial package (long development period, has not received wide acceptance)
- ALMA Computing has arranged several evaluations
  - Audit of capabilities based on documentation
  - AIPS++/IRAM test of suitability for millimeter data
  - Benchmarking tests
  - Technical review of AIPS++, March 5-7, 2003
    - Sound technical base, management changes needed
30 AIPS++ Audit
- Explanatory, not results!
- Work to be done by ALMA
- These should be 0 (in 2007)
- These should be <10% of the total
32 AIPS++ Audit Results - Summary
- All: 58% (Acceptable) / 16% (Inadequate) / 16% (Unavailable) / 10% (TBD)
- Critical: 66% / 14% / 12% / 8%
- Important: 52% / 19% / 19% / 10%
- Desirable: 35% / 17% / 33% / 15%
- 14% of requirements have had differing grades assigned by auditors
33 AIPS++/IRAM Tests
- Phase 1: Can AIPS++ reduce real mm-wave data?
  - Yes, but the schedule was very extended
  - Partly underestimated effort, mostly priority setting
  - ALMA/NRAO and EVLA now directly manage AIPS++
  - And for the next 12 months ALMA has complete control of priorities
- Phase 2: Can new users process similar but new data?
  - Generally yes, but it is too hard
- Phase 3: Performance (described next)
34 AIPS++ Benchmark Status
- Four requirements related to AIPS++ performance
  - 2.1.1 R4: Performance of the Package shall be quantifiable and commensurate with data processing requirements of ALMA and scientific needs of users. Benchmarks shall be made for a fiducial set of reduction tasks on specified test data.
  - 2.2.2 R1.1: GUI window updates shall be < 0.1 s on the same host.
  - 2.3.2 R4: Package must be able to handle, efficiently & gracefully, datasets larger than the main memory of the host system.
  - 2.7.2 R3: Display plot update speed shall not be a bottleneck. Speed shall be benchmarked and should be commensurate with comparable plotting packages.
- ASAC: AIPS++ must be within a factor of 2 of comparable packages.
35 AIPS++ Benchmark Strategy
- Finish the AIPS++/IRAM Phase 3 (performance) test
- Set up automated, web-accessible performance regression tests of AIPS++ against AIPS, GILDAS, and MIRIAD (a sketch follows this list)
- Start simple, then extend to more complex data
- Systematically work through performance problems in importance order
- Resolution of some issues will require scientific input (e.g., when is an inexact polarization calculation OK?)
- Decide in Summer 2004 (CDR2) if AIPS++ performance issues have arisen from lack of attention or for fundamental technical reasons (fatal flaw)
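A sketch of what the automated, web-accessible regression run could look like; run_task() stands in for whatever driver actually invokes each package on the fiducial dataset, and everything here is an assumption about shape, not the real harness.

```python
# Sketch of an automated performance regression run of AIPS++
# against the comparison packages; run_task() is a placeholder
# for the per-package drivers.
import json
import time

PACKAGES = ["AIPS++", "AIPS", "GILDAS", "MIRIAD"]
TASKS = ["fill", "calibrate", "image"]    # start simple, extend later

def run_task(package: str, task: str, dataset: str) -> None:
    # Placeholder: a real driver would launch the package on the
    # dataset (e.g., via its scripting interface) and wait.
    pass

def benchmark(dataset: str) -> dict[str, float]:
    timings: dict[str, float] = {}
    for pkg in PACKAGES:
        for task in TASKS:
            t0 = time.time()
            run_task(pkg, task, dataset)
            timings[f"{pkg}/{task}"] = time.time() - t0
    return timings

def publish(timings: dict[str, float], path: str = "benchmark.json") -> None:
    # Post results to the web for each AIPS++ stable release, then
    # check ratios against the ASAC factor-of-2 criterion.
    with open(path, "w") as f:
        json.dump(timings, f, indent=2)
```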
36 Full AIPS++/AIPS/GILDAS/MIRIAD comparison not possible
- Different processing capabilities (polarization) and data formats
- Formats involved: Standard FITS, ALMA-TI FITS, AIPS format, PdBI format, MIRIAD format, VLA Export format; each of GILDAS, MIRIAD, AIPS, and AIPS++ reads only a subset
- Compare AIPS++ with GILDAS on one dataset in ALMA-TI FITS format
- Compare AIPS++ with MIRIAD & AIPS on another dataset in FITS format
37 AIPS++/IRAM Phase 3 (ALMA-sized data, single-field spectroscopic)

  Task (times in s)                  GILDAS/CLIC   AIPS++   A++/G   Comments
  Filler                             1873          10939    5.8
    Init (write header info)         -             385      n/a
    Fill model/corr data cols.       -             2140     n/a
  PhCor (check ph-corr data)         889           3484     3.9     (AIPS++ Glish)
  RF (bandpass cal)                  5572          2298     0.4
  Phase (phase cal)                  3164          1111     0.4
  Flux (absolute flux cal)           1900          2093     1.2     (AIPS++ Glish)
  Amp (amplitude cal)                2242          614      0.3
  Table (split out calib src data)   1200          5150     4.3
  Image                              332           750      2.3
  Total                              17600         28600    1.6

- Caveats: DRAFT results; a bug in AIPS++ bandpass calibration requires too much memory (1.7 GB AIPS++ vs. 1.0 GB GILDAS)
- GILDAS executables copied, not compiled, on the benchmark machine
- Several AIPS++ values may still be amenable to significant improvement
38 AIPS++ Benchmark Status
- SSR has identified 2 initial benchmark datasets
  - Pseudo GG Tau: PdBI data of 25 March. Original observation expanded to 64 antennas with the GILDAS simulator; source structure converted to a point source. 3 & 1 mm continuum & spectral line emission. Data in ALMA-TI FITS format (same data used during the AIPS++ re-use Phase III test).
    - Ensures continuous comparison in time with the AIPS++ Phase III re-use test
    - Compares core functions (fill, calibrate, image) on an ALMA-size dataset
    - Exercises mm-specific processing steps
  - Polarized continuum data: VLA polarized continuum emission in the gravitational lens 0957+561, 6 cm continuum, 1 spectral window. Snapshot observation extended in time with the AIPS++ simulator to increase run time. Data in standard FITS format.
    - Exercises full polarization calibration, self-calibration, and non-point-source imaging (polarization processing can only be compared with MIRIAD/AIPS).
- Results to be published on the web for each AIPS++ stable release.
39 Calibrater Performance Improvements vs. Time
40 Calibration Performance vs. AIPS
41 Calibrater: Still TODO
42 Imager Performance
- Imaging performance improved by a factor of 1.8 for 2048 pixels and by a factor of 4.4 for 4096 pixels.
- AIPS++/AIPS ratio now 1.6 for 2048 pixels, 1.8 for 4096 pixels.
- Now dominated by the more general polarization processing in AIPS++? This is I only; multi-polarization should be relatively faster in AIPS++, but this needs to be demonstrated.
[Chart: execution time (sec) vs. image size (NxN pixels)]
43 AIPS++ Benchmark Status
- Dataset expansion: SSR will identify datasets in the following areas
  - Spectral line, polarized emission; multi-config dataset if possible
  - Multi-field interferometric mosaic
  - Large, simulated dataset, including atmospheric opacity variations and phase noise
  - Single-dish & interferometer combination in the uv plane (no SD reduction now: MIRIAD/AIPS do not process SD data; GILDAS only processes IRAM-format SD data and cannot convert to FITS)
- NOTE: Glish-based GUIs will be replaced with Java GUIs once the ACS/CORBA framework conversion is complete, so benchmark comparisons affecting the GUI and plotting interface will be delayed until the Java GUIs are ready to test.
44 Pipeline
- Three current development tracks
  - Paperwork: assemble use cases, write test plans, develop heuristics & decision trees
  - Implement top-level interfaces required by the rest of the system (e.g., to start a stub pipeline when ordered to by the scheduling subsystem)
  - Technology development / prototype pipeline
    - VLA GRB observations
    - Bind AIPS++ engines to ALMA technology (Python, CORBA (i.e., ACS))
    - Gain experience for a possible package-independent execution framework (a sketch follows this list)
      - To be used by at least AIPS++
      - Allow pipeline computations to be performed by different packages
      - Possible relevance for VO