Title: The ALMA Software Architecture
1. The ALMA Software Architecture
- Joseph Schwarz (ESO), Allen Farris (NRAO), Heiko Sommer (ESO)
2. ALMA is…
- An array of 64 antennas, each 12 meters in diameter, which will work as an aperture synthesis telescope to make detailed images of astronomical objects.
- The antennas can be positioned as needed, with baselines from 0.5 to 14 kilometers, giving the array a zoom-lens capability with angular resolution reaching 10 milliarcseconds.
- A leap of over two orders of magnitude in both spatial resolution and sensitivity.
- ALMA's great sensitivity and resolution make it ideal for medium-scale deep investigations of the structure of the submillimeter sky.
- A joint project of the North American and European astronomical communities; Japan is likely to join the project in 2004.
- Location: the Llano de Chajnantor, Chile, at an altitude of about 5000 m.

Courtesy of Al Wootten, ALMA/US Project Scientist
3. Software Scope
- From the cradle…
  - Proposal Preparation
  - Proposal Review
  - Program Preparation
  - Dynamic Scheduling of Programs
  - Observation
  - Calibration & Imaging
  - Data Delivery & Archiving
- …to the afterlife
  - Archival Research & VO Compliance
4. The numbers
- Baseline correlator h/w produces 1 Gbyte/s
- Must reduce to average/peak data rates of 6/60 Mbyte/s (baseline)
  - Raw (uv) data 95%, image data 5% of the total
  - Implies 180 Tbyte/y to archive
  - Archive access rates could be 5× higher (cf. HST)
  - Proposed 25/95 Mbyte/s to support correlator enhancements
- Feedback from calibration to operations
  - 0.5 s from observation to result (pointing, focus, phase noise)
- Science data processing must keep pace (on average) with data acquisition
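The 180 Tbyte/y archive figure follows directly from the average post-reduction rate; a quick sanity check in Python, using only the numbers quoted on the slide:

```python
# Back-of-envelope check of the archive volume quoted above.
SECONDS_PER_YEAR = 365.25 * 24 * 3600        # ~3.16e7 s

avg_rate_mb_s = 6                            # average post-reduction rate (Mbyte/s)
tb_per_year = avg_rate_mb_s * SECONDS_PER_YEAR / 1e6   # Mbyte -> Tbyte

raw_fraction, image_fraction = 0.95, 0.05    # uv data vs. image data split

print(round(tb_per_year))                    # ~189, i.e. the ~180 Tbyte/y quoted
```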
5. ALMA is distributed (1)

6. ALMA is distributed (2)
Antenna Operations Site (AOS, 5000 m)
40 MB/s (peak) / 6 MB/s (average)
Operations Support Facility (OSF)
Santiago Central Office (SCO)
7. ALMA is distributed (3)
[Map of participating institutes: MPI Bonn, ATC Edinburgh, Jodrell Bank, Univ. Calgary, DRAO Penticton, ESO, ALMA ATF, DAMIR/IEM Madrid, NAOJ, NRAO, ALMA Santiago, Arcetri Observatory, Brera Observatory, IRAM Grenoble, Obs. de Paris]
9. Run-time Challenges & Responses
Challenges:
- Changing observing conditions
- High data rates
- Diverse user community (novice to expert)
- Distributed hardware & personnel
  - AOS: antennas scattered 0.5-14 km from correlator
  - AOS-OSF: operators are 50 km from array
  - OSF-SOC-RSCs: PIs and staff separated from OSF by 1000s of km, often by many hours in time zone
Responses:
- Dynamic Scheduler
- Integrated, scalable Archive
- Flexible observing tool, GUIs
- High-speed networks
- Distributed architecture
  - CORBA & CORBA services
  - Container/Component model
  - XML serialization
10. Development-time Challenges & Responses
Challenges:
- Evolving requirements
  - Changing data rates
  - New observing modes
  - New hardware (ACA)
- IT advances
- Distributed development
- Different s/w cultures
Responses:
- Iterative development
- Modular, flexible design
- Unified architecture (HLA)
  - Functional subdivisions aligned to existing project organization
  - Implemented via ACS
- Don't do it twice
  - If you must do the same thing, do it the same way everywhere
- E-Collaboration tools
11. Why dynamic scheduling?
12. Scheduling Block
- Indivisible unit of observing activity
  - Can be aborted, but not restarted in the middle
  - Independently calibratable
- Can be queried
  - What h/w do you need?
  - What conditions do you need?
- Nominal execution time: 30 minutes
- Scheduler repeats selection process
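The query-and-select loop above can be sketched minimally in Python; the field names (`min_antennas`, `max_phase_noise`) are invented for illustration and are not the real ALMA scheduling interfaces:

```python
from dataclasses import dataclass

@dataclass
class SchedulingBlock:
    """Indivisible unit of observing (fields are illustrative, not the ALMA API)."""
    project: str
    min_antennas: int          # answers "what h/w do you need?"
    max_phase_noise: float     # answers "what conditions do you need?"
    nominal_minutes: int = 30

def pick_next(queue, antennas_available, phase_noise):
    """One pass of the selection process: return the first block whose
    hardware and weather requirements are currently satisfied."""
    for sb in queue:
        if antennas_available >= sb.min_antennas and phase_noise <= sb.max_phase_noise:
            return sb
    return None   # nothing schedulable now; repeat when conditions change

queue = [SchedulingBlock("deep-survey", min_antennas=60, max_phase_noise=0.1),
         SchedulingBlock("calibration", min_antennas=10, max_phase_noise=0.5)]
print(pick_next(queue, antennas_available=40, phase_noise=0.3).project)  # calibration
```

The scheduler re-runs this selection after each ~30-minute block completes, which is what lets it track changing observing conditions.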
15. System data flow
0.5 s feedback time
4-40 Mbyte/s
16. The Archive at the Core
- Just about everything goes here
  - Much more than what we usually think of as a science archive
  - High streaming in/out rates, lower random-access rates
- Three types of data
  - Bulk data: high volume, moderate # of records
    - Stored as binary attachments to VOTable headers
  - Monitor (engineering) data: moderate volume, large # of records
  - Value objects: low volume, complex searchable structures
    - Observing Projects & Scheduling Blocks
    - Configuration information
    - Meta-data providing link to bulk data (e.g., via VOTables)
- Underlying DB technology hidden from subsystems
  - Can be replaced when necessary/convenient/desired
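The "DB technology hidden from subsystems" point amounts to a facade: subsystems call store/retrieve, and the backend can be swapped. A toy sketch, with class and method names invented here (not the actual ALMA archive API):

```python
class Archive:
    """Toy facade: subsystems see only store/retrieve, never the DB technology."""

    def __init__(self, backend=None):
        # The backend is pluggable; a dict stands in for a real database,
        # which could be replaced without touching subsystem code.
        self._backend = backend if backend is not None else {}

    def store(self, entity_id, xml_document):
        """Store a value object (serialized as XML) under its entity ID."""
        self._backend[entity_id] = xml_document

    def retrieve(self, entity_id):
        """Fetch a previously stored document by entity ID."""
        return self._backend[entity_id]

archive = Archive()
archive.store("uid://X1/SB42", "<SchedulingBlock>...</SchedulingBlock>")
```

Because callers never touch `_backend` directly, replacing the underlying database is an archive-internal change, which is exactly the replaceability claim on the slide.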
18. Separation of concerns
- Functional: physics, algorithms, hardware
  - PIs can concentrate on their research specialties
  - Software encapsulates aperture-synthesis expertise
- Technical: communications, databases, etc.
  - Subsystem teams should concentrate on function
  - Technical architecture should provide simple and standard ways to
    - Access remote resources
    - Store and retrieve data
    - Manage security needs
    - Communicate asynchronously between subsystems, components
19. ALMA Common Software (ACS)
- Main vehicle for handling technical concerns
- Framework for distributed object architecture
  - Used all the way down to the device level in the control system
- Built on CORBA, but hides its complexity
  - Wraps selected CORBA services
  - Multi-language support: Java, C++, Python
  - Vendor-independent
  - High-quality open-source ORBs available (e.g., TAO)
- System evolving to meet developers' needs
  - Initial developer resistance
  - Developers now asking for more
- Dedicated team of systems-oriented developers
20. Components & Containers
- Component
  - Deployable unit of ALMA software
  - Arbitrary # of components per subsystem
  - Functional interface defined in CORBA IDL
  - Well-defined lifecycle
  - Focus on functionality, with little overhead for remote communication & deployment
  - Similar ideas in EJB, .NET, CCM
- Container
  - Centrally handles technical concerns and hides them from application developers
  - Run-time deployment, start-up
  - Selected CORBA/ACS services (error, logging, configuration, …)
  - Convenient access to other components and resources
  - New functionality can be integrated in the future, w/o modifying application software
21Container/Component Interfaces
My container starts and stops me and offers its
services, some of which I dont notice
functional interface observe()
container service interface
Comp
lifecycle interface init() run() restart()
I only care about the Lifecycle IF of my
components
other ACS services
Manager deployment configurations
CORBA ORBs Services
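The lifecycle contract in the diagram can be sketched in Python. The real ACS interfaces are defined in CORBA IDL with Java/C++/Python implementations; only the method names here follow the diagram, everything else is illustrative:

```python
class Component:
    """Lifecycle interface -- the only part of a component the container cares about."""

    def init(self, container_services):
        # The container hands in its services; the component need not know
        # where they come from.
        self.services = container_services

    def run(self): ...
    def restart(self): ...

class Telescope(Component):
    """A concrete component: the functional interface is what clients call."""

    def observe(self):
        return "observing"

class Container:
    """Starts and stops components; clients never see these lifecycle calls."""

    def __init__(self, services):
        self.services = services
        self.components = []

    def activate(self, comp):
        comp.init(self.services)   # lifecycle call, hidden from clients
        self.components.append(comp)
        comp.run()
        return comp

container = Container(services={"logging": print})
tel = container.activate(Telescope())
```

The separation mirrors the two speech bubbles: the container touches only `init`/`run`/`restart`, while clients of `Telescope` touch only `observe()`.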
22. Data reduction pipelines
- Baseline: AIPS++ as data reduction engine
  - Audited for compliance w/ ALMA reqts
  - Suitability for mm data verified w/ PdB data
  - Systematic benchmarking has led to major performance improvements
- Re-architecting of the AIPS++ framework as ACS components, with Python replacing glish as scripting language
  - Phase A proof of concept completed
  - Python container implementation provided by ACS team
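The scripting role that Python takes over from glish can be illustrated with a generic stage-chaining sketch; the stage names and dataset layout below are invented for illustration, not actual AIPS++ tasks:

```python
def calibrate(dataset):
    """Toy calibration stage (stands in for a real reduction task)."""
    dataset["calibrated"] = True
    return dataset

def image(dataset):
    """Toy imaging stage."""
    dataset["image"] = "dirty-map"
    return dataset

def run_pipeline(dataset, stages):
    """Apply reduction stages in order -- the kind of glue role a
    scripting language plays on top of the reduction engine."""
    for stage in stages:
        dataset = stage(dataset)
    return dataset

result = run_pipeline({"uv": "raw-visibilities"}, [calibrate, image])
```

Because the stages are ordinary Python callables, the same scripts can drive reduction-engine tasks exposed as ACS components.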
23. Role(s) of XML
- Define data structure and content through XML schemas
- Binding classes from schemas
  - Type-safe native-language access to data
  - Automatic validation possible
- Exchange of value objects between subsystems
  - In a language-independent way
- Direct input to archive
- Encourages subsystem-specific data modelling
24. Binding XML Schemas
Castor is an open-source framework for binding XML schemas to Java classes.
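Castor generates such binding classes automatically from a schema. A hand-written Python stand-in conveys the same idea (typed accessors plus marshal/unmarshal instead of raw DOM access); the `SchedulingBlock` element layout here is invented for illustration:

```python
import xml.etree.ElementTree as ET

class SchedulingBlockBinding:
    """Stand-in for what a binding framework generates from a schema:
    typed fields plus (un)marshalling, instead of hand-parsed XML."""

    def __init__(self, project="", nominal_minutes=30):
        self.project = project                     # typed, type-safe access
        self.nominal_minutes = int(nominal_minutes)

    def marshal(self):
        """Serialize the value object to XML text."""
        root = ET.Element("SchedulingBlock")
        ET.SubElement(root, "project").text = self.project
        ET.SubElement(root, "nominalMinutes").text = str(self.nominal_minutes)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def unmarshal(cls, xml_text):
        """Rebuild the typed object from XML text."""
        root = ET.fromstring(xml_text)
        return cls(root.findtext("project"),
                   int(root.findtext("nominalMinutes")))

sb = SchedulingBlockBinding("deep-survey", 45)
roundtrip = SchedulingBlockBinding.unmarshal(sb.marshal())
```

The XML text is what subsystems exchange and what goes into the archive; the typed class is what application code works with.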
25. Goals
- A data model that can be understood by everyone on the project
- A data model that can be modified without resulting in chaos for the project
- Software that
  - Encapsulates access to this data
    - Creation (factories), get/set, serialization, archive store/retrieve, …
  - Maintains relationships & restrictions built into the model
  - Does the same thing the same way everywhere
26. Data Modeling Strategy
- Build & maintain the model in UML
  - Expressive, flexible and widely used
  - Supported by numerous s/w tools (e.g., Rose)
- Use a framework to generate code & docs
  - Java, C++, Python, XML schemas, HTML
  - Model compliance → generated artifacts consistent w/ each other
- Propagate model changes quickly
- We, not the tool, determine what kind of code is generated
27. Why UML instead of XML?
- The ALMA Data Model is very complex
  - Easier to manage in UML
- UML is more expressive
  - Models relations among classes
  - Models subsystem create & use dependencies
- Better visual capability
  - Partial views of one model possible
28. How we do it
- Build model in UML
- Define our own (ALMA) meta-model element categories
- Divide our model elements into these categories
  - Classes → Entities, dependent classes, …
  - Attributes → Primitive types, value types, …
29. Build the model in UML

30. Define ALMA Metamodel

31. Divide elements into metamodel categories
32. How we do it (2)
- Decide how each meta-model element should be mapped into code and/or docs
  - e.g., Entity → include entity ID, generate factory class, archive store/retrieve for those subsystems that use it
- Implement decisions in framework templates
- Run code generator & verify results
- Package as libraries, jarfiles & distribute
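The template step can be sketched generically in Python: each metamodel category maps to a code template, expanded once per model element. The categories match the slide; the template text itself is invented:

```python
# Map each metamodel category to a code template (templates invented here;
# the real framework emits Java/C++/Python/XML schemas from the UML model).
TEMPLATES = {
    "Entity": ("class {name}:\n"
               "    '''Entity: carries an ID; gets a factory and archive I/O.'''\n"
               "    def __init__(self, entity_id):\n"
               "        self.entity_id = entity_id\n"),
    "ValueType": ("class {name}:\n"
                  "    '''Value type: plain data, no archive identity.'''\n"),
}

def generate(model):
    """model: list of (element_name, category) pairs taken from the UML model.
    Expands the category's template for each element, in order."""
    return "\n".join(TEMPLATES[category].format(name=name)
                     for name, category in model)

code = generate([("ObsProject", "Entity"), ("Angle", "ValueType")])
```

Because the mapping lives in the templates, changing how a category is rendered (the "we, not the tool" point) means editing one template and regenerating, not touching every class by hand.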