Title: Component updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2
Component updates, SSS-OSCAR Releases, API Discussions, External Users, and SciDAC Phase 2
Al Geist, January 25-26, 2005, Washington DC
Scalable Systems Software
Participating organizations: ORNL, ANL, LBNL, PNNL, SNL, LANL, Ames, NCSA, PSC, SDSC, IBM, Cray, Intel, SGI
Problem
- Computer centers use an incompatible, ad hoc set of systems tools
- Present tools are not designed to scale to multi-Teraflop systems
Goals
- Collectively (with industry) define standard interfaces between systems components for interoperability
- Create scalable, standardized management tools for efficiently running our large computing centers
To learn more visit www.scidac.org/ScalableSystems
Participating Organizations
Coordinator: Al Geist
Participating organizations: ORNL, ANL, LBNL, PNNL, PSC, SNL, LANL, Ames, NCSA, IBM, SGI, Cray, Intel
Running the suite at ANL and Ames; running components at PNNL; Maui w/ SSS API (3000/mo); Moab (Amazon, Ford, TeraGrid, ...)
How do we position ourselves with respect to:
- the National Leadership-class Facility? NLCF is a partnership between ORNL (Cray), ANL (BG), and PNNL (cluster)
- NERSC and the NSF centers
Goals for This Meeting
- Updates on the Integrated Software Suite components
- Preparing for the next SSS-OSCAR software suite release (quarterly releases this year)
- Planning for SciDAC phase 2: whitepaper and meeting with the MICS director
- Results of scalability tests (Warehouse)
- Discussion of Less Restrictive Syntax
Scalable Systems Software Suite
Any updates to this diagram?
[Architecture diagram: components written in any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite through standard XML interfaces with common authentication and communication. Grid interfaces: Meta Scheduler, Meta Monitor, Meta Manager, Meta Services. Suite components: Accounting, Scheduler, System & Job Monitor, Node State Manager, Service Directory, Node Configuration & Build Manager, Event Manager, Allocation Management, Usage Reports, Process Manager, Job Queue Manager, Hardware Infrastructure Manager, Validation & Testing, Checkpoint/Restart. Packaged together as SSS-OSCAR.]
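The integration point in the diagram is the standard XML interface plus common authentication and communication (ssslib). A minimal sketch of that pattern follows, assuming nothing about the real ssslib wire protocol or the Service Directory's actual message format; the component name, port, and message shape here are invented for illustration only.

    # A minimal sketch, NOT the real ssslib protocol; names and ports are invented.
    import socket
    import xml.etree.ElementTree as ET

    def register_with_service_directory(host="localhost", port=5555):
        # Any language that can open a socket and emit XML can join the suite;
        # here a toy "usage-reports" component announces where it listens.
        reg = ET.Element("register")
        ET.SubElement(reg, "component", name="usage-reports",
                      host=socket.gethostname(), port="6000")
        with socket.create_connection((host, port)) as s:
            s.sendall(ET.tostring(reg))
            return s.recv(4096)  # directory's acknowledgement (format unknown here)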
Highlights of Last Meeting (Aug. 26-27 at ANL)
Details in the main project notebook.
- FastOS presentations - SNL, ORNL, ANL, LBL, and LANL discussed what they proposed and how SSS can help.
- SC04: suite release version 1.0. What do we show/demo at SC?
- API discussions - Less Restrictive Syntax introduced; SSSRMAP version 3 described.
- Discussion of accomplishments, the need to get our software on large clusters, and priorities for wrapping up the project.
Since Last Meeting
- Hackerfest meeting to prepare release 1.0
- October 6-8 at ORNL
- SC04 SSS posters and demos
- Ames, ANL, LBL, PNNL
- Telecons
- Every Tuesday - Resource Management group
- Every other Thursday - Process Management group
- New entries in Electronic Notebooks
- Five notebooks provide a dynamic SSS web site
- Over 360 pages of ideas, APIs, and meeting notes
Major Topics for This Meeting
- Latest news on the Software Suite components
- Preparing for the third SSS-OSCAR software suite release
- Planning for SciDAC phase 2 and the February 17 meeting with the MICS director
- whitepaper
- all-day meeting with 1 ½ hours for SSS
- Presentation of System and Job Monitor API
- Presentation of Less Restrictive Syntax
- Vote on Process Manager API
- Also discuss getting our components out on large clusters
- Are components robust enough to use at NLCF or NERSC?
- Ssslib version with RMAP
- Fred asks if we can incorporate NWPerf and OTP security
Agenda - January 25
8:00 Continental Breakfast
8:30 Al Geist - Project Status
9:00 Fred Johnson - MICS report, next steps in SciDAC
9:30 Scott Jackson - Resource Management components
10:30 Break
11:00 Will McClendon - Validation and Testing
12:00 Lunch (on own in hotel)
1:30 Paul Hargrove - Process Management and Monitoring
2:30 Narayan Desai - Node Build, Configure
3:30 Break
4:00 Craig Stefan - "Warehouse" system monitoring package
5:00 Rusty Lusk - Present Less Restrictive Syntax
5:30 Adjourn for dinner
Agenda - January 26
8:00 Continental Breakfast
8:30 Thomas Naughton - Preparing for the next SSS-OSCAR software release
9:00 Rusty - discussion and vote on Process Manager API
9:30 Incorporating NWPerf into the SSS suite
10:00 Group discussion of the whitepaper, closure of this project, and ideas for what to propose next
10:30 Break
11:00 Discussion - finalizing the whitepaper; meeting with Mike Strayer on February 17; SciDAC PI meeting in June (invited to give a poster); set next meeting date May 10-11 in San Francisco or ANL
12:00 Meeting ends
CS ISIC Presentations - February 17
Presentations will be made to the head of SciDAC by all four CS ISICs: a one-hour presentation followed by 30 minutes for discussion. Each will also prepare a whitepaper.
Outline
- What are the system software challenges for SciDAC?
- Goals of the project: standardized and flexible API; modular architecture (portable across HW); scalable reference implementation
- Highlights (include impact): XML-based API independent of language and protocol (ssslib); architecture that allows plug and play (SD, EM, meatball); suite releases (SSS-OSCAR, integrated components); production users (ANL, Ames, PNNL, NCSA?, others?); adoption of the API by existing products (Maui, Moab)
- Future CS ISIC ideas: National Leadership Computing Facility (Cray, IBM BG, SGI, clusters); FastOS (discussions from last meeting); Cray software roadmap (Al and Rusty attending from SSS)
(Section lengths: 1 page, 2 pages, 1 page)
SciDAC Science Teams
[Diagram: Capability platform - ultrascale hardware (Rainer, Blue Gene, Red Storm; HW teams). Computational science teams - research team, high-end science problem, tuned code. Software libs - SW teams. Together these lead to breakthrough science, with a unified computing environment providing a common look & feel across diverse HW.]
Meeting notes
Al Geist presents the project overview and goals for this meeting.
Fred: what's going on in MICS. The President's FY06 budget is out in a couple of weeks. $80B going to the war; budgets are going to be tight for the rest of the decade. SciDAC and the CS base program are not going to be impacted (they avoid the otherwise 10% cut across the board). Viz and data management taken over by Fred in the MICS office. Recruiting a second Math PM in the MICS office to help Gary. Ed Oliver is being replaced this summer. New Secretary of Energy. SciDAC PI meeting late June in San Francisco (URL); much broader emphasis on science. The Math ISICs held a get-together to discuss SciDAC in December and generated a common whitepaper for Strayer, who has a Math interest; for the CS ISICs he has a broader view: what is going on in the ISICs. Up to 4 people per ISIC will meet in a building 3 blocks from HQ. Issues: the first round of SciDAC got Math ISICs and CS ISICs; should SciDAC have more blended ISICs? We have 5 years of experience. Whitepaper: 2-5 pages, written to Michael, who is the advocate for SciDAC; vision, progress, success/impact, gaps/opportunities.
Meeting notes
- What is the vision - changes in SciDAC
- Petascale systems, blended ISICs, gaps that you (Fred) see
- Haven't sat down and asked what the scientific goals are yet
- SciDAC call in early October
- CS gaps: SDM - understanding gigantic data sets (viz and analysis)
- Should viz be a part of SciDAC? Common viz infrastructure needed?
- PERC - ongoing emphasis on understanding SciDAC app performance
- SSS - understand what happens with and without system software
- How to handle SGI (NUMA) if it matters
- First 5 years: a lot of infrastructure build-up; what is the next step?
- Getting vendor buy-in, plus a Linux cluster strategy
- CCA - transition of apps; what is the rate of adoption?
- SciDAC is a layered approach
- Math and CS working with apps teams
- ISICs: initial impact from the math side of the house; CS would take longer
- We have had time, so now how do we help?
- In the future we will have large heterogeneous systems across DOE
- Parallel file system research - do we need to add it to SciDAC?
Meeting notes
- The compute platform in SciDAC-1 was a single one: the NERSC IBM
- In SciDAC phase 2 this will be much more challenging and heterogeneous
- How much integration can SSS help with? Incl. the file system
- Portability and a common software base/environment on DOE systems
- To Strayer: a site wants to use PBS Pro - how to do this with SSS?
- Lines of code in SSS, RAS strategy, HPSS support, data migration
- Number of daemons (scalability) and any kernel mods?
- Security and NWPerf are two things to consider for SSS
Meeting notes
Scott Jackson - Resource Management components
Python-based SSSRMAP SDK is almost finished; need to integrate with the other SDK. Craig: initial efforts on SSSRMAP integration into ssslib. General evidence of adoption and value of the SSS components. Bamboo supports checkpoint/restart and PBS/LoadLeveler syntax. Gold: production use on multiple PNNL systems, incl. the 11.8 TF cluster; dozens of downloads; began discussions with DOD HPCMP. Maui: support for checkpoint, enhanced prioritization, throttling, and QoS; installed on 2,500 clusters, downloaded 100,000 times; running more supercomputers than any other scheduler in the world - 17 of the top 20 and 75 of the top 100 on the Top500 list.
Meeting notes
Scott Jackson - Resource Management components (cont.)
Full support for SSSRMAP v3. Gold: improvements over QBank, improved robustness; added support for the SQLite embedded database (as fast as Postgres); new reservation design. Maui has added buffer overflow prevention for security. Silver metascheduler (also called the Grid scheduler) uses the SSS job object and message communication protocols; available for release in 3 months; handles cross-site data staging, grid fairness, and multi-cluster job allocation. MCOM is a common library between Maui and Silver. Future work: portability testing - Linux (Red Hat fading out), Fedora, SuSE, AIX, Tru64, OS X; fault tolerance supporting 25% cluster loss; multisite authentication and authorization. Future work focused on Silver development.
Meeting notes
Will McClendon - Validation and Testing
Mostly doing bug fixing on the current release, v0.2.5. Test driver tool for testing software: ordered tests and API tests. APItest is now in SSS-OSCAR v1.0. Finishing up the User Guide, which gives an overview of APItest with screenshots. Future work: being able to diff files, a configuration file, and more SSS component tests. He gives a short demo.
Ron Oldfield - setting up SSS integrated test suites. Hired a contractor full time this month to do durability and performance tests. He presents a talk on the Lightweight File System project with Lee Ward and himself: risk mitigation for the Red Storm FS (Lustre), initial focus on a secure storage architecture (not a FS); nice work here. Describes this project: Lee Ward knows the answer but can't express it; Barney Maccabe knows the answer but it is wrong.
Meeting notes
Paul Hargrove - Process Management components
Checkpoint Manager (BLCR) status. Handles files if they are unmodified between checkpoint and restart, or only appended to in between, and pipes between processes. MPI is handled if using LAM/MPI 7.x over TCP and GM (including migratable tasks), OpenMPI (since it will inherit the LAM/MPI support), or ChaMPIon/Pro (Verari). Platforms: IA32 only for now (x86_64 in the future, but no plans for IA64); Linux stock 2.4.x, SuSE 7.2-9.0, RedHat 7.2-9, RHEL3/CentOS 3.1; a 2.6.x port is in progress (FC2, SuSE 9.2). Future work: cover process groups and sessions; handle directories and mutable files. SSS integration: the checkpoint manager works with Bamboo, Maui, and the MPD PM; upgrade the API to LRS. The process manager is being hardened and converted to LRS in preparation for BG/L (one full rack).
Meeting notes
Craig - Warehouse components
Starting to monitor does not wait for all connections to finish; the connection and monitoring thread pools are independent (see the sketch after these notes). Future: need to add a full reset. Any component can be restarted; it no longer depends on start order. Testing: ran on the Platinum cluster at NCSA on 120 nodes, the Infinite Itanium cluster (128 processors), and the T2 cluster (Dell Xeon, 500 nodes); the xtorc Warehouse test did autodiscovery of new clients. David Boxer, an RA, is doing the main programming on Warehouse; IBM hired him. Future work: API to the Node State Manager, intelligent error handling.
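A rough illustration of the design point above (this is not Warehouse's code; the names are invented): the connection pool and the monitoring loop run independently, so monitoring starts immediately and slow or dead nodes only delay their own connection attempts.

    # Sketch only: independent connection and monitoring thread pools.
    import concurrent.futures
    import queue
    import time

    connected = queue.Queue()   # connection workers hand off finished nodes here

    def connect(node):
        time.sleep(0.1)         # stand-in for a possibly slow connection setup
        connected.put(node)

    def monitor(cycles=5):
        known = []
        for _ in range(cycles):
            while not connected.empty():
                known.append(connected.get())   # pick up whatever is ready so far
            print("monitoring %d nodes" % len(known))
            time.sleep(0.5)

    conn_pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)
    mon_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)

    for n in range(120):
        conn_pool.submit(connect, "node%03d" % n)   # connections proceed in background
    mon_pool.submit(monitor).result()               # monitoring never waits for all of them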
Meeting notes
Narayan Desai - Build and configure components
Conversion to LRS is in progress; not trivial. BG/L arrived; it will run SSS software but there are some issues: a single process per node, no direct TCP support, an unusual RAS interface, allocation granularity must be 2^n, and the compute node OS is reloaded for each job (like the Scyld model). Chiba RM (scheduler, QM, allocation manager) used as-is. New implementation of the PM: system partitioning, BG/L kernel loading, PMI implementation. New configuration management components; different diagnostic model.
Craig Stefan - new Warehouse information storehouse: node info list and resource consumers. Goes over a couple of examples. New protocol design and sender: DONE. Protocol parser: NOT DONE.
Meeting notes
Rusty Lusk - Less Restrictive Syntax
Best of both worlds. See the notes from the last meeting on the discussion of the two families of syntax. A command language in XML that identifies a set of objects, specifies a function, and constructs a response. Desirable features: completeness, validation, readability, conciseness. Goes over examples of the differences between RS and LRS. Goes over the BNF (Backus-Naur form), then goes over several examples of LRS - much more readable. LRS needs a better name; suggestion by Paul to call it S5. (A purely illustrative sketch of an LRS-style command follows these notes.)
Thomas Naughton - SSS-OSCAR
V1.0 released Nov 04 with all SSS components represented. Preparing the v1.1 release for Feb 15, still OSCAR 3.0 based. Shift to OSCAR 4.1 in the v1.2 release in 2Q 2005. Future: extend the SSS component tests, improve documentation and ordering. Longer term: support more Linux flavors, make it an OSCAR package set. Release dates this year: v1.2 (Fedora Core 2) May 15, Aug 15, and v2.0 at SC05.
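The actual RS and LRS grammars are in the project notebooks; the sketch below is illustrative only, and the element and attribute names are invented rather than taken from the spec. It shows the general shape the notes describe: one XML command that identifies a set of objects, names the function to apply, and describes the response wanted, while remaining ordinary XML that can still be validated.

    # Illustrative only: these element and attribute names are NOT the real LRS spec.
    import xml.etree.ElementTree as ET

    lrs_command = """
    <get-node-info>
      <node name='n01*' state='up'/>                  <!-- which objects to match -->
      <response><node name='*' load='*'/></response>  <!-- fields wanted back -->
    </get-node-info>
    """

    # Because the command is well-formed XML, it can be checked by a parser or
    # schema, keeping the "validation" property listed in the notes above.
    ET.fromstring(lrs_command)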
Meeting notes
Rusty Lusk - Process Manager API and vote
Goes over the spec as written in the Process Manager notebook. Functionality: start/stop process groups, query the state of a job, deliver signals. Uses Less Restrictive Syntax for its five commands: CreateProcessGroup, GetProcessGroup, SignalProcessGroup, KillProcessGroup, WaitProcessGroup. Datatypes: ProcessGroup and ProcessGroupSpecification. Events: start of job and end of job. Goes through some examples with discussion. (A hedged sketch of two of these commands appears after these notes.) Vote to accept the Process Manager API: Yes 12, No 0, Abstain 0.
Al Geist - CS gaps for the whitepaper
Unified software environment across diverse systems, including the development (interactive) environment. Other gaps are I/O, fault tolerance, and security.
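The voted spec is the one written in the Process Manager notebook; the sketch below is only a hedged illustration, and the message shapes and field names are invented rather than the actual LRS encoding. It shows how a client might build two of the five commands named above.

    # Sketch only: field names are invented; the real messages follow the LRS
    # spec in the Process Manager notebook.
    import xml.etree.ElementTree as ET

    def create_process_group(executable, nodes, procs_per_node):
        # CreateProcessGroup: submit a ProcessGroupSpecification, get a ProcessGroup back.
        msg = ET.Element("create-process-group")
        ET.SubElement(msg, "process-group-specification",
                      executable=executable, nodes=nodes, procs=str(procs_per_node))
        return ET.tostring(msg)

    def wait_process_group(pgid):
        # WaitProcessGroup: wait for the group to finish; the Event Manager also
        # announces the corresponding end-of-job event.
        msg = ET.Element("wait-process-group")
        ET.SubElement(msg, "process-group", pgid=str(pgid))
        return ET.tostring(msg)

    # GetProcessGroup, SignalProcessGroup, and KillProcessGroup would be built the
    # same way and sent over ssslib's authenticated channel (not shown here).
    print(create_process_group("/bin/hostname", "n001,n002", 4))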