1. Climate Research at the National Energy Research Scientific Computing Center (NERSC)
Bill Kramer, Deputy Director and Head of High Performance Computing
CAS 2001, October 30, 2001
2. NERSC Vision
NERSC strives to be a world leader in
accelerating scientific discovery through
computation. Our vision is to provide
high-performance computing tools and expertise to
tackle science's biggest and most challenging
problems, and to play a major role in advancing
large-scale computational science and computer
science.
3. Outline
- NERSC-3: Successfully fielding the world's most powerful unclassified computing resource
- The NERSC Strategic Proposal: An aggressive vision for the future of the flagship computing facility of the Office of Science
- Scientific Discovery through Advanced Computing (SciDAC) at NERSC
- Support for Climate Computing at NERSC: Ensuring success for the national program
4. FY00 MPP Users/Usage by Scientific Discipline
[Charts: NERSC FY00 MPP users by discipline; NERSC FY00 MPP usage by discipline]
5. NERSC FY00 Usage by Site
[Charts: MPP usage; PVP usage]
6. FY00 Users/Usage by Institution Type
7. NERSC Computing Highlights for FY01
- NERSC-3 is in full and final production, exceeding its original capability by more than 30% and with much larger memory.
- Increased total FY02 allocations of computer time by 450% over FY01.
- Activated the new Oakland Scientific Facility.
- Upgraded the NERSC network connection to 655 Mbit/s (OC-12), 4 times the previous bandwidth.
- Increased archive storage capacity with 33% more tape slots and double the number of tape drives.
- PDSF, T3E, SV1s, and other systems all continue operating very well.
8. Oakland Scientific Facility
- 20,000 sf computer room; 7,000 sf office space
- 16,000 sf of computer space built out
- NERSC occupying 12,000 sf
- Ten-year lease with 3 five-year options
- $10.5M computer room construction costs
- Option for an additional 20,000 sf computer room
9. HPSS Archive Storage
- 190 terabytes of data in the storage systems
- 9 million files in the storage systems
- Average 600-800 GB of data transferred per day (peak 1.5 TB)
- Average 18,000 files transferred per day (peak 60,000)
- 500-600 tape mounts per day (peak 2,000; 12/system)
10. NERSC-3 Vital Statistics
- 5 Tflop/s peak performance; 3.05 Tflop/s with Linpack
- 208 nodes, 16 CPUs per node at 1.5 Gflop/s per CPU
- Worst case: Sustained System Performance measure of 0.358 Tflop/s (7.2% of peak)
- Best case: Gordon Bell submission of 2.46 Tflop/s on 134 nodes (77% of peak)
- 4.5 TB of main memory: 140 nodes with 16 GB each, 64 nodes with 32 GB, and 4 nodes with 64 GB
- 40 TB total disk space: 20 TB formatted shared, global, parallel file space; 15 TB local disk for system usage
- Unique 512-way double/single switch configuration
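The percentages follow directly from the node counts above; a quick cross-check (a minimal sketch, using only the figures from the bullets on this slide):

```c
#include <stdio.h>

int main(void) {
    /* Per-CPU peak and system size, from the bullets above */
    double gflops_per_cpu = 1.5;
    int cpus_per_node = 16;
    int total_nodes = 208;

    double peak_tflops = gflops_per_cpu * cpus_per_node * total_nodes / 1000.0;
    printf("system peak: %.2f Tflop/s\n", peak_tflops);            /* ~5.0 */

    /* Worst-case SSP of 0.358 Tflop/s against full-system peak */
    printf("SSP fraction: %.1f%%\n", 100.0 * 0.358 / peak_tflops); /* ~7.2 */

    /* Gordon Bell run: 2.46 Tflop/s on 134 nodes */
    double peak_134 = gflops_per_cpu * cpus_per_node * 134 / 1000.0;
    printf("GB fraction: %.1f%%\n", 100.0 * 2.46 / peak_134);      /* ~77 */
    return 0;
}
```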
11. Two Gordon Bell Prize Finalists Are Using NERSC-3
- Climate Modeling: a shallow water climate model sustained 361 Gflop/s (12% of peak). S. Thomas et al., NCAR.
- Materials Science: 2016-atom supercell models for spin dynamics simulations of the magnetic structure of an iron-manganese/cobalt interface. Using 2176 processors of NERSC-3 showed a sustained 2.46 Tflop/s. M. Stocks and team at ORNL and U. Pittsburgh, with A. Canning at NERSC.
[Figure: a section of an FeMn/Co interface shows a new magnetic structure that is different from the magnetic structure of pure FeMn.]
12. NERSC System Architecture
[Diagram: FDDI/Ethernet 10/100/Gigabit network linking a remote visualization server, a symbolic manipulation server, Max Strat storage, IBM and STK tape robots, DPSS, PDSF, a research cluster, the Cray T3E-900 (644/256), Cray SV1s, Millennium, the IBM SP NERSC-3 Phase 2 (2,532 processors, 1,824 GB of memory, 32 TB of disk), the LBNL cluster, and the visualization lab.]
13. NERSC Strategic Proposal
- An Aggressive Vision for the Future of the
Flagship Computing Facility of the Office of
Science
14. The NERSC Strategic Proposal
- Requested in February 2001 by the Office of Science as a proposal for the next five years of the NERSC Center and Program
- Proposal and Implementation Plan delivered to OASCR at the end of May 2001
- The proposal plays to NERSC's strengths, but anticipates rapid and broad changes in scientific computing
- Results of the DOE review expected in late November-December 2001
15. (No transcript)
16. High-End Systems: A Carefully Researched Plan for Growth
- A three-year procurement cycle for leading-edge computing platforms
- Balanced systems, with appropriate data storage and networking
17. (No transcript)
18. NERSC Support for DOE Scientific Discovery through Advanced Computing (SciDAC)
19. Scientific Discovery Through Advanced Computing
DOE science programs need dramatic advances in simulation capabilities to meet their mission goals, in areas including combustion, materials, global systems, health effects and bioremediation, subsurface transport, and fusion energy.
20. LBNL/NERSC SciDAC Portfolio: Project Leadership
21. Applied Partial Differential Equations ISIC
Developing a new algorithmic and software framework for solving partial differential equations in core mission areas.
- New algorithmic capabilities with high-performance implementations on high-end computers
- Adaptive mesh refinement
- Cartesian grid embedded boundary methods for complex geometries
- Fast adaptive particle methods
- Close collaboration with applications scientists
- Common mathematical and software framework for multiple applications
Participants: LBNL (J. Bell, P. Colella), LLNL, Courant Institute, Univ. of Washington, Univ. of North Carolina, Univ. of California, Davis, and Univ. of Wisconsin.
22. Scientific Data Management ISIC
- Goals: optimize and simplify
  - Access to very large data sets
  - Access to distributed data
  - Access to heterogeneous data
  - Data mining of very large data sets
[Diagram: scientific simulations and experiments produce petabytes on tape and terabytes on disk; SDM-ISIC technology sits between that data and scientific analysis and discovery.]
- SDM-ISIC technology:
  - Optimizing shared access from mass storage systems
  - Metadata and knowledge-based federations
  - API for Grid I/O
  - High-dimensional cluster analysis
  - High-dimensional indexing
  - Adaptive file caching
  - Agents
- Today, data manipulation (getting files from tape archives, extracting subsets of data from files, reformatting data, getting data from heterogeneous, distributed systems, and moving data over the network) takes 80% of a scientist's time, leaving 20% for scientific analysis and discovery; SDM-ISIC technology aims to reverse those proportions.
Participants: ANL, LBNL, LLNL, ORNL, GTech, NCSU, NWU, SDSC
23. SciDAC Portfolio: NERSC as a Collaborator
24. Strategic Project Support
- Specialized Consulting Support
  - Project facilitator assigned
  - Help defining project requirements
  - Help with getting resources
  - Code tuning and optimization
  - Special service coordination: queues, throughput, increased limits, etc.
- Specialized Algorithmic Support
  - Project facilitator assigned
  - Develop and improve algorithms
  - Performance enhancement
  - Coordination with ISICs to represent work and activities
25. Strategic Project Support
- Special Software Support
  - Projects can request support for packages and software that are special to their work and not as applicable to the general community
- Visualization Support
  - Apply NERSC visualization software to projects
  - Develop and improve methods specific to the projects
  - Support any project visitors who use the local LBNL visualization lab
- SciDAC Conference and Workshop Support
  - NERSC staff will provide content and presentations at project events
  - Provide custom training at project events
  - NERSC staff attend and participate at project events
26. Strategic Project Support
- Web services for interested projects
  - Provide areas on NERSC web servers for interested projects
  - Password-protected areas as well
  - Safe sandbox area for dynamic script development
  - Provide web infrastructure: templates, structure, tools, forms, dynamic data scripts (cgi-bin)
  - Archive for mailing lists
  - Provide consulting support to help projects organize and manage web content
- CVS support
  - Provide a server area for interested projects
  - Backup, administration, access control
  - Provide access to code repositories
  - Help projects set up and manage code repositories
27. Strategic Project Area Facilitators
28. NERSC Support for Climate Research: Ensuring Success for the National Program
29. Climate Projects at NERSC
- 20 projects from the base MPP allocations, with about 6% of the entire base resource
- Two strategic climate projects:
  - High Resolution Global Coupled Ocean/Sea Ice Modeling, Matt Maltrud at LANL
    - 5% of total SP hours (920,000 wall clock hours)
    - Couples a high-resolution ocean general circulation model with a high-resolution dynamic/thermodynamic sea ice model in a global context
    - 1/10th degree resolution (3 to 5 km in polar regions)
  - Warren Washington, Tom Bettge, Tony Craig, et al.
    - PCM coupler
30. Early Scientific Results Using NERSC-3
- Climate Modeling: 50 km resolution for global climate simulation, run in a 3-year test. Proved that the model is robust to a large increase in spatial resolution. Highest spatial resolution ever used: 32 times more grid cells than 300 km grids, taking 200 times as long. P. Duffy, LLNL.
Reaching Regional Climate Resolution
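The quoted cost ratio is roughly what resolution scaling predicts. A back-of-the-envelope sketch (the cubic-scaling assumption here, two horizontal dimensions plus a CFL-limited time step, is ours, not the slide's):

```c
#include <stdio.h>

int main(void) {
    /* Refinement factor going from 300 km to 50 km grid spacing */
    double refine = 300.0 / 50.0;   /* 6x in each horizontal direction */

    /* Grid cells grow with the two horizontal dimensions */
    printf("cell ratio: %.0fx\n", refine * refine);          /* 36x vs. ~32x quoted */

    /* A CFL-limited time step shrinks with spacing, adding another factor */
    printf("cost ratio: %.0fx\n", refine * refine * refine); /* 216x vs. ~200x quoted */
    return 0;
}
```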
31. Some Other Climate Projects NERSC Staff Have Helped With
- Richard Loft, Stephen Thomas, and John Dennis, NCAR: using 2,048 processors on NERSC-3, demonstrated that the dynamical core of an atmospheric general circulation model (GCM) can be integrated at a rate of 130 years per day
- Inez Fung (UCB): CSM, to build a carbon climate simulation package using the SV1
- Mike Wehner: CCM, to do large-scale ensemble simulations on the T3E
- Doug Rotman: atmospheric chemistry/aerosol simulations
- Tim Barnett and Detlef Stammer: PCM runs on the T3E and SP
32. ACPI/Avant Garde/SciDAC
Work done by Chris Ding and team:
- Comprehensive performance analysis of GPFS on the IBM SP (supported by Avant Garde)
- I/O performance analysis; see http://www.nersc.gov/research/SCG/acpi/IO/
- Numerical reproducibility and stability
- MPH, a library for a distributed multi-component environment
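MPH's role is to let separately developed model components discover each other and obtain their own communicators. A minimal sketch of that idea in plain MPI (using MPI_Comm_split directly; the component layout and names are illustrative, not MPH's actual API):

```c
#include <mpi.h>
#include <stdio.h>

/* Each rank declares which model component it belongs to; splitting the
 * world communicator gives every component a private communicator, which
 * is what a coupling library such as MPH arranges for the user. */
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Illustrative layout: first half of the ranks run the atmosphere
     * component, the rest run the ocean component. */
    int component = (world_rank < world_size / 2) ? 0 : 1;

    MPI_Comm comp_comm;
    MPI_Comm_split(MPI_COMM_WORLD, component, world_rank, &comp_comm);

    int comp_rank;
    MPI_Comm_rank(comp_comm, &comp_rank);
    printf("world rank %d -> %s rank %d\n", world_rank,
           component == 0 ? "atmosphere" : "ocean", comp_rank);

    MPI_Comm_free(&comp_comm);
    MPI_Finalize();
    return 0;
}
```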
33. Special Support for Climate Computing
- NCAR CSM version 1.2
  - NERSC was the first site to port NCAR CSM to a non-NCAR Cray PVP machine
  - Main users: Inez Fung (UCB) and Mike Wehner (LLNL)
- NCAR CCM3.6.6
  - Independent of CSM, NERSC ported NCAR CCM3.6.6 to the NERSC Cray PVP cluster
  - See http://hpcf.nersc.gov/software/apps/climate/ccm3/
34. Special Support for Climate Computing (cont.)
- T3E netCDF parallelization
  - NERSC solicited user input to define parallel I/O requirements for the MOM3, LAN, and CAMILLE climate models (Ron Pacanowski, Venkatramani Balaji, Michael Wehner, Doug Rotman, and John Tannahill)
  - Development of netCDF parallelization on the T3E was done by Dr. R.K. Owen at NERSC/USG based on the modelers' requirements:
    - Better I/O performance
    - Master/slave read/write capability (sketched below)
    - Support for the variable unlimited dimension
    - Allow a subset of PEs to open/close a netCDF dataset
    - User-friendly API
    - etc.
  - Demonstrated netCDF parallel I/O usage by building model-specific I/O test cases (MOM3, CAMILLE)
  - The official netCDF 3.5 UNIDATA release includes the support provided by NERSC for multiprocessing on the Cray T3E: http://www.unidata.ucar.edu/packages/netcdf/release-notes-3.5.0.html
  - Parallel netCDF for the IBM SP is under development by Dr. Majdi Baddourah of NERSC/USG
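A minimal sketch of the master/slave write pattern named above, written against the standard netCDF-3 C API plus MPI rather than the T3E-specific extensions; the file name, variable name, and slab size are illustrative:

```c
#include <mpi.h>
#include <netcdf.h>
#include <stdio.h>

#define NX 8  /* illustrative per-PE slab size */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, npes;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &npes);

    /* Every PE computes its own slab of data... */
    double slab[NX];
    for (int i = 0; i < NX; i++)
        slab[i] = rank * NX + i;

    /* ...but writes are funneled through the master (PE 0), which is the
     * portable pattern the T3E extensions streamlined. */
    if (rank == 0) {
        int ncid, dimid, varid;
        nc_create("field.nc", NC_CLOBBER, &ncid);
        nc_def_dim(ncid, "x", (size_t)(npes * NX), &dimid);
        nc_def_var(ncid, "field", NC_DOUBLE, 1, &dimid, &varid);
        nc_enddef(ncid);

        double recv[NX];
        size_t start[1], count[1] = { NX };
        /* Write the master's own slab, then each slave's slab in turn. */
        for (int src = 0; src < npes; src++) {
            if (src == 0) {
                for (int i = 0; i < NX; i++) recv[i] = slab[i];
            } else {
                MPI_Recv(recv, NX, MPI_DOUBLE, src, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            }
            start[0] = (size_t)src * NX;
            nc_put_vara_double(ncid, varid, start, count, recv);
        }
        nc_close(ncid);
    } else {
        MPI_Send(slab, NX, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```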
35. Additional Support for Climate
- The Scientific Computing and User Services Groups have staff with a special climate focus
- Received funding for a new climate support person at NERSC, who will provide software, consulting, and documentation support for climate researchers at NERSC:
  - Port the second generation of NCAR's Community Climate System Model (CCSM-2) to NERSC's IBM SP
  - Put the modified source code under CVS control so that individual investigators at NERSC can access the NERSC version, and modify and manipulate their own source without affecting others
  - Provide necessary support and consultation on operational issues
  - Develop enhancements to NetCDF on NERSC machines that benefit NERSC's climate researchers
  - Respond in a timely, complete, and courteous manner to NERSC user clients, and provide an interface between NERSC users and staff
36. NERSC Systems Utilization
- IBM SP: 80-85% gross utilization
- T3E: 95% gross utilization
37. NERSC Systems Run Large Jobs
[Chart: job size distribution on the T3E]
38. Balancing Utilization and Turnaround
- NERSC consistently delivers high utilization on MPP systems while running large applications.
- We are now working with our users to establish methods that provide improved services:
  - Guaranteed throughput for at least a selected group of projects
  - More interactive and debugging resources for parallel applications
  - Longer application runs
  - More options in resource requests
- Because of the special turnaround requirements of the large climate users, NERSC:
  - Established a queue working group (T. Bettge, Vince Wayland at NCAR)
  - Set up special queue scheduling procedures that provide an agreed-upon amount of turnaround per day if there is work in the queue (Sept. 01)
  - Will present a plan on job scheduling at the NERSC User Group Meeting, November 12, 2001, in Denver
39. Wait Times in the Regular Queue
[Chart: wait times for climate jobs vs. all other jobs]
40. NERSC Is Delivering on Its Commitment to Make the Entire DOE Scientific Computing Enterprise Successful
- NERSC sets the standard for effective supercomputing resources
- NERSC is a major player in SciDAC and will coordinate its projects and collaborations
- NERSC is providing targeted support to SciDAC projects
- NERSC continues to provide targeted support for the climate community and is acting on the input and needs of the climate community