Title: Building Cyberinfrastructure for Distributed Terascale Computing
1Building Cyberinfrastructure for Distributed
Terascale Computing
Directions and Challenges in Emerging Technologies
CAS2001 Tuesday, October 30, 2001
- Don Middleton
- Section Head, NCAR/SCD Visualization and Enabling
Technologies
2Cyberinfrastructure?
supercomputers, visualization, human-computer
interactions, high performance communication
networks, the Grid, federated data repositories,
collaboration tools, on-line instruments and
fabrication facilities, digital libraries,
collaboratories, knowledge networks, knowledge
ecologies, and research universities of the
future.
3Emerging Trends
- Big networks with a limited set of applications
that can harness available bandwidth
- Coping with distributed, massive data
- Large projects with geographically dispersed
collaborators and resources
- New computing languages and frameworks
- A dearth of analysis and visualization technology
relative to the large problems
- Web-based services, science and compute portals
- Knowledge
4A New Visualization Lab
Rooms are becoming systems...
Rick Stevens, Argonne National Labs
5Visual Computing Lab
- A sophisticated, intentionally designed
workspace for groups
- Wide-screen VR/3D Display
- Visual supercomputing
- Collaboration capabilities
- Controlled electronically (fly-by-wire)
- Ultimately support for remote visualization
6Previsualization
7Reality
11http://www-fp.mcs.anl.gov/fl/accessgrid/
12Enabling group-to-group interaction in persistent
electronic spaces
And ultimately collaborative visual analysis
13Access Grids are everywhere
- About 50-60 in the U.S.
- Mostly universities
- DOE, NSF, DOD
- Boeing Phantom Works, Ford, Motorola, Microsoft
- Sites rapidly coming online internationally
- U.K.: Manchester, Glasgow
- Germany: Bremen, Stuttgart
- Italy, Korea, Japan (2), Australia, Brazil
- Coming soon:
- Puerto Rico, Switzerland, Netherlands, Cardiff
(U.K.), Heidelberg (Germany), and another in
Stuttgart
14Why Build One?
16Building an Access Grid
[Diagram labels: Internet Multicast, Other AG
Sites, Audio, Display, Control, Video Ingest,
Echo Processing, Cameras, House Audio]
About 50K in equipment
17Visual Supercomputing
Collaborative Terascale Data Visualization over
the Internet
18A Crossroads in Visualization
- Visual supercomputers have stagnated and are
generally not keeping pace with the PC product
cycle
- Chromium/WireGL Project: a framework for
constructing Beowulf-style clusters of
inexpensive PCs for visualization
- Informal partnership with Stanford University and
LLNL; NCAR reviewing, testing, evaluating
- Status: building local cluster for evaluation
19WireGL
20NCAR/SCD Visualization Lab
VR Wall Access Grid Displays
Local Displays
Lightwave Switch
Remote Displays
Web Grid
GigE
3TB SAN
Origin2000 8xCPU
Onyx2 2 x IR2
Access Grid
Origin200
WireGL Cluster
Fiber Channel Switch
21Terascale Analysis and Visualization Complex
Vislab
MSS Proxy
dataproc
Fiber Channel Matrix
Storage Area Network (7TB)
22Realizing the Promise of High-Bandwidth Networks
Net100
- Web100: http://www.web100.org
- Develop software tools to realize 100Mb/s over
high-performance networks
- Partners: NCAR, PSC, NCSA
- Net100: http://www.net100.org
- Building network-aware operating systems
- Eliminate the Wizard Gap, anticipate a Net1000
- Partners: NCAR, PSC, LBNL, ORNL
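
One way to see why tuning tools of this kind matter is the bandwidth-delay product: a TCP sender can keep at most one window of data in flight per round trip, so sustaining 100Mb/s needs a window of roughly bandwidth times round-trip time. The sketch below is illustrative only. The 70 ms round-trip time and the 64 KiB window are assumed values, not figures from the Web100 or Net100 projects, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_s):
    """Bytes that must be in flight to keep a path of this bandwidth full."""
    return bandwidth_bits_per_s * rtt_s / 8


def window_limited_rate_bits_per_s(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_s


TARGET_RATE = 100e6         # the 100Mb/s goal named on the slide
ASSUMED_RTT = 0.070         # assumption: 70 ms round trip
ASSUMED_WINDOW = 64 * 1024  # assumption: an untuned 64 KiB window

needed = bdp_bytes(TARGET_RATE, ASSUMED_RTT)
ceiling = window_limited_rate_bits_per_s(ASSUMED_WINDOW, ASSUMED_RTT)

print(f"window needed for 100Mb/s: {needed / 1024:.0f} KiB")
print(f"rate with a 64 KiB window: {ceiling / 1e6:.1f} Mb/s")
```

Under these assumptions the path needs a window of roughly 850 KiB, while a 64 KiB window caps throughput near 7.5 Mb/s regardless of link capacity.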
23Dealing with Data
- Web-based data management and access, and
addressing distributed terascale data
24The Community Data Portal (CDP)
An environment and suite of services aimed at
enabling both the consumers and the suppliers
of important community datasets.
- Integrated hardware/software environments that
facilitate easy provision of data and broad
access
- Browser and distributed client access
- A flexible multi-technology, multi-service
approach
Driven by science programs
28Browser and Application Access
Portal Layer (e.g. LAS)
Servers and Services
Ferret
GrADS
Ferret
CDAT
DODS
ARCAS
Vemap
TimeGCM
Reanalysis
COLA
NOMADS
Storage
29Future Earth System Modeling
30The Earth System Grid II
- Enable management and distributed
access/processing/analysis of terascale climate
data
- Build upon ESG-I, Globus Toolkit, DataGrid
technologies, CDP, and deploy
- Potential broad application to other areas
- http://www.earthsystemgrid.org
31The Earth System Grid II
- Funding: DOE SciDAC Collaboratory Pilot
- Partners: NCAR, USC Information Sciences
Institute, LBNL, PCMDI, ORNL, ANL
- Community Relationships:
- Community Data Portal (CDP)
- Other DOE SciDAC Projects
- Distributed Oceanographic Data System (DODS)
- THREDDS
- U.K. eScience Proposal for Oceans
- Developing Net100/Web100 relationship
32Current and Emerging Work in Analysis and
Visualization
33NCAR Version of Vis5D
- Provisions for large data
- Capabilities for driving 3D displays
- Drivers for VRML and other renderers
Original Vis5D has evolved into Open Source
Vis5D project
NCAR Version being integrated into Vis5D at
SourceForge repository
35MM5 Simulation of Diana
- 1060 E-W x 1000 N-S domain
- Resolution: 1.2km x 1.2km x 37 levels
- Timestep: 1 hour, approx. 2GB
- Duration: 2 days (34 wall-clock hrs on 552
processors)
- Size: 100GB
- Temporal sampling was inadequate; need 5 min ->
1.2TB result
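
The sizes quoted above follow from simple arithmetic on the per-timestep size and the sampling interval. A minimal sketch, assuming decimal units (1TB = 1000GB) and treating the approximate 2GB per saved timestep as exact; the function name is illustrative:

```python
def output_volume_gb(duration_hours, sample_interval_min, gb_per_sample):
    """Total output size for a run saved at a fixed sampling interval."""
    samples = (duration_hours * 60) // sample_interval_min
    return samples * gb_per_sample


DURATION_HOURS = 48  # 2 simulated days
GB_PER_SAMPLE = 2    # approx. 2GB per saved timestep (from the slide)

hourly = output_volume_gb(DURATION_HOURS, 60, GB_PER_SAMPLE)
five_minute = output_volume_gb(DURATION_HOURS, 5, GB_PER_SAMPLE)

print(f"hourly sampling:   {hourly} GB")
print(f"5-minute sampling: {five_minute / 1000:.2f} TB")
```

Hourly sampling gives 96GB, in line with the 100GB quoted, and 5-minute sampling gives about 1.15TB, in line with the 1.2TB estimate.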
37Research Mesoscale Models 2004?
- 2000km x 2000km domain
- Resolution: 1km x 1km x 50 levels
- Timestep: approx. 12GB
- Duration: 5-7 days, sample @ 5 minutes
- Size: 24TB
- Doesn't address ocean wave models
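
To put the projected volume in perspective, the sketch below reproduces the size estimate and then asks how long such a dataset would take to move at a sustained 100Mb/s, the rate targeted by Web100 elsewhere in this talk. It assumes decimal units, treats 12GB per sample as exact, and ignores protocol overhead; the names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def run_size_gb(days, sample_interval_min, gb_per_sample):
    """Output size of a run saved every sample_interval_min minutes."""
    samples = days * 24 * 60 // sample_interval_min
    return samples * gb_per_sample


def transfer_days(size_gb, rate_bits_per_s):
    """Days needed to move size_gb at a sustained rate, with no overhead."""
    bits = size_gb * 1e9 * 8
    return bits / rate_bits_per_s / SECONDS_PER_DAY


for days in (5, 7):
    size = run_size_gb(days, 5, 12)
    print(f"{days}-day run: {size / 1000:.1f} TB, "
          f"{transfer_days(size, 100e6):.1f} days at 100Mb/s")
```

A 7-day run comes to about 24TB, matching the slide, and would need roughly three weeks to transfer at 100Mb/s under these assumptions.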
38Remote Visualization
Scientific Desktops
Visual Supercomputers
Image Streams
Massive Data Simulation Retrospective
Visual Supercomputer
39VGEE: Visual Geophysical Exploration Environment
- Building visualization tools and environments for
inquiry-based education, Java vis/data frameworks
(e.g. VisAD), advanced probes, collaboration
- NSF-funded effort with the Digital Library for
Earth System Education (DLESE), UIUC, Univ. of
Georgia, Westchester College (Pa.), THREDDS
- Status: evaluating in classroom early 2002
- http://www.dlese.org/vgee
40VGEE: Visual Geophysical Exploration Environment
41Java for HPC?
- Challenges
- Numerical efficiency
- Primitive types
- Memory management
- Opportunities
- Threading
- Distributed capabilities
- A growing body of class libraries
- A growing pool of practitioners
42Assertions
- Our ability to simulate and observe is outpacing
our ability to analyze, visualize, understand,
and communicate
- Our own existing tools and most other available
tools are inadequate (scalability, performance,
features) relative to the larger problems
- While there are numerous one-offs, demos, and
sub-critical efforts, it is not evident that
viable efforts exist which will address our needs
in the coming years (or even today)
43Qualities of Future Analysis Environments
- Effective coupling of 2D and 3D capabilities
- Portability from desktops to large parallel
systems
- Efficient terascale data-handling and better data
models
- Multi-resolution capabilities
- Collaboration capabilities
- Open Source, need frameworks for community
contributions and integration of promising tech
- Feature detection and tracking?
44Next-generation Visualization and Analysis
There are lots of good efforts, but will there
be any broadly useful tools resulting?
?
PACI efforts
DOE ASCI Scidac
University Vis Research
COTS
NCL
Vis5D/VisAD
Vtk, HDF5
45Integrative Directions
- Build and extend best-of-breed emerging
technologies
- Infrastructure for integrating the most promising
developments from computer science research
- Couple analysis and visualization tool
development with model and other emerging
frameworks
46Science Portals and Collaborative Environments
- Delivering services and new capabilities using
advanced information technologies
- Access Grid for training, seminars, outreach,
collaborative R&D projects, enhanced interaction
with universities, ultimately collaborative
data exploration
- Community Data Portal and ESG
- SCD Computing Portal
47Looking Towards the Future: A Bigger Picture
- A Knowledge Environment
- For the Geosciences (KEG)
48Futures - KEG: A Knowledge Environment for the
Geosciences
- A Knowledge-Enabled
- Collaborative
- Problem-solving environment
- For the Geosciences
Data access and mining, PSEs, collaboration,
analysis and visualization, multi-scale Earth
System modeling, advanced software architectures
49Futures - KEG: A Knowledge Environment for the
Geosciences
- An NSF Information Technology Research Proposal
- Partners: NCAR divisions, Univ. of Wisconsin
(visualization), Purdue (PSEs), ANL
(collaboration and Grid), Stanford (knowledge),
Univ. of Illinois (data), Univ. of Mich.
(collaboratory frameworks), Univ. of Alabama
(data mining and metadata), UCLA (earth system),
Howard Univ. (education)
50Putting It All Together
Towards a Knowledge Environment Access Grid, Web
Services, Data Portals, Computing Portals
Environments
Future Visualization Analysis Frameworks
Applications
Apps
The Earth System Grid
Data
Net100, Web100, Chromium/WireGL
Protocols
Fabric
Advanced Computational Facilities Networks
51End
52CAS2001 Theater
53DWD TriVis
55Visualizing Climate Models
59Hurricane Diana (MM5)
61MOZART
63Clear Air Turbulence
64DC-8
66Simulating Wildfires
68End of CAS2001 Theater