Title: The LEAD Effort at Unidata
1. The LEAD Effort at Unidata
- The Unidata Seminar will start at 1:30 PM MST
2. The LEAD Effort at Unidata
- Tom Baltzer, Brian Kelly, Doug Lindholm, Anne Wilson
- December 14, 2005
3. LEAD is funded by the National Science Foundation under the following Cooperative Agreements:
- ATM-0331594
- ATM-0331591
- ATM-0331574
- ATM-0331480
- ATM-0331579
- ATM-0331586
- ATM-0331587
- ATM-0331578
4. Outline
- Setting the Stage: Introduction to LEAD and Unidata's LEAD Efforts (Anne)
- Application of current technology on the LEAD testbeds (Tom)
- The LEAD Hardware at Unidata (Brian)
- The THREDDS Data Repository (Doug)
5. Setting the Stage: Introduction to LEAD and Unidata's LEAD Efforts
Anne Wilson
6. Current IT Barriers to Mesoscale Weather Research and Education
- Data and tools useable mainly by experts
- Researchers and educators constrained by hardware limitations
- Rigid, brittle technology can't accommodate mesoscale weather research requirements
  - Real-time, on-demand, dynamic data processing and sensor steering
7. A Solution: Linked Environments for Atmospheric Discovery (LEAD)
- Funded by an NSF Large Information Technology Research (ITR) award
- Produce a web-service-based, scalable framework for handling meteorological data and model output
  - Identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, visualizing
  - Independent of data format and physical location
- Dynamically adaptive workflows and steering of sensors
8. The LEAD Vision
- Data access via querying and browsing
- Analysis and forecast tools that can be composed into workflows
- Workflows and sensors that respond to the weather
- Support users ranging from grade 6 to experienced researchers
9. LEAD Objectives
- Lower the barrier for entry and increase the sophistication of problems that can be addressed by complex end-to-end weather analysis and forecasting/simulation tools
- Improve our understanding of and ability to detect, analyze and predict mesoscale atmospheric phenomena by interacting with weather in a dynamically adaptive manner
- Result: Paradigm change in how experiments are conceived and performed
10. LEAD Challenges
11. Multidisciplinary Effort
- Meteorology
- Computer Science and Information Technology
- Education and Outreach
12. LEAD Institutions
- More than 100 scientists, students and technical staff
13. LEAD Thrust Groups
- Data
- Orchestration
- Portal
- Meteorology
- Grid and Web Services Test Bed
- Education and Outreach Test Bed
- Major Unidata areas
14. LEAD Data Subsystem
[Diagram labels: Public Data (e.g. IDD data); LEAD Data Repository (LDR)]
15. Unidata Technology Used in LEAD
- LDM/IDD Data Delivery: near-real-time data delivery
- THREDDS: catalogs of data and their associated metadata
- Common Data Model (CDM): single interface to multiple data formats
- THREDDS Data Server (TDS): integrated OPeNDAP and HTTP data access
- Integrated Data Viewer (IDV): visualization
- THREDDS Data Repository (TDR): data storage framework
- Decoders
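To make the catalog and OPeNDAP bullets concrete, here is a minimal sketch of how a client might read a THREDDS catalog and build OPeNDAP URLs from it. The sample catalog, host name, paths and dataset names are invented for illustration and do not come from the LEAD testbeds. Only the InvCatalog 1.0 namespace and the `service`/`dataset` element layout are standard.

```python
import xml.etree.ElementTree as ET

# Namespace used by THREDDS InvCatalog 1.0 documents.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Made-up sample catalog; paths and dataset names are illustrative only.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example testbed catalog">
  <service name="dap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Regional forecasts">
    <dataset name="WRF run 00Z" urlPath="example/wrf/run_00Z.nc"/>
    <dataset name="WRF run 12Z" urlPath="example/wrf/run_12Z.nc"/>
  </dataset>
</catalog>
"""


def opendap_urls(catalog_xml, host):
    """Return {dataset name: OPeNDAP URL} for every dataset that has a urlPath."""
    root = ET.fromstring(catalog_xml.strip())
    service = root.find("t:service[@serviceType='OPENDAP']", NS)
    if service is None:
        return {}
    base = service.get("base")
    urls = {}
    # Walk all nested dataset elements; container datasets have no urlPath.
    for ds in root.iter("{%s}dataset" % NS["t"]):
        path = ds.get("urlPath")
        if path:
            urls[ds.get("name")] = host + base + path
    return urls


# Hypothetical host, used only to show the URL shape.
for name, url in opendap_urls(SAMPLE_CATALOG, "http://example.invalid").items():
    print(name, "->", url)
```

A real catalog can declare several services or compound services, which this sketch does not handle.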
16. Unidata and LEAD
- Unidata also brings
  - Experience with atmospheric data
  - Community of users
  - Robust, fielded software
17. Recent LEAD-Related Efforts
- 2. Application of current technology on our LEAD testbed (Tom)
- 3. Structure of the LEAD testbed (Brian)
- 4. THREDDS Data Repository (Doug)
18. Application of Current Technologies on the LEAD Testbed Systems
19. Acronyms for LEAD Tools
- ADAS: ARPS Data Assimilation System (Center for Analysis and Prediction of Storms at OU)
- ADaM: Algorithm Development and Mining (University of Alabama in Huntsville)
- IDV: Integrated Data Viewer (Unidata)
- LDM/IDD: Local Data Manager/Internet Data Distribution (Unidata)
- OPeNDAP: Open-source Project for a Network Data Access Protocol (OPeNDAP.org)
- THREDDS: Thematic Real-time Environmental Distributed Data Services (Unidata)
  - TDS: THREDDS Data Server
  - TDR: THREDDS Data Repository
20. LEAD Testbed Systems
- Testbed systems at several LEAD locations to provide
  - Data
    - Near-real-time data ingest, storage and access
    - LEAD Data Product storage and access
  - Data Processing
    - High Performance Computing
    - Grid and Web Services
- Allow each institution to develop methods by which their capabilities fit into the LEAD effort
- Single Web Portal system at Indiana Univ. to bring it all together and provide the User Interface
21. LEAD Grid
[Diagram of LEAD Grid sites: MU, CSU, HU, Unidata, UI, IU, UNC, OU, UAH. Legend: Core Academic Partner; Core Academic Partner and Education Test Bed; Core Academic Partner and Grid Test Bed; Core Academic Partner, Grid Test Bed and Education Test Bed]
22. Data Aspects of LEAD Testbeds
23. LEAD Testbed Systems
- UPC technologies being leveraged to facilitate LEAD needs
  - LDM/IDD
  - THREDDS
  - IDV
  - NetCDF Decoders
  - OPeNDAP (Unidata supported)
24. Typical LEAD Testbed (Current Source Data Configuration)
[Diagram labels: LEAD Grid System; Testbed System; Weather station observations; Aircraft data; Radar data; IDD; Decoders; THREDDS Catalog; OPeNDAP; GridFTP]
25. Typical LEAD Data Testbed (Future Source Data Configuration)
[Diagram labels: LEAD Grid System; Testbed System; Weather station observations; Aircraft data; Radar data; IDD; Decoders; THREDDS Catalog; TDS; TDR; OPeNDAP; GridFTP]
Note: UPC plans a 6-month store
26. LEAD Processing on the Unidata Testbed System
27. UPC Processing Testbed (Current Configuration)
- WRF being steered by Chiz's GEMPAK precipitation locator
[Diagram labels: Unidata LEAD Test Bed; NCEP NAM (Eta) Forecast; Initial and Boundary Conditions; Precipitation Locator; Center Lat/Lon; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
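The steering idea on this slide (a precipitation locator hands a center lat/lon to WRF) can be sketched roughly as follows. This is not the GEMPAK-based locator itself, whose algorithm the slides do not describe. It is a stand-in that picks the grid cell with the most forecast precipitation. The function names, the square-domain layout and the toy grid values are all assumptions made for illustration.

```python
def locate_precipitation(lats, lons, precip):
    """Return (lat, lon, value) of the grid cell with the most precipitation.

    lats: latitudes of the grid rows; lons: longitudes of the grid columns;
    precip: list of rows, one value per (row, column) cell.
    """
    best = None
    for i, row in enumerate(precip):
        for j, value in enumerate(row):
            if best is None or value > best[2]:
                best = (lats[i], lons[j], value)
    return best


def regional_domain(center_lat, center_lon, half_width_deg=5.0):
    """Describe a square regional domain centred on a point (hypothetical layout)."""
    return {
        "center_lat": center_lat,
        "center_lon": center_lon,
        "south": center_lat - half_width_deg,
        "north": center_lat + half_width_deg,
        "west": center_lon - half_width_deg,
        "east": center_lon + half_width_deg,
    }


# Toy 3 x 4 forecast grid with made-up precipitation amounts.
lats = [35.0, 40.0, 45.0]
lons = [-110.0, -105.0, -100.0, -95.0]
precip = [
    [0.0, 1.2, 0.4, 0.0],
    [0.3, 2.5, 7.8, 1.1],
    [0.0, 0.9, 3.0, 0.2],
]

lat, lon, amount = locate_precipitation(lats, lons, precip)
domain = regional_domain(lat, lon)
print("center:", lat, lon, "domain:", domain)
```

In the configuration on the slide, the center point would then be used to set up the regional WRF run, with initial and boundary conditions taken from the NAM (Eta) forecast.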
28. Next Steps
[Diagram labels: Unidata LEAD Test Bed; NCEP NAM (Eta) Forecast; Initial Conditions; Boundary Conditions; Precipitation Locator; Center Lat/Lon; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
29. Longer Term
[Diagram labels: Unidata LEAD Test Bed; NCEP NAM (Eta) Forecast; Boundary Conditions; ADaM; ADAS; Precipitation Locator; Center Lat/Lon; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
30. Ultimately
[Diagram labels: LEAD Grid System; Unidata LEAD Test Bed; NCEP NAM (Eta) Forecast; Boundary Conditions; Web Service ADaM; Web Service ADAS; Web Service WRF; Precipitation Locator; Center Lat/Lon; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
31. Objectives for UPC Testbed
- Testing ground for integrating new UPC and LEAD technologies
- Determining ways to bring LEAD technologies to the Unidata community
- Operational environment for LEAD
  - Processing cluster
  - Data storage
    - 6 months of IDD data
    - LEAD product data
32. The LEAD Hardware at Unidata
33. Existing LEAD Infrastructure
- Lead1: Grid Server, Development Tools, NFS Server, Cluster Node
- Lead2: Grid Server, NFS Server, Cluster Node, Cluster Monitoring
- Lead3: HTTP Server, THREDDS Server, OPeNDAP Server, LDM Node, NFS Server, Cluster Node
- Lead4: TDS, LDM Node, NFS Server, Cluster Node
- LeadStor: 8 TB of Disk, NFS Server
34. UCAR/Unidata LEAD Infrastructure
- Portal Servers for Web, TDS, Grid and LDM Services
- 30 GFLOP Processing Cluster
- 40 TB Storage Cluster
35. LEAD Portal Systems
- HTTP, TDS and Grid Server
- LDM Server
- Test Server
- Processing Cluster Head Node
- Storage Cluster Gateway
- Gigabit Network for NFS Storage Access
36. LEAD Processing Cluster
- Beowulf Cluster Connected by a Gigabit Fibre Network
- Each Node Contains Two Athlon 2400 CPUs
- Cluster Uses OSCAR with the MPICH MPD
- Eight Nodes Provide 30 GFLOPs
37. LEAD Storage Cluster
[Diagram labels: LEAD Storage Head Node; LEAD Storage Gigabit Network; LEAD Storage Nodes]
38. LEAD Storage Node
- One (1) Guanghsing GHI-583 5U Case
  - 24 hot-swappable SATA trays
  - 1000W 22 power supply
- One (1) Tyan Thunder K8SD Pro Motherboard (Dual Opteron CPUs)
  - Four 64-bit 133/100 MHz PCI-X Slots
  - Two Gigabit Ethernet ports
- One (1) AMD Opteron 242 Processor
  - 1.6 GHz CPU
- Three (3) Broadcom RAIDCore BC4853
  - Eight SATA ports
  - Controller spanning
  - Advanced RAID
- Twenty-Four (24) Seagate Barracuda ST3400832AS
  - 7200 RPM 400 GB SATA Drives
39. LEAD Storage Node
- Twenty-Four (24) 400 GB Drives
- Divided into Two (2) Eleven-Column RAID 5 Arrays and Two Hot Spares
- Form Two (2) 4 TB LUNs Using bcraid
- Each Node Publishes the Two LUNs over iSCSI
40. LEAD Storage Gateway
- Mounts Each Node's Two (2) 4 TB LUNs Published via iSCSI
- Builds Two (2) 20 TB 6-Column RAID 5 Meta-devices using mdadm
- Divides Each Meta-device into Volumes using LVM
- Each Volume is Formatted with an XFS Filesystem
- Each Filesystem is Published with NFS
Result: 40 TB of mid-performance, double-redundant storage
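The capacity figures on the storage slides can be checked with a little arithmetic. The sketch below assumes six storage nodes, which is inferred from the 6-column RAID 5 meta-devices rather than stated on the slides. It also assumes each meta-device takes one LUN from every node, and uses decimal units (1 TB = 1000 GB).

```python
def raid5_usable(columns, member_size):
    """RAID 5 spends one member's worth of space on parity."""
    return (columns - 1) * member_size


DRIVE_GB = 400             # Seagate 400 GB drives
DRIVES_PER_NODE = 24
HOT_SPARES_PER_NODE = 2
ARRAYS_PER_NODE = 2        # two RAID 5 arrays, hence two LUNs, per node
NODE_RAID_COLUMNS = 11     # drives per node-level RAID 5 array
GATEWAY_RAID_COLUMNS = 6   # LUNs per gateway meta-device (assumed: one per node)
STORAGE_NODES = 6          # assumption inferred from the 6-column meta-devices

# Every drive in a node is either an array member or a hot spare.
assert ARRAYS_PER_NODE * NODE_RAID_COLUMNS + HOT_SPARES_PER_NODE == DRIVES_PER_NODE

lun_gb = raid5_usable(NODE_RAID_COLUMNS, DRIVE_GB)            # per-node LUN
meta_device_gb = raid5_usable(GATEWAY_RAID_COLUMNS, lun_gb)   # per gateway meta-device
total_gb = ARRAYS_PER_NODE * meta_device_gb                   # two meta-devices
raw_gb = STORAGE_NODES * DRIVES_PER_NODE * DRIVE_GB           # all drives, spares included

print("LUN size:        ", lun_gb // 1000, "TB")
print("Meta-device size:", meta_device_gb // 1000, "TB")
print("Usable total:    ", total_gb // 1000, "TB")
print("Raw disk:        ", raw_gb / 1000, "TB")
```

Under these assumptions the numbers match the slides: 4 TB per LUN, 20 TB per meta-device and 40 TB in total. The "double-redundant" label fits the two layers of RAID 5: one inside each node and one across nodes at the gateway.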
41. THREDDS Data Repository (TDR)
42. LEAD Architecture: Data Storage Perspective
[Diagram labels: LEAD Data Grid]
43. LEAD Architecture: Data Storage Perspective
[Diagram labels: LEAD Data Grid; Atomic Capabilities; Cataloger (myLEAD)]
44. LEAD Architecture: Data Storage Perspective
[Diagram labels: LEAD Data Grid; Atomic Capabilities; Application Services; Forecast Model (WRF); Data Assimilation (ADAS); Data Mining (ADaM); Cataloger (myLEAD); Visualization (IDV)]
45. LEAD Architecture: Data Storage Perspective
[Diagram labels: LEAD Data Grid; Atomic Capabilities; Application Services; Forecast Model (WRF); Data Assimilation (ADAS); Data Mining (ADaM); Cataloger (myLEAD); Visualization (IDV); Portal; User]
49. LEAD Architecture: Data Storage Perspective
[Diagram labels: LEAD Data Grid; Atomic Capabilities; Application Services; Forecast Model (WRF); Data Assimilation (ADAS); Data Mining (ADaM); Visualization (IDV); Portal; User; Data Repository; THREDDS Data Repository]
50. THREDDS Data Repository: Component Architecture
THREDDS Data Repository operations: putData(), getData(), discoverData()
Components and the calls they serve:
- Storage Locator: locateStorage()
- Data Mover: moveData()
- Unique ID Generator: generateUniqueID()
- Name Resolver: mapIDToURL()
- Metadata Generator: generateMetadata()
- Metadata Crosswalk: translateMetadata()
- Cataloger: catalogMetadata()
- Data Storage
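One way to read this component diagram is as a facade: putData(), getData() and discoverData() delegate to the component calls. The sketch below is a toy, in-memory illustration of that reading and not the TDR implementation. The class name, the order of the calls inside put_data, the ID format and the upper-case "crosswalk" are all assumptions. Method names are the slide's names rewritten in Python style.

```python
import itertools


class InMemoryRepository:
    """Toy stand-in for the repository facade; every component is in-memory."""

    def __init__(self, base_url):
        self._base_url = base_url
        self._counter = itertools.count(1)
        self._storage = {}   # URL -> bytes        (Data Storage)
        self._names = {}     # unique ID -> URL    (Name Resolver)
        self._catalog = {}   # unique ID -> dict   (Cataloger)

    # Component calls, named after the slide.
    def locate_storage(self, name):
        return f"{self._base_url}/{name}"

    def move_data(self, data, url):
        self._storage[url] = data

    def generate_unique_id(self):
        return f"tdr-{next(self._counter):06d}"

    def map_id_to_url(self, uid, url=None):
        if url is not None:
            self._names[uid] = url   # register a new mapping
        return self._names[uid]

    def generate_metadata(self, name, data):
        return {"name": name, "size_bytes": len(data)}

    def translate_metadata(self, metadata):
        # Crosswalk to a second, made-up schema: upper-case keys.
        return {key.upper(): value for key, value in metadata.items()}

    def catalog_metadata(self, uid, metadata):
        self._catalog[uid] = metadata

    # Public operations.
    def put_data(self, name, data):
        url = self.locate_storage(name)
        self.move_data(data, url)
        uid = self.generate_unique_id()
        self.map_id_to_url(uid, url)
        metadata = self.translate_metadata(self.generate_metadata(name, data))
        self.catalog_metadata(uid, metadata)
        return uid

    def get_data(self, uid):
        return self._storage[self.map_id_to_url(uid)]

    def discover_data(self, **criteria):
        return [uid for uid, md in self._catalog.items()
                if all(md.get(k) == v for k, v in criteria.items())]


# Hypothetical storage location and file name.
repo = InMemoryRepository("file:///tmp/tdr-demo")
uid = repo.put_data("wrf_run.nc", b"\x00" * 16)
print(uid, repo.discover_data(NAME="wrf_run.nc"), len(repo.get_data(uid)))
```

Keeping each component behind its own call is what would let a site swap in a different locator, mover or cataloger without touching the three public operations.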
52. THREDDS Data Repository: Component Architecture (LEAD Configuration)
THREDDS Data Repository operations: putData(), getData(), discoverData()
Component calls: locateStorage(), moveData(), generateUniqueID(), mapIDToURL(), generateMetadata(), translateMetadata(), catalogMetadata()
Components in this configuration:
- Data Storage
- RLS
- myLEAD
- Resource Broker
- Unique ID Generator
- THREDDS Metadata Generator
- trebuchet
- THREDDS to LEAD Crosswalk
53. THREDDS Data Repository: Component Architecture (Alternate Configuration)
THREDDS Data Repository operations: putData(), getData(), discoverData()
Component calls: locateStorage(), moveData(), generateUniqueID(), mapIDToURL(), generateMetadata(), translateMetadata(), catalogMetadata()
Components in this configuration:
- Data Storage
- Data Mover
- Storage Locator
- THREDDS Metadata Generator
- THREDDS Catalog
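The two configurations can be compared as role-to-component tables. The pairing of each LEAD component with a generic role (for example RLS as the Name Resolver, or trebuchet as the Data Mover) is an inference from the component names and is not stated on the slides. The same applies to treating THREDDS Catalog as the Cataloger in the alternate configuration.

```python
# Generic roles from the component architecture slide.
ROLES = [
    "Data Storage", "Name Resolver", "Metadata Crosswalk", "Metadata Generator",
    "Data Mover", "Storage Locator", "Unique ID Generator", "Cataloger",
]

# Role -> component filling it (pairings are assumptions, see the lead-in).
LEAD_CONFIG = {
    "Data Storage": "Data Storage",
    "Name Resolver": "RLS",
    "Metadata Crosswalk": "THREDDS to LEAD Crosswalk",
    "Metadata Generator": "THREDDS Metadata Generator",
    "Data Mover": "trebuchet",
    "Storage Locator": "Resource Broker",
    "Unique ID Generator": "Unique ID Generator",
    "Cataloger": "myLEAD",
}

ALTERNATE_CONFIG = {
    "Data Storage": "Data Storage",
    "Metadata Generator": "THREDDS Metadata Generator",
    "Data Mover": "Data Mover",
    "Storage Locator": "Storage Locator",
    "Cataloger": "THREDDS Catalog",
}


def unfilled_roles(config):
    """Roles that a configuration leaves without a component."""
    return [role for role in ROLES if role not in config]


def differences(first, second):
    """Roles whose component differs between two configurations."""
    return {role: (first.get(role), second.get(role))
            for role in ROLES if first.get(role) != second.get(role)}


print("Unfilled in alternate:", unfilled_roles(ALTERNATE_CONFIG))
for role, (lead, alt) in differences(LEAD_CONFIG, ALTERNATE_CONFIG).items():
    print(f"{role}: {lead} vs {alt}")
```

Read this way, the alternate configuration is the smaller deployment: it lists no name resolver, crosswalk or unique ID generator, and catalogs straight into a THREDDS catalog.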
54. Unidata Architecture
55. Unidata Architecture
[Diagram labels: access]
56. Unidata Architecture
[Diagram labels: access; discover]
58. Unidata Architecture
[Diagram labels: access; discover; THREDDS Data Server (TDS)]
59. Unidata Architecture
[Diagram labels: access; discover; store; THREDDS Data Server (TDS); THREDDS Data Repository (TDR)]
60. Unidata Architecture
[Diagram labels: access; discover; store; THREDDS Data Server (TDS); THREDDS Data Repository (TDR); Locally Generated Data]
61. Unidata Architecture
[Diagram labels: access; discover; store; notify; THREDDS Data Server (TDS); THREDDS Data Repository (TDR); Locally Generated Data; E-mail; Application (e.g. IDV); Service]
62. Questions?