Title: Distributed Processing and Archival of AIRS Science Data on Linux Clusters
1. Distributed Processing and Archival of AIRS Science Data on Linux Clusters
- November 1, 2005
- NOAATECH 2006 Workshop Expo
- Office of Systems Development (OSD)
- Office of Research and Applications (ORA)
- National Environmental Satellite, Data, and Information Service (NESDIS)
- National Oceanic and Atmospheric Administration (NOAA)
- U.S. Department of Commerce
- Shahram Tehranian, Yongsheng Zhao, Viktor Zubko, Anand Swaroop
- (stehranian_at_ac-tech.com, yzhao_at_ac-tech.com, vzubko_at_ac-tech.com, aswaroop_at_ac-tech.com)
- AC Technologies, Inc.
- Advanced Computation and Information Services
- 4640 Forbes Blvd, Suite 320
- Lanham, MD 20706
2. Topics
- Objective
- Linux Cluster for Real-Time Systems
- High Performance Computing Storage System
- AIRS Source Code Modifications
- Data Management and Communication Software
- Benchmarking Results
- AIRS Retrieval Results
- Conclusions
3. Objective
- Provide an execution environment for processing Atmospheric Infrared Sounder (AIRS) data in a distributed environment such as Linux clusters.
- Provide high bandwidth and capacity for accessing data through a shared file storage system.
- Process AIRS data faster and more cheaply than is currently achieved on shared memory architecture machines (SGI Altix, SGI Origin).
- Optimize the AIRS science code for a distributed architecture.
4. Linux Cluster For Real-Time Systems
- Clusters have no single point of failure and are highly available for mission-critical applications.
- Customers can add functionality and capacity as needed, from dozens to thousands of nodes.
- Linux clusters are based on open-source rather than proprietary standards, so customers are not locked into a single vendor's solution.
- Total cost of ownership is much lower than that of traditional shared memory machines.
5. Linux Cluster For Real-Time Systems (2)
6. Linux Cluster For Real-Time Systems (3)
- Linux Cluster System from Linux Networx
- 18 Dual 0.8U AMD Opteron servers
- Desktop Model 240
- 2 GB DDR SDRAM
- 40.8/120 GB Hard Drive
- SuSE Linux Enterprise Server for AMD64
- Myrinet connection among compute nodes
- Gigabit Ethernet connection between the master and the compute nodes
7. Linux Cluster For Real-Time Systems (4)
- IceBox provides
- Serial Terminal Access
- Power Management
- Temperature Monitoring
- Full Integration with Clusterworx for System Monitoring and Cluster Management
- Clusterworx provides
- Total Cluster Management
- System Monitoring
- Version Control Management
- Ganglia
- Distributed monitoring system for high-performance computing systems such as clusters and Grids
8. Linux Cluster For Real-Time Systems (5)
- HP Linux Cluster System
- 16-node DL145 dual-core compute cluster with two DL385 dual-core head nodes
- 32 dual-core processors (64 cores) on compute nodes, and 4 dual-core processors (8 cores) on head nodes
- 2.2 GHz processor speed
- 32x2/4x2 GB Memory
- 16x80/4x72 GB Hard Drive
- SuSE Linux Enterprise Server for AMD64
- Gigabit Ethernet and Myrinet connection among all nodes
- Single system administration through XC System Software
9. High Performance Computing Storage System
- Large-scale scientific computation often requires significant computational power and involves large quantities of data.
- Inadequate I/O capability can severely degrade overall cluster performance.
- Distributed file systems such as NFS and AFS do not satisfy the requirements of today's high-performance computing environments.
- SANs (Storage Area Networks) are usually implemented with Fibre Channel storage and can be very costly.
- Parallel file systems provide
- Bandwidth on the order of GB/s for Linux clusters running applications that require fast I/O across dozens to thousands of distributed compute nodes.
- Scalable data serving through parallel data striping and scalable metadata
10. High Performance Computing Storage System (2)
- Parallel File Systems
- GPFS (General Parallel File System)
- Highly available UNIX-style file system for cluster systems, including Linux clusters and the RS/6000 SP
- PVFS (Parallel Virtual File System)
- PVFS is open source and released under the GNU General Public License
- Lustre (Linux and Clusters)
- Open-source software developed and maintained under the GNU General Public License
- Highly scalable
- Object-based storage architecture that scales to tens of thousands of clients and petabytes of data
- Designed to support transparent failover
11. High Performance Computing Storage System (3)
- Lustre
- Operating systems
- Red Hat Linux 7.1 and SuSE Linux 8
- Hardware platforms
- IA-32, IA-64, Alpha, and Opteron
- Networking
- Quadrics Elan, TCP, Myrinet GM, InfiniBand, and SCI
- Three of the top eight supercomputers run Linux
- Lustre runs on all three
12. High Performance Computing Storage System (4)
- HP StorageWorks Scalable File Share (HP SFS)
- 16 TB of usable storage
- 2 DL380 MDS/Admin nodes
- 4 DL380 OSS/IO nodes
- Aggregate bandwidth of 1064 Mbytes/s for READ and 570 Mbytes/s for WRITE
- Support for Gigabit Ethernet and Myrinet interconnect
- Runs the Lustre parallel file system
- Highly scalable
- Scalable capacity of up to 512 TB
- Scalable to over 35 GB/s aggregate bandwidth
13. AIRS Source Code Modifications
- Batch System -> Interactive System
- Changed airsb.F into a callable subroutine that can be invoked by the Data Management and Communication software and accepts input parameters.
- All input parameters are passed directly to the science code through our Data Management and Communication software; no pre-processing is needed to set up an execution environment.
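As a loose illustration of this change (the real interface lives in the modified airsb.F and is not shown on the slides), the sketch below shows how a C driver typically calls a Fortran subroutine by reference; the external symbol name, argument list, and paths are all hypothetical.

#include <stdio.h>

/* Hypothetical sketch only: the converted airsb subroutine as seen from the C
 * data management layer. Fortran compilers of this era commonly append a
 * trailing underscore, pass all arguments by reference, and append character
 * lengths as hidden trailing integers. The argument names are assumptions,
 * not the real interface. */
extern void airsb_(int *granule_id, char *l1b_dir, char *out_dir,
                   int l1b_dir_len, int out_dir_len);

int main(void)
{
    int  granule   = 51;             /* example granule number        */
    char l1b_dir[] = "/data/l1b";    /* hypothetical input directory  */
    char out_dir[] = "/data/retr";   /* hypothetical output directory */

    airsb_(&granule, l1b_dir, out_dir,
           (int)(sizeof(l1b_dir) - 1), (int)(sizeof(out_dir) - 1));

    printf("granule %d submitted to the science code\n", granule);
    return 0;
}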
14. AIRS Source Code Modifications (2)
Granule processing flow (pseudocode):

  Read namelist files
  do 1000 igroup = 1, Ngroup
    Load initial L1b files
      openl1_b AIRS airs_051.bin  (12150 (135x90) obs, 2378 chns)
      openl1_b AMSU amsu_051.bin  (1350 (45x30) obs, 15 chns)
      openl1_b HSB  hsb_051.bin   (12150 (135x90) obs, 5 chns)
      openl2_b AVN  avn_051.bin
    Load coefficient files
    do 2010 npro = 1, numobs_L1 (1350)
      Read radiance files (AIRS, AMSU, HSB)
        do iFOV = 1, numFOV_am (1)
          readl1_b AMSU (15 chns)
        enddo
        do iFOV = 1, numFOV_mh (9)
          readl1_b HSB (5 chns)
        enddo
        do iFOV = 1, numFOV_ir (9)
          readl1_b AIRS (2378 chns)
        enddo
      Read first guess (AVN) file
      Perform a retrieval
    enddo
    Close all granule files
  enddo
15. AIRS Source Code Modifications (3)
- Common data blocks are retrieved for the first granule and stored in memory for subsequent calls.
- Namelist variables are stored in memory for subsequent calls to the read_nl subroutine.
- Coefficient files are loaded at the first call, and coefficient variables are retrieved from memory on subsequent calls.
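A minimal sketch of this load-once pattern, written in C for illustration (the actual implementation is in the modified Fortran science code; the file name, table size, and function name below are assumptions):

#include <stdio.h>
#include <stdlib.h>

#define NCOEF 1024                /* assumed size of one coefficient table */

static double coef[NCOEF];        /* coefficients kept in memory across granules */
static int    coef_loaded = 0;    /* first-call flag */

/* Return the cached coefficient table, reading the file only on the first call;
 * every later granule reuses the in-memory copy instead of re-reading the file. */
const double *get_coefficients(const char *path)
{
    if (!coef_loaded) {
        FILE *fp = fopen(path, "rb");
        if (fp == NULL) {
            perror("fopen");
            exit(EXIT_FAILURE);
        }
        if (fread(coef, sizeof(double), NCOEF, fp) != NCOEF) {
            fprintf(stderr, "short read on %s\n", path);
            fclose(fp);
            exit(EXIT_FAILURE);
        }
        fclose(fp);
        coef_loaded = 1;
    }
    return coef;
}

int main(void)
{
    const double *c = get_coefficients("airs_coef.bin");  /* hypothetical file */
    printf("first coefficient: %f\n", c[0]);
    return 0;
}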
16. AIRS Source Code Modifications (4)
[Diagram: the namelist files (pro_v40.nl, io_v40.nl, temp_v40.nl, water_v40.nl, ozone_v40.nl, microw_v40.nl, clouds_v40.nl) are read once into memory as namelist variables and reused by data processing for Granule 1 through Granule N.]
17. AIRS Source Code Modifications (5): Coefficient Files
[Diagram: the coefficient files (AIRS transmittance coefficients, AMSU transmittance coefficients, HSB transmittance coefficients, and US standard atmospheric profiles) are loaded into memory once and reused by data processing for Granule 1 through Granule N.]
18. AIRS Source Code Modifications (6)
- Optimize CPU/IO burst cycles
- CPU utilization: keep the CPU as busy as possible.
- Added a flag and modified the source code to delay writing data to the output files until after the processing of each granule.
- Added a flag and modified the source code to read the AIRS, AMSU, HSB, and Aviation Forecast files for an entire granule prior to the processing of each granule.
19. AIRS Source Code Modifications (7)
- Introduced flags i_delay_ret, i_delay_fg, and i_delay_mit and modified the source code to delay the output of data to the .fg, .mit, and .ret files.
- Results are stored in memory and are written to the output files (.fg, .mit, .ret) after the entire granule has been processed.
- Introduced flags i_input_airs, i_input_amsu, i_input_hsb, and i_input_avn and modified the source code to read the AIRS, AMSU, HSB, and AVN files for an entire granule prior to the processing of each granule.
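The following is a rough C sketch of the buffer-then-flush behavior these flags enable; apart from the i_delay_ret flag name and the 1350 retrievals per granule, the record layout, file name, and placeholder retrieval are assumptions for illustration (the real changes are in the Fortran science code).

#include <stdio.h>
#include <stdlib.h>

#define NPROFILES 1350                 /* retrievals per granule (45 x 30 obs) */

typedef struct { double t_surf; double q_total; } retrieval_t;  /* toy record */

static retrieval_t ret_buf[NPROFILES]; /* in-memory results for one granule */
static int i_delay_ret = 1;            /* flag: 1 = delay writes to granule end */

static retrieval_t do_retrieval(int npro)
{
    retrieval_t r = { 273.15 + npro * 0.01, 25.0 };   /* placeholder physics */
    return r;
}

int main(void)
{
    FILE *fp = fopen("granule_051.ret", "wb");        /* hypothetical output */
    if (fp == NULL) { perror("fopen"); return EXIT_FAILURE; }

    for (int npro = 0; npro < NPROFILES; npro++) {
        retrieval_t r = do_retrieval(npro);
        if (i_delay_ret)
            ret_buf[npro] = r;                         /* buffer in memory */
        else
            fwrite(&r, sizeof(r), 1, fp);              /* per-profile write */
    }

    if (i_delay_ret)                                   /* one burst of output I/O */
        fwrite(ret_buf, sizeof(retrieval_t), NPROFILES, fp);

    fclose(fp);
    return 0;
}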
20. Data Management and Communication Software
- Provides an operational platform within which science algorithm pipelines can be deployed for data processing.
- Hides the complexities involved with an operational cluster environment.
- Separates the science algorithm layer from the data management and control layer.
21. Data Management and Communication Software (2)
- Implemented in the C programming language.
- Uses the Message Passing Interface (MPI) for communication (LAM-MPI implementation).
- Provides task scheduling through a master process.
- The master process divides a job into parallel tasks and assigns the tasks to worker processes.
- The master process spawns worker processes at startup.
- Each task assigned by the master has a unique task identifier.
- Currently each task contains one granule.
- Can be used with both Gigabit Ethernet and Myrinet interconnects.
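A minimal sketch of this master/worker scheduling scheme using MPI in C; it is a simplified skeleton under stated assumptions (one integer granule ID per task, 240 granules per day, a placeholder science call), not the actual Data Management and Communication software.

#include <mpi.h>
#include <stdio.h>

#define NGRANULES 240          /* one day of AIRS Level-1b data */
#define TAG_TASK  1
#define TAG_DONE  2
#define TAG_STOP  3

static void process_granule(int granule_id)
{
    /* placeholder for the call into the converted AIRS science subroutine */
    printf("worker processing granule %d\n", granule_id);
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                                   /* master process */
        int next = 0, active = 0, dummy = 0;
        MPI_Status st;
        /* hand one granule (task) to each worker, then keep feeding workers
         * as they report completion */
        for (int w = 1; w < size && next < NGRANULES; w++, next++, active++)
            MPI_Send(&next, 1, MPI_INT, w, TAG_TASK, MPI_COMM_WORLD);
        while (active > 0) {
            MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                     MPI_COMM_WORLD, &st);
            active--;
            if (next < NGRANULES) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_TASK,
                         MPI_COMM_WORLD);
                next++;
                active++;
            }
        }
        for (int w = 1; w < size; w++)                 /* tell workers to stop */
            MPI_Send(&dummy, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
    } else {                                           /* worker processes */
        int task;
        MPI_Status st;
        for (;;) {
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            process_granule(task);
            MPI_Send(&task, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}

In the actual system each task carries a unique task identifier and one task corresponds to one granule; the skeleton above only shows the scheduling flow between the master and the workers.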
22. Benchmarking Results
- Processed one day's worth of Level-1b AIRS data (240 granules) for May 4, 2005.
- Compared the GNU compilers (gcc, g95) against the Portland Group compiler.
- Data was accessed directly by worker processes through NFS.
- NFS does not provide adequate bandwidth for I/O.
- A parallel file system (Lustre, PVFS, GPFS) is necessary to provide adequate bandwidth and scalability.
- Results are presented for total time, speedup, and efficiency using 1 master process and a maximum of 30 worker processes.
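For reference, the speedup and efficiency figures on the following slides are assumed to use the standard definitions: Speedup(p) = T(1) / T(p), where T(p) is the total processing time with p worker processes, and Efficiency(p) = Speedup(p) / p.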
23. Benchmarking Results (2)
24. Benchmarking Results (3)
25. Benchmarking Results (4)
26. Benchmarking Results (5)
Read/Write flag on, 4 workers
Read/Write flag on, 28 workers
27. Benchmarking Results (6)
Read flag off / Write flag on, 24 workers
Read flag off / Write flag on, 12 workers
28. AIRS Retrieval Results
29. AIRS Retrieval Results (2)
30. Conclusions
- Provided an initial execution environment for processing Atmospheric Infrared Sounder (AIRS) data in a distributed environment, separating the science algorithm layer from the data management and control layer.
- Processed one day's worth of AIRS/AMSU data and showed that significant speedup can be achieved.
- NFS does not provide adequate bandwidth for I/O.
- Need to conduct benchmarking using the HP cluster and HP storage system.
31. Thank you
- Wed., 11/2/05, 1:00 PM - 1:30 PM
- The Design and Implementation of an Operational Data Telemetry Extraction and Analysis Toolkit