Title: Jordan C. Alpert/NCEP/EMC jordan.alpert_at_noaa.gov
1 EMC
NOAA Operational Model Archive and Distribution
System (NOMADS): Toward an Operational Service
Orientated Architecture
NCEP/Environmental Modeling Center
Jordan C. Alpert/NCEP/EMC jordan.alpert_at_noaa.gov
Glenn Rutledge/NCDC Jun Wang/SAIC
5th GO-ESSP Earth Science Portal Workshop, June
20, 2006 at Lawrence Livermore National
Laboratory
Where the nation's climate and weather services
begin
2 NOMADS Web Page
- Completely re-designed.
- Uses PHP
- Easier navigation
- Better documentation
- NOMADS is renamed NOAA National
3 What is service oriented architecture?
Vision
NOMADS is a distributed service oriented
architecture, a system-of-systems integration
based on using loosely coupled connections among
independent systems to create a scalable,
extensible, interoperable, reliable, and secure
framework.
reliable?
4 What is NOMADS? (cont.)
- A digital archive of NOAA's operational weather
models, and an innovative data access philosophy
to promote interoperable access across the
geosciences (BAMS, Rutledge et al., 2006).
- A pilot project at NCEP that advances integration
of real-time model data.
- NOMADS: an integrator of common web services
infrastructure to support the discovery, access,
and transport of data (NOAA GEO-IDE Document).
5 The NOAA Operational Model Archive and
Distribution System
NOMADS Goals
- provide distributed access to models and
associated data,
- promote model evaluation and product
development,
- foster research within the geoscience
communities (ocean, weather, and climate)
to study multiple earth systems using
collections of distributed data,
- develop institutional partnerships via
distributed open technologies.
6 Service Oriented Goals (DMIT)
Goals
- To take advantage of internet technology
opportunities.
- Improve efficiency and reduce costs by bridging
the barriers between existing, independent
stovepipe systems.
- Integrate the data management activities of NOAA
projects.
- Individual components retain responsibility and
authority within the context of a systematic set
of principles.
- To develop and adopt standards for metadata, data
discovery, and data transports, formats, and
protocols.
7 The operational (reliability) problem
To provide:
- Operational services, where security,
timeliness (time-critical delivery), and reliability
are paramount.
- Operational public access to data, products, and
information services.
- Scientific services, where efficient and flexible
discovery of and access to data sets is required.
- Commercial value-added services and user client
applications.
8 NOAA/NCEP Operational ftp Services for Model Data
- Two centers (ftpprd and the official operational
tgftp)
- Standard ftp servers
- Load balancing, failover, state-of-the-art
hardware, and 24/7 support.
- A little difficult to navigate (e.g., file
naming convention)
- Entire files (0.25 GB) must be downloaded when
only small portions may be necessary.
- Holdings not complete
(0.5 degree GFS, NAM hourly, Ensembles)
- Complete and fast ftp (partial ftp transfers),
a NOMADS user client application, for official
ftp data set holdings is under discussion.
The understatement of the year
10 What the user sees at the operational ftp site
Example of tgftp location for GFS model (first
entry) files from June 15:
ftp://tgftp.nws.noaa.gov/SL.us008001/ST.opnl/MT.gfs_CY.00/RD.20060615/PT.grid_DF.gr1
An excerpt from the list of 500 files; note name
and size:
File fh.0030_tl.press_gr.onedeg   26562 KB   06/15/2006
File fh.0030_tl.sflux             42981 KB   06/15/2006
Excerpt from the ftpprd inventory description:
Inventory of File gfs.t00z.pgrbanl
Model: GFS   Cycle: 00 UTC   Forecast: 0 HRS
Number of Records: 267   Grid Identification: 3
Number  Level/Layer  Parameter  Description
0001    1000 ISBL    HGT        1000 hPa isobaric level Geopotential height gpm
0002    975 ISBL     HGT        975 hPa isobaric level Geopotential height gpm
0003    950 ISBL     HGT        950 hPa isobaric level Geopotential height gpm
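The naming convention in the tgftp example can be unpacked mechanically. The sketch below assumes, from the one example on this slide, that every directory and file name is a series of key.value tokens joined by "_". The function name parse_tgftp_name is hypothetical, and the slide does not say what each key means, so the code only splits the tokens and does not interpret them.

```python
def parse_tgftp_name(path):
    """Split a tgftp-style path into {key: value} pairs.

    Assumption (inferred from the slide's example only): each path
    segment is made of key.value tokens separated by "_".
    """
    fields = {}
    for segment in path.strip("/").split("/"):
        for token in segment.split("_"):
            # Split on the first "." only, so values may contain dots.
            key, sep, value = token.partition(".")
            if sep:  # skip anything that is not key.value
                fields[key] = value
    return fields


example = ("SL.us008001/ST.opnl/MT.gfs_CY.00/RD.20060615/"
           "PT.grid_DF.gr1/fh.0030_tl.press_gr.onedeg")
for key, value in parse_tgftp_name(example).items():
    print(key, value)
```

A client could use the resulting pairs to filter the 500-file listing (for example, by the fh or tl token) before deciding what to download.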
11 Toward Operational services
- We know how to do the hardware part.
- load balance, fail over, 24/7, redundancy
- What about users' usage of the applications, e.g.,
DAP and GDS and others
- Operationalise data flow to NCDC from NCEP.
- GRIB2 with JPEG packing increases CPU resources
needed.
- Software to handle subsetting and GRIB2
- GDS, OPeNDAP, and other applications need GRIB2
software strategy.
- Implement with partial http transfers and index
files.
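The last bullet, partial http transfers driven by index files, can be illustrated with a minimal sketch. It assumes a hypothetical index that lists each record's starting byte offset and a label. The index layout, the offsets, and the function names are illustrative, not the format used on the NCEP servers. The only standard piece is the HTTP Range request header, which accepts a comma-separated list of byte ranges.

```python
def select_ranges(index, wanted):
    """Return (start, end) byte ranges for the wanted records.

    index  : list of (start_offset, label), sorted by offset
             (hypothetical layout, for illustration only)
    wanted : function(label) -> bool
    end is inclusive; None means "to the end of the file".
    """
    ranges = []
    for i, (offset, label) in enumerate(index):
        if not wanted(label):
            continue
        # A record ends one byte before the next record starts.
        end = index[i + 1][0] - 1 if i + 1 < len(index) else None
        if ranges and ranges[-1][1] is not None and ranges[-1][1] + 1 == offset:
            ranges[-1] = (ranges[-1][0], end)  # merge adjacent records
        else:
            ranges.append((offset, end))
    return ranges


def range_header(ranges):
    """Format the ranges as the value of an HTTP Range header."""
    parts = [f"{start}-{'' if end is None else end}" for start, end in ranges]
    return "bytes=" + ",".join(parts)


# Made-up offsets; only the HGT records are requested.
index = [(0, "HGT:1000 hPa"), (5000, "HGT:975 hPa"),
         (10000, "TMP:1000 hPa"), (15000, "HGT:950 hPa")]
ranges = select_ranges(index, lambda label: label.startswith("HGT"))
print(range_header(ranges))
```

The client downloads the small index first, then requests only the listed byte ranges, so a user who needs a few fields does not have to transfer the entire file.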
12 Highest month: 4.3 Tb with 3.8 million downloads
NCEP has 1-5 Tb/month
User Statistics
NCDC Only
BAMS Paper
13 Real Time (NCEP) nomad3 server
- GDS/OPeNDAP (DODS): typically 250,000
queries/day from the GDS log (already divided by 2).
- pdisp (Great Displays): 2,694/day.
- http access_log: 200,000 hits/day
- ftp2u (fast ftp index download): over 5/11 to 6/15,
8,623 repackaged GRIB downloads/day.
- Need better ways of evaluating use for good
health and for comparison of servers.
14 Real Time (NCEP) NOMADS Server
- Users query files that are not present in tight
loops.
- During the 12 hours 00Z-12Z on June 1, of the
371 different users whose queries appear in the
httpd server error_log, the top 7 users with the
most queries are listed:
queries  User
   1579  hawaii.edu
   1616  abo.wanadoo.fr
   1870  natpool.mwn.de
   3293  accuweather.com
   4039  saildocs.com
   7864  hmg.inpg.fr
 100465  labsolar.ufsc.br
A solution: check the user access rate in the log
file (error_log). If the rate criterion (6/minute) is
exceeded, place the user's IP address in
iptables to block access for that IP address for a
period of time (10 minutes). Placing users in
a penalty box when they are repeatedly
accessing the server in tight loops is the same
as users placing a sleep 600 command (wait 10
minutes) in their unix script for loop. This
represents a throttle that can be applied to
other situations, such as many GDS queries when
one is needed.
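The penalty-box rule can be sketched as a small simulation. This is a minimal illustration, not the script used on the NOMADS servers. It takes the two numbers from the slide (more than 6 queries per minute, 10 minute block) and assumes the log has already been reduced to (timestamp in seconds, IP address) pairs sorted by time. The function name find_blocks and the sliding 60 second window are assumptions. A real deployment would pass each resulting address to the firewall instead of returning a list.

```python
from collections import defaultdict, deque

RATE_LIMIT = 6   # queries allowed per window (6/minute, from the slide)
WINDOW = 60      # seconds in the rate window
PENALTY = 600    # seconds in the penalty box (10 minutes, from the slide)


def find_blocks(events):
    """Return [(ip, block_start, block_end)] for users over the limit.

    events: iterable of (timestamp_seconds, ip), sorted by timestamp.
    """
    recent = defaultdict(deque)  # ip -> timestamps inside the window
    blocked_until = {}           # ip -> time the penalty ends
    blocks = []
    for ts, ip in events:
        if blocked_until.get(ip, float("-inf")) > ts:
            continue  # in the penalty box: the request is dropped
        queue = recent[ip]
        queue.append(ts)
        # Discard timestamps that have fallen out of the window.
        while queue and queue[0] <= ts - WINDOW:
            queue.popleft()
        if len(queue) > RATE_LIMIT:
            blocked_until[ip] = ts + PENALTY
            blocks.append((ip, ts, ts + PENALTY))
            queue.clear()
    return blocks


# Made-up traffic: "a" queries every second, "b" every 20 seconds.
events = sorted([(t, "a") for t in range(10)] +
                [(t, "b") for t in range(0, 100, 20)])
print(find_blocks(events))
```

Because the counters are kept per address, the same throttle could be pointed at another log, such as the GDS query log, by changing only the step that extracts the (timestamp, IP) pairs.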
15 Multiple paths to format-independent data access
The NOMADS System Design
16 Search, Discovery, Access, and Analysis
- A metadata server, DIMES (Distributed Metadata
Server), integrated with GDS to form MIDAS
(Kafatos, Yang, Zhao)
- Metadata Integrated Data Analysis Server
- Content consistency between data server and
metadata server.
- Interactive metadata search and data access and
analysis.
- Potential for more effective and efficient
searching.
- Potential for better interfaces.
- More seamless transition from data search to data
access.
- Uses XML to represent, store, retrieve, and
interoperate metadata with minimum semantic
enforcement.
- Contains a metadata model, XML query engine, and
web-based prototype interface.
- Enables the application power of GrADS and GDS
server-side data analysis.
17 One can search by Time resolution (forecast
time), Spatial resolution, Search_Space
(region), Search_Time (Cycle), and Text (variable
and other Title text in the Metadata file).
19 After selection of the Time and Space Box, the
user can generate an sdfopen URL command
including functionality, e.g., time or space
averaging, for further processing, or generate
the DODS URL and run a GrADS client script and
interactive commands.
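A generated sdfopen command of this kind might be assembled as in the sketch below. The URL layout (an "_expr_" path followed by three brace groups for dataset, expression, and x, y, z, t bounds) is an assumption based on the GDS server-side analysis URL convention and should be checked against the server's own documentation. The host, dataset name, expression, and the function name gds_expr_url are all hypothetical.

```python
def gds_expr_url(base, dataset, expression, bounds):
    """Build a GDS-style analysis URL (assumed layout, see lead-in).

    base       : server root, e.g. "http://host:port/dods" (hypothetical)
    dataset    : dataset name on the server (hypothetical)
    expression : GrADS expression string supplied by the caller
    bounds     : ((x1, x2), (y1, y2), (z1, z2), (t1, t2))
    """
    domain = ",".join(f"{lo}:{hi}" for lo, hi in bounds)
    # Doubled braces in an f-string produce literal "{" and "}".
    return f"{base}/_expr_{{{dataset}}}{{{expression}}}{{{domain}}}"


url = gds_expr_url(
    "http://example.invalid:9090/dods",    # placeholder host
    "gfs",                                 # placeholder dataset
    "ave(hgt,t=1,t=4)",                    # illustrative time average
    ((0, 360), (-90, 90), (1, 1), (1, 1)),
)
print("sdfopen " + url)
```

The printed line is what a user would paste into a GrADS session. The averaging itself runs on the server, so only the result is transferred to the client.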
20 Future Plans
- Incorporate NOMADS functionality into CLASS
(Comprehensive Large Array Stewardship System)
- Enhance the NCEP real-time servers at NCDC and
place in NCDC operations.
- Operationalise the data flow from NCEP to NCDC.
- Existing operational NOAA servers can be enhanced
with NOMADS service orientated architecture and
administrators.
- Support working prototypes for proof of concept
and continued development toward improved
services.