Title: Development and Futures of Research Data Archives
1. Development and Futures of Research Data Archives
- Steven Worley
- Data Support Section, SCD
- 31 January 2006
2. Topics
- Data Access Points
- Reanalyses
- Output Resources
- Dataset Development at SCD
- TIGGE
3. Data Access Points for Research Data Archives (RDA)
MSS
- What: All RDA data and backups
- Access: Everyone with SCD computing account
RDA Server
- What: Much RDA data, all metadata (complete)
- Access: Public, traditional service (web, ftp, delayed mode)
CDP Server
- What: Some RDA data, complete RDA catalog; many other UCAR/NCAR catalogs and data
- Access: Public, interactive and interoperable services
4. Reanalyses - output resources
- Major Extant Reanalyses
- NCEP/NCAR, 1948-2005
- 28 sigma, 11 theta, 17 plvl, 2.5°, 1.875° Gaussian, 6hrly
- NCEP/DOE AMIP-II, 1979-2004
- 28 sigma, 17 plvl, 2.5°, 1.875° Gaussian, 6hrly
- NCEP North America Regional, 1979-2005
- Eta/NAM, NOAH, 45 sigma, 29 plvl, 32km, 3hrly
- ERA-40, 1957-2002
- 60 sigma, T159, 2.5°, 22 plvl, 15 theta, 125km Gaussian, 6hrly
- Focused efforts
5. NARR Domain
6. Reanalyses - output resources
- Future Reanalyses
- JRA-25 by JMA, 1979-2004
- 40L, T106
- 1979-2002 completed, negotiating to get HR copy
- MERRA by NASA, 1979 - current date
- 1º, hydrological cycle focus
- Low resolution testing - now, start July 2006
- ERA-Interim, 1989 - CDAS
- T255, 91L, 4DVAR, just beginning now
- NOAA ESRL 20th Century
- 1940-1948 in 2006, then 1900 - 1948
- Based on SLP only
7. Reanalyses - Dataset Development at SCD
- Feedback records from ERA-40
- AMS IIPS P2.12, Doug Schuster
- Observed data, QC metadata, variational analysis system (VAS) metadata
- E.g. observation - model field departures
- Uses
- Observations with added QC
- Deeper understanding of reanalysis by examining departures
- What's been done
- Convert BUFR to ASCII and separate surface and UA archives
- All data online, software, documentation, web/ftp download
- Temporal and spatial subset extraction by web interface request
8. Reanalyses - Dataset Development at SCD
9. Reanalyses - Dataset Development at SCD
- Note: in GRIB, T85 4 TB, T106 6 TB
- T85 Mlvl vertical coordinate error in netCDF
- T85 6-hourly 23 plvl has subsetting interface on RDA Server
- T85 Monthly 23 plvl, surface and single level has subsetting and files on CDP
10. Reanalyses - Dataset Development at SCD
- ERA-40 Services from RDA Server
- 24 ERA-40 standard products
- (check mark: online, RDA web and FTP file-level access)
- Model Resolution Daily (4x) Analysis Fields
- Model Resolution Daily (4x) Forecast Fields
- Model Resolution Monthly Mean Fields
- Model Resolution Monthly Mean Forecast Fields
- 2.5 degree Daily (4x) Analysis Fields
- 2.5 degree Monthly Mean Fields
- Ocean Wave Fields (analyses and forecasts)
11. Reanalyses - Dataset Development at SCD
- ERA-40 Services from RDA Server
- Subsetting Service (variable, level, and time)
- By email or web form request (delayed mode)
- Pressure level, surface and single levels
- 6-hourly, 2.5 degree
- 6-hourly, model resolution
- By automated processing
- 6-hourly, T85, 23 pressure levels
- Metadatabase to get HP access to GRIB files
- Output formats: GRIB or CF-conforming netCDF
- Spatial constraints to be added in the future
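The subsetting service above selects fields by variable, level, and time. A minimal Python sketch of that kind of selection follows, assuming a hypothetical in-memory index of field records. The `FieldRecord` type, its attribute names, and `select_fields` are illustrative only and are not part of the RDA software. No spatial constraint is included, matching the current service.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Set


@dataclass(frozen=True)
class FieldRecord:
    """One archived field (hypothetical index entry, not an RDA structure)."""
    variable: str         # short variable name, e.g. "T"
    level_hpa: int        # pressure level in hPa
    valid_time: datetime  # analysis time


def select_fields(
    records: Iterable[FieldRecord],
    variables: Set[str],
    levels_hpa: Set[int],
    start: datetime,
    end: datetime,
) -> List[FieldRecord]:
    """Keep records matching the requested variables, levels, and inclusive time range."""
    return [
        r for r in records
        if r.variable in variables
        and r.level_hpa in levels_hpa
        and start <= r.valid_time <= end
    ]


# Illustrative 6-hourly index entries (made-up sample, not real archive contents).
index = [
    FieldRecord("T", 500, datetime(2001, 1, 1, 0)),
    FieldRecord("T", 500, datetime(2001, 1, 1, 6)),
    FieldRecord("T", 850, datetime(2001, 1, 1, 0)),
    FieldRecord("U", 500, datetime(2001, 1, 1, 0)),
    FieldRecord("T", 500, datetime(2001, 1, 2, 0)),
]

selected = select_fields(
    index,
    variables={"T"},
    levels_hpa={500},
    start=datetime(2001, 1, 1, 0),
    end=datetime(2001, 1, 1, 18),
)
print(len(selected))  # 2
```

Filtering an index of records before touching the data files is one plausible way to keep requests cheap; the actual service may work differently.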
12. Reanalyses - Dataset Development at SCD
- ERA-40 Services from CDP Server
- Motivation: Provide an additional data access channel by exploiting netCDF formatted data in an advanced portal environment
- Data Product: Monthly, 2.5 degree, 23 pressure level, surface and single level
- Features
- Real-time .nc file download
- Variable selection (multiple OK)
- Spatial selection
- Time selection
- Level selection
- Output: netCDF
13. Improving observational data resources
- Objective
- Compile and provide access to the most complete and highest quality data collections.
- What do we seek?
- Historical data not yet in the archives
- Current day data streams to extend the archives
- How?
- Ask questions, investigate leads, request copies, and encourage free and open data sharing using the RDA resources as exchange collateral
- Quality check the new data received, always preserve the original archive
- Compare with extant archives to gauge potential improvement
- Why?
- Better data assets to support direct atmospheric research and future reanalyses
14. Improving observational data resources, Examples
- 29K profiles
- Stewardship: needs ship track check, integrated with other UA data
- Little used: not NNR, yes ERA-40
15. Enhancements relative to GTS data receipts
- Increased coverage
- 10-20 years more data
- New data in remote areas
16. Improving observational data resources, Examples
- ICOADS, global surface marine observations
- New Release, 2.2, coverage 1794-2004
- New Publication Available (Worley et al., 2005, IJC)
- Collaborating partners: NOAA ESRL and NOAA NCDC
- Access
- Full data set online
- User subset request interface
- Select time, space, variables, and filtering
17. TIGGE (THORPEX Interactive Grand Global Ensemble)
THORPEX (WMO World Weather Research Program, THe Observing system Research and Predictability Experiment)
- Objective: Create a database of ensemble forecasts from world-wide NWP centers
- NWP Participants
- NCEP, CMC, BMRC, ECMWF, JMA, CMA, CPTEC, UKMO and KMA
- TIGGE Archive Centers
- NCAR (SCD), ECMWF and CMA
- Project Features
- GRIB2 data format
- At NCAR, user access through CDP and the MSS
- Receive HR native grids
- Up to 200 GB/day
18. TIGGE - Status
- Very early stages
- NCAR and ECMWF hope to initiate operational exchange March-April 2006
- ECMWF/CPTEC/NCAR/Unidata tuned TIGGE IDD/LDM, 100 GB in 12 hours
- Optimized the TCP/IP window size
- Enlarged the LDM receive and send buffer size
[Figure: TIGGE IDD/LDM transfer rates (MB/hr), 17-19 January 2006. Red: from NCAR; blue: from ECMWF; green: from CPTEC. Before tuning, 3 GB/hr; after tuning, 8 GB/hr.]
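As a rough consistency check on the transfer figures quoted for TIGGE, the short Python sketch below does the arithmetic only. It assumes sustained rates, and it assumes the "up to 200 GB/day" volume would travel over a single link at the quoted rate, which the slides do not state. The function names are illustrative.

```python
def rate_gb_per_hr(volume_gb: float, hours: float) -> float:
    """Average transfer rate for a given volume and elapsed time."""
    return volume_gb / hours


def hours_to_transfer(volume_gb: float, rate: float) -> float:
    """Elapsed hours to move a volume at a sustained rate in GB/hr."""
    return volume_gb / rate


# Tuned test: 100 GB in 12 hours.
tuned_test_rate = rate_gb_per_hr(100, 12)    # about 8.33 GB/hr

# Rate needed to keep up with 200 GB arriving every 24 hours.
required_rate = rate_gb_per_hr(200, 24)      # about 8.33 GB/hr

# Time to move one day's 200 GB at the quoted before/after rates.
hours_before_tuning = hours_to_transfer(200, 3)  # about 66.7 hours
hours_after_tuning = hours_to_transfer(200, 8)   # 25.0 hours

print(round(tuned_test_rate, 2), round(required_rate, 2))
print(round(hours_before_tuning, 1), hours_after_tuning)
```

Under these assumptions the tuned rate is close to what a 200 GB/day stream would need, while the untuned 3 GB/hr rate would fall well behind.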
19. Wrap Up
- We have expanded data access and will expand it further
- Especially on RDA and CDP Servers
- Strongly linked - no need to know our infrastructure
- Reanalyses
- Continue to develop what we have
- More on the horizon
- Big on data curation and stewardship
- TIGGE - new resource for those interested in ensemble weather forecasting
20. Thanks
- RDA Server
- http://dss.ucar.edu/
- CDP Server
- https://cdp.ucar.edu/