Title: IPRC Data and Product Servers
 1 IPRC Data and Product Servers in the context of GODAE
http://apdrc.soest.hawaii.edu
Peter Hacker, Jim Potemra, Sharon DeCarlo
International Pacific Research Center (IPRC), University of Hawaii
CLIVAR/GODAE Meeting, ECMWF, 31 August - 1 September 2006, Reading, UK
 2 GODAE Plan
(APDRC)
 3 APDRC Data Server System
[Diagram: data flow between the user desktop, the APDRC server, and distributed data]
User desktop: application software (Matlab, IDL, Ferret, GrADS, Fortran, JOA, ncBrowse); web browser
APDRC server and storage system: LAS, EPIC; DODS/OPeNDAP catalog aggregation server; local data
Distributed data: remote data 1, remote data 2, ..., remote data N
Link labels: http, http/data, NFS
 4 GODAE Intercomparison
Compare distributed GODAE model outputs
Define region
Select time 
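As a rough illustration of the "define region, select time" step, the sketch below turns a latitude/longitude box and a time window into index ranges on a model grid. The grid, the dates, and the helper name index_range are made up for the example; they do not come from any of the GODAE products.

```python
from datetime import date

def index_range(coords, lo, hi):
    """Return (first, last) indices of the values in coords that lie in [lo, hi].

    Assumes coords is sorted in ascending order. Returns None if no value
    falls inside the requested interval.
    """
    inside = [i for i, c in enumerate(coords) if lo <= c <= hi]
    if not inside:
        return None
    return inside[0], inside[-1]

# Hypothetical 1-degree grid and monthly time axis, for illustration only.
lats = [float(v) for v in range(-10, 11)]      # -10 .. 10 degrees N
lons = [float(v) for v in range(120, 291)]     # 120 .. 290 degrees E
times = [date(2004, m, 15) for m in range(1, 13)]

# Define region (5S-5N, 190E-240E) and select time (March-August 2004).
lat_idx = index_range(lats, -5.0, 5.0)
lon_idx = index_range(lons, 190.0, 240.0)
time_idx = index_range(times, date(2004, 3, 1), date(2004, 8, 31))
```

The same index ranges can then be applied to every model's output, provided the outputs share a grid; models on different grids would each need their own coordinate arrays.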
 5 GODAE Intercomparison
JPL ECCO-adjoint
GFDL 
 7 http://apdrc.soest.hawaii.edu/synthesis_evaluation
The APDRC is archiving GODAE output for the intercomparison project.
At present, our role is to collect the output from the various
modeling centers and put it on our own server. Once a center has its
products ready, it notifies us by email and we retrieve the data from
its site via ftp. We do no format conversion and no subsetting. As of
today, we have about 7 GB of output from nine model groups (CORE,
ECCOa, ECCOb, ECCOd, ECMWF, INGV, MOVE, SODA and UKDP), as well as
some observations covering the same regions, times, and variables as
the model output.
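The collection step described above could be scripted roughly as follows. This is a minimal sketch, not the procedure we actually run: the function name fetch_new_files, the host, and the directory names are placeholders, and it assumes the remote directory holds only plain files.

```python
import os
from ftplib import FTP

def fetch_new_files(ftp, remote_dir, local_dir):
    """Download every file in remote_dir that is not yet in local_dir.

    `ftp` is any object offering the cwd/nlst/retrbinary methods of
    ftplib.FTP. Files are copied byte for byte: no format conversion and
    no subsetting. Returns the list of file names downloaded.
    """
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    fetched = []
    for name in sorted(ftp.nlst()):
        target = os.path.join(local_dir, name)
        if os.path.exists(target):
            continue  # already archived
        # Write to a temporary name first so that an interrupted transfer
        # is not mistaken for a complete file on the next run.
        partial = target + ".part"
        with open(partial, "wb") as out:
            ftp.retrbinary("RETR " + name, out.write)
        os.replace(partial, target)
        fetched.append(name)
    return fetched

# Example use (placeholder host and paths, not real endpoints):
# with FTP("ftp.modeling-center.example") as ftp:
#     ftp.login()  # anonymous login
#     fetch_new_files(ftp, "/pub/godae", "archive/CENTER")
```

Passing the connection in as an argument keeps the function independent of any one center's login details.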
 8 Benefits of having a central server
1. Modeling centers don't need to keep track of various products
   (e.g., N. Pacific subsets) of their raw model output; they can just
   keep the model output and let the collection center store and serve
   the "products".
2. Right now we are just serving the data via http/ftp, but it would be
   trivial to put these data sets on our OPeNDAP and/or LAS servers,
   and having all the data in one place makes intercomparisons much
   easier.
3. Individual centers may have to worry about firewalls or other issues
   that we have already been working with.
4. We can use these data sets as a starting point for developing a
   model-data portal, an activity suggested by SAC.
 9 Potential enhancements (given our participation) could include
1. Once we have the output in consistent formats, we can write and
   provide scripts, based on some approved metrics, to intercompare
   the output.
2. We can also provide a more sophisticated web service that allows
   searching, LAS displays, comparison to other APDRC data sets, etc.
3. We could use our authentication system to restrict access to
   certain people or products.
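An intercomparison script of the kind mentioned in item 1 might look like the sketch below. The approved metrics have not been defined, so mean bias and root-mean-square difference are used here purely as stand-ins, and the field values are made up.

```python
import math

def bias_and_rmsd(field_a, field_b):
    """Mean difference (a - b) and root-mean-square difference.

    Both fields are flat lists on the same grid. None marks a missing
    value (e.g., a land point); a point is used only if both fields have
    a value there.
    """
    if len(field_a) != len(field_b):
        raise ValueError("fields must be on the same grid")
    diffs = [a - b for a, b in zip(field_a, field_b)
             if a is not None and b is not None]
    if not diffs:
        raise ValueError("no overlapping valid points")
    bias = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmsd

# Made-up temperatures (deg C) at four grid points; None marks land.
model_a = [20.0, 21.0, None, 25.0]
model_b = [19.0, 23.0, 18.0, 25.0]
bias, rmsd = bias_and_rmsd(model_a, model_b)
```

A script like this only makes sense once the outputs share a format and grid, which is why consistent formats come first.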
 10 Important considerations
1. We try to be an operational center; however, we do encounter some
   downtime, network problems, etc. This may or may not be a problem
   for GODAE. Having a mirror site would help, but would also add
   complexity (i.e., making sure both sites hold the same data).
2. We will have to come up with a strategy for backing up the data. At
   present the volume is trivial, but it could easily get very large.
3. We do not have the personnel to extract model output or convert it
   to a specified format, so it will still be incumbent on the modeling
   centers to provide the data in the required format and subsets.
   And, since we are not (yet) involved in working with the data, there
   is no vested interest on our part to ensure the data quality or
   accuracy.
4. If the sites that are doing analysis are running OPeNDAP, and we can
   get scripts for subsetting the model fields, we could run these from
   our site.
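For item 4, a subsetting script run from our site could amount to building an OPeNDAP (DAP2) request in which each variable carries a [start:stride:stop] index range per dimension. The sketch below only constructs the URL. The server address, the variable name temp, and the dimension order (time, depth, lat, lon) are assumptions made for illustration.

```python
def hyperslab(var, ranges):
    """DAP2 projection for one variable: var[start:1:stop] per dimension.

    Each range is a (start, stop) pair of zero-based indices, with stop
    inclusive, as in the DAP2 constraint-expression syntax.
    """
    parts = []
    for start, stop in ranges:
        if start < 0 or stop < start:
            raise ValueError("bad index range: %r" % ((start, stop),))
        parts.append("[%d:1:%d]" % (start, stop))
    return var + "".join(parts)

def subset_url(dataset_url, selections, response="ascii"):
    """Build a request URL for a subset of one or more variables.

    `selections` maps variable name -> list of (start, stop) index pairs.
    `response` is the DAP2 response suffix (e.g., "ascii" or "dods").
    """
    projection = ",".join(hyperslab(v, r) for v, r in selections.items())
    return "%s.%s?%s" % (dataset_url, response, projection)

# Hypothetical dataset; dimension order assumed to be time, depth, lat, lon.
url = subset_url(
    "http://server.example/dods/model_run",
    {"temp": [(2, 7), (0, 0), (5, 15), (70, 120)]},
)
```

Some clients may require the square brackets to be percent-encoded before the request is sent.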
 11 What we have learned
It was much easier (for us) to have premade data sets made available.
We did not have to do any conversions, etc. I think it is better for
the modeling centers to do that, since they are more familiar with
their output. If we did want to do that, we would need a person. To
me, the biggest challenge is getting an approved list of the subset
(variables, lat/lon/time/depth ranges, etc.), followed by actually
stripping out the subset from the larger run output; serving and
archiving it afterwards is fairly easy.
Each model provider could have a DODS/OPeNDAP server. A (central)
GODAE Product Server could provide value-added capabilities and
products.