Title: Dvoy Related Ideas
1Dvoy Related Ideas
2Data Acquisition and Usage Value Chain
3Data Processing Value Chain
4Information Processing Value Chain (Taylor, 1975)
- Organizing: Grouping, Classifying, Formatting, Displaying
- Analyzing: Separating, Evaluating, Interpreting, Synthesizing
- Judging: Options, Quality, Advantages, Disadvantages
- Deciding: Matching goals, Compromising, Bargaining, Deciding
- Resistances to Move Data
  - Mechanical
  - Personal
  - Institutional
- Forces to Move Data
  - One-shot to reusable form
  - External force: contracts
  - Internal force: humanitarian, benefits
5DVOY (A Federated System for Finding, Exploring and Analyzing Environmental Data)
(Unified Access to 4-Dimensional Geo-Environmental Data through Web Services)
- Outline prepared by the Special Interest Group on Environmental Data Integration, March 2002
- Coordinated by CAPITA
- Supported by NSF, EPA and NOAA
6The Researcher's Challenge
"The researcher cannot get access to the data; if he can, he cannot read them; if he can read them, he does not know how good they are; and if he finds them good, he cannot merge them with other data."
Information Technology and the Conduct of Research: The User's View, National Academy Press, 1989
- These resistances can be overcome through
  - A catalog of distributed data resources for easy data discovery
  - Uniform data coding and formatting for easy access, transfer and merging
  - Rich and flexible metadata structure to encode the knowledge about data
  - Powerful shared tools to access, merge and analyze the data
7Data Catalog
- All the data in the system are to be distributed on the Web and maintained by their custodians
- The purpose of the catalog is to help in finding and accessing the data
- The catalog would be limited to data that can be accessed/merged in DVOY
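The catalog idea above can be illustrated with a minimal sketch. It is not the DVOY catalog itself: the class names, fields, dataset names and example.org URLs are all hypothetical, chosen only to show a catalog that holds descriptions and locations while the data stay with their custodians, and that lists only datasets that can be accessed/merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Description of one distributed dataset (hypothetical fields)."""
    name: str
    custodian: str           # organization that maintains the data
    url: str                 # where the custodian serves the data
    keywords: tuple = ()
    mergeable: bool = True   # can the system access/merge this dataset?


class DataCatalog:
    """Stores descriptions only; the data remain on the custodians' servers."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # The catalog is limited to data that can be accessed/merged.
        if not entry.mergeable:
            return False
        self._entries[entry.name] = entry
        return True

    def find(self, keyword):
        """Return the names of datasets matching a keyword, sorted."""
        kw = keyword.lower()
        return sorted(
            e.name for e in self._entries.values()
            if kw in e.name.lower() or any(kw == k.lower() for k in e.keywords)
        )

    def locate(self, name):
        """Return the custodian's URL for a named dataset."""
        return self._entries[name].url


# Made-up entries for illustration only.
catalog = DataCatalog()
catalog.register(CatalogEntry("surface_visibility", "Agency A",
                              "http://example.org/vis", ("visibility", "surface")))
catalog.register(CatalogEntry("pm25_monitoring", "Agency B",
                              "http://example.org/pm25", ("pm2.5", "aerosol")))
catalog.register(CatalogEntry("private_archive", "Lab C",
                              "http://example.org/private", ("visibility",),
                              mergeable=False))

print(catalog.find("visibility"))
print(catalog.locate("pm25_monitoring"))
```

Keeping only descriptions and URLs in the catalog reflects the first bullet: the custodians stay responsible for the data, and the catalog only helps users find and reach them.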
8Uniform Coding and Formatting of Distributed Data
- Data are now easily accessible through standard Internet protocols, but the coding and formatting of the data is very heterogeneous
- On the other hand, data sharing is most effective if the codes/formats/protocols are uniform (e.g. the Web formats and protocols)
- Re-coding and reformatting all the heterogeneous data into universal form on their respective servers is unrealistic
- An alternative is to enrich the heterogeneous data with uniform coding along the way from the provider to the user.
- A third-party proxy server can perform the necessary homogenization with the following benefits
  - The data user interfaces with a simple universal data query and delivery system (interface, formats..)
  - The data provider does not need to change its system and gets additional security protection, since the data are accessed by the proxy
  - Reduced data flow resistance results in increased overall data flow and data usage.
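The proxy-based homogenization described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the DVOY design: the two provider payloads, their field names, the `Observation` record and the `ProxyServer` class are all invented for the example. Each provider keeps its own format; a small wrapper per provider translates it into one uniform 4-dimensional record (latitude, longitude, elevation, time) on the way to the user.

```python
import csv
import io
from datetime import datetime, timezone
from typing import NamedTuple


class Observation(NamedTuple):
    """Uniform record delivered to the user (hypothetical layout)."""
    lat: float
    lon: float
    elev_m: float
    time: datetime
    value: float


# Two made-up provider payloads with different coding and formatting.
PROVIDER_A_CSV = (
    "site_lat,site_lon,elev_ft,obs_date,pm25\n"
    "38.63,-90.20,466,2002-03-01,12.5\n"
)
PROVIDER_B_ROWS = [{"y": 40.71, "x": -74.01, "z": 10.0, "t": 1014940800, "val": 9.0}]


def wrap_provider_a(payload):
    """CSV text, elevation in feet, ISO dates -> uniform records."""
    for row in csv.DictReader(io.StringIO(payload)):
        yield Observation(
            lat=float(row["site_lat"]),
            lon=float(row["site_lon"]),
            elev_m=float(row["elev_ft"]) * 0.3048,  # feet to meters
            time=datetime.strptime(row["obs_date"], "%Y-%m-%d").replace(tzinfo=timezone.utc),
            value=float(row["pm25"]),
        )


def wrap_provider_b(rows):
    """Dict rows, elevation in meters, Unix timestamps -> uniform records."""
    for r in rows:
        yield Observation(
            lat=r["y"],
            lon=r["x"],
            elev_m=r["z"],
            time=datetime.fromtimestamp(r["t"], tz=timezone.utc),
            value=r["val"],
        )


class ProxyServer:
    """Sits between providers and users; providers' systems are unchanged."""

    def __init__(self):
        self._sources = {}

    def add_source(self, name, fetch, wrapper):
        # fetch() returns the provider's native payload; wrapper homogenizes it.
        self._sources[name] = (fetch, wrapper)

    def query(self, name):
        fetch, wrapper = self._sources[name]
        return list(wrapper(fetch()))

    def merged(self):
        """All sources in the uniform form, ordered by time."""
        records = [o for name in self._sources for o in self.query(name)]
        return sorted(records, key=lambda o: o.time)


proxy = ProxyServer()
proxy.add_source("provider_a", lambda: PROVIDER_A_CSV, wrap_provider_a)
proxy.add_source("provider_b", lambda: PROVIDER_B_ROWS, wrap_provider_b)

for obs in proxy.merged():
    print(obs)
```

The user only ever calls `query` or `merged` and sees one record type, while each provider is touched only through its own wrapper, which is the point of doing the homogenization in a third-party proxy rather than on the providers' servers.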
9OGC Web Service Interoperability Program Goals
- Promote interoperability. The interaction between services should be completely platform and language independent, based on XML
- Enable just-in-time integration. The discovery of, access to and ad-hoc chaining of services should be possible dynamically at runtime.
- Reduce complexity. All components are services with published capabilities (incl. ConMan?); implementation is opaque.
- Support legacy systems. Enable interoperability by encapsulating existing components and exposing them as services.
- (Same as DVoy, isn't it? We could put it better ourselves! RBH)
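The goals above (published capabilities, opaque implementation, encapsulated legacy components, chaining at runtime) can be illustrated with a small in-process sketch. It does not use any OGC interface or XML messaging; the `Service` and `Registry` classes, the type labels and the `legacy_parse` function are hypothetical and stand in for real web services.

```python
class Service:
    """A component exposed as a service; its implementation stays opaque."""

    def __init__(self, name, accepts, produces, func):
        self.name = name
        self.accepts = accepts
        self.produces = produces
        self._func = func  # hidden behind invoke()

    def capabilities(self):
        """Published description that other parties can inspect."""
        return {"name": self.name, "accepts": self.accepts, "produces": self.produces}

    def invoke(self, data):
        return self._func(data)


class Registry:
    """Lets clients discover services and chain them at runtime."""

    def __init__(self):
        self._services = []

    def publish(self, service):
        self._services.append(service)

    def discover(self, accepts, produces):
        # Match on published capabilities only, never on implementation.
        for s in self._services:
            caps = s.capabilities()
            if caps["accepts"] == accepts and caps["produces"] == produces:
                return s
        return None

    def chain(self, *kinds):
        """Build a pipeline through the given sequence of data kinds."""
        steps = []
        for accepts, produces in zip(kinds, kinds[1:]):
            service = self.discover(accepts, produces)
            if service is None:
                raise LookupError(f"no service converts {accepts} to {produces}")
            steps.append(service)

        def pipeline(data):
            for step in steps:
                data = step.invoke(data)
            return data

        return pipeline


def legacy_parse(text):
    """Stand-in for an existing component, wrapped rather than rewritten."""
    return [float(x) for x in text.split(",")]


registry = Registry()
registry.publish(Service("parser", "text", "numbers", legacy_parse))
registry.publish(Service("averager", "numbers", "mean", lambda xs: sum(xs) / len(xs)))

# The chain is assembled at runtime from whatever has been published.
mean_of_text = registry.chain("text", "numbers", "mean")
print(mean_of_text("1, 2, 3"))  # prints 2.0
```

In a real deployment the capabilities would be published as documents and the calls would travel over the network; the sketch keeps everything in one process so the discovery and chaining logic is easy to follow.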
10Outline of an Open, Distributed Air Quality Data Integration and Analysis System
Notes prepared for a discussion with EPA NERL and OAQPS, December 1, 1998
11The Problem
- The researcher is not aware of the relevant data
- if he is aware, he cannot access them
- if he can access them, he cannot read them
- if he can, he does not know how good they are
- if they are good, he cannot merge them with other data
- and by the time he merges them, the data are outdated
- Based on Information Technology and the Conduct of Research: The User's View, National Academy Press, 1989
12AQ Data and Analysis Challenges and Opportunities
- Shift from primary to secondary pollutants. Ozone and PM2.5 travel 500 miles across state or international boundaries and their sources are not well established
- New regulatory approach. Compliance evaluation based on weight of evidence and tracking the effectiveness of controls
- Shift from command and control to participatory management. Inclusion of federal, state, local, industry, international stakeholders.
- Challenges
  - Broader user community. The information systems need to be extended to reach all the stakeholders (federal, state, local, industry, international)
  - A richer set of data and analysis. Establishing causality, weight of evidence, emissions tracking requires more data and air quality analysis
- Opportunities
  - Rich AQ data availability. Abundant high-grade routine and research monitoring data from EPA, NASA, NOAA and other agencies are now available.
  - New information technologies. DBMS, data exploration tools and web-based communication now allow cooperation (sharing) and coordination among diverse groups.
13Challenges
- Broader user community. The information systems need to be extended to reach all the stakeholders (federal, state, local, industry, international)
- A richer set of data and analysis. Establishing causality, weight of evidence, emissions tracking requires the analysis of air quality, meteorology, emissions and effects data.
Opportunities
- Rich AQ data availability. Abundant high-grade routine and research monitoring data from EPA and other agencies are now available.
- New information technologies. DBMS, data exploration tools and web-based communication now allow cooperation (sharing) and coordination among diverse groups.
14Recap: Harnessing the Winds
- Secondary pollutants along with a more open environmental management style are placing increasing demand on data analysis. Meanwhile, rich AQ data sets and the computer and communications technologies offer unique opportunities.
- It appears timely to consider the development of a web-based, open, distributed air quality data integration, analysis and dissemination system.
- The challenge is to learn how to harness the winds of change, as sailors have learned to use the winds for going from A to B
15Standard Data Support System
- Data management systems, DBMS
- Data processing and exploration tools
- Presentation tools
16Data Flow and Processing
17Infrastructure support for a distributed system
- Data sharing standards. A set of open standards for the sharing of AQ data, tools and reports. Examples: TCP/IP, HTML, XML, FGDC
- Data catalog. A virtual centralized catalog with search and retrieval facilities. Examples: GCMD, web indexes
- Web-based shared workspace. Place to share comments, feedback, plans, ...
18Benefits of a Distributed and Shared System
- Access to data. Users can get data, tools, reports out of the system for specific projects. It can be a forum for the exchange of ideas, peer feedback etc.
- Saving time and money. The data, tools and other resources in the system could leverage the dollars and time available for specific projects.
- Recycling data. Data are a costly resource. The system can help in managing, accessing and documenting one's own data, and sharing it with others for re-use.
19The Dvoy Project
- DVOY is a Federated Information System for heterogeneous, multidimensional datasets
- Voyager is a generic graphic browser for the federated DVOY data.
- The initial Dvoy infrastructure is being developed at CAPITA, with NSF support
- Further services for data access, processing and viewing are expected from the community
- The project evolution is to ride the 'web services' wave of the Internet
- CAPITA Support
  - NSF ITR Workgroup Collaboration Tool, Aug 2001 - Aug 2004
  - EPA Web-based Visibility, Aug 2001 - Apr 2003
  - NOAA ASOS Visibility, Sep 2001 - Sep 2002
  - MARAMA Chemical Trajectory Tool, Aug 2002 - July 2003
  - EPA OAQPS Global Transport Analysis, Nov 2002 - Oct 2003
  - NSF DigiGov Fire and Smoke Network, May 2003 - Apr 2006 (Pending)
  - NASA ESE Satellite Appl. to PM Management, June 2003 - May 2008 (Pending)
  - In-kind support by organizations participating in DVOY-based federated data sharing
- Collaborators: CIRA (Schichtel), NRL (Westphal), NASA (Goddard), many data sources.
20DATAFED Catalog Maintenance System