Title: Why so many data systems?
1 Why so many data systems?
2 Information as a Resource
3 The Transformational Effect of Networking
Networking has led to an unprecedented surge of productivity
Time Magazine, Person of the Year 2006: YOU
- Information has become the main driver of progress
- Time and place are no longer barriers to participation and interaction
- The Web has become a medium of participation (the Web 2.0 phenomenon)
- These are opportunities to enable Earth Science through more networking
- But many resistances to networking exist that need to be overcome
4 Networking Multiplies Value Creation
Enclosed Value-Creating Process - Stovepipe
[Diagram: one Application using one Data source]
1 User Stovepipe: Value = 1 (1 Data x 1 Program = 1)
5 Networking Multiplies Value Creation
[Diagram: one Data source used by five Applications, one of them the original Stovepipe]
1 User Stovepipe: Value = 1 (1 Data x 1 Program = 1)
5 Uses of Data: Value = 5 (1 Data x 5 Program = 5)
6 Networking Multiplies Value Creation
1 User Stovepipe: Value = 1 (1 Data x 1 Program = 1)
5 Uses of Data: Value = 5 (1 Data x 5 Program = 5)
Open Network: Value = 25 (5 Data x 5 Program = 25)
Merging data may create new, unexpected opportunities. Not all data are equally valuable to all programs.
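The arithmetic on these three slides can be sketched in a few lines. This is a minimal illustration that assumes, as the slides do, that every dataset-program pairing contributes one unit of value; the function name `network_value` is illustrative and not part of the original presentation.

```python
def network_value(n_datasets: int, n_programs: int) -> int:
    """Count dataset-program pairings, one unit of value each (slide assumption)."""
    return n_datasets * n_programs


stovepipe = network_value(1, 1)      # one dataset feeding one program
multi_use = network_value(1, 5)      # one dataset reused by five programs
open_network = network_value(5, 5)   # five datasets open to five programs

print(stovepipe, multi_use, open_network)  # prints: 1 5 25
```

Because not all data are equally valuable to all programs, the product is an idealized count rather than a measured value.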
7 The Network Effect: Less Cost, More Benefits through Data Multi-Use
[Diagram labels: Public sets up Orgs; Orgs develop Programs; Programs ask/get Data; Data repositories/Systems; Data Re-Use Network Effect]
Pay only once        Richer content
Less Prog. Cost      More Knowledge
Less Soc. Cost       More Soc. Benefit
Data are a costly resource and should be reused (recycled) for multiple applications. Data reuse lowers costs for programs and allows richer knowledge creation. Data reuse, like recycling, takes some effort: labeling, organizing, distributing.
8 Data are a costly resource and should be reused (recycled) for multiple applications. Data reuse lowers costs for programs and allows richer knowledge creation. Data reuse, like recycling, takes some effort: labeling, organizing, distributing.
9 Increasing the Size of the Pie
Single use: cost 1, benefit 1
Five uses: cost 1.5, benefit 5
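The figures on this slide can be turned into a benefit-to-cost comparison. The sketch below uses only the four numbers shown; the function name and the idea of reporting a ratio and a cost per use are illustrative assumptions, not something the slide states.

```python
def benefit_cost_ratio(benefit: float, cost: float) -> float:
    """Benefit obtained per unit of cost."""
    return benefit / cost


single_use = benefit_cost_ratio(benefit=1.0, cost=1.0)  # stovepipe case
five_uses = benefit_cost_ratio(benefit=5.0, cost=1.5)   # shared-data case
cost_per_use = 1.5 / 5                                   # cost spread over five uses

print(round(single_use, 2), round(five_uses, 2), round(cost_per_use, 2))
```

Under these numbers, total cost rises by half while the benefit rises fivefold, which is the sense in which the pie grows.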
10 Data Re-Use and Synergy
- Data producers maintain their own workspace and resources (data, reports, comments).
- Part of the resources are shared by creating common virtual resources.
- Web-based integration of the resources can be across several dimensions
- Spatial scale: local to global data sharing
- Data content: combination of data generated internally and externally
- The main benefits of sharing are data re-use, data complementing and synergy.
- The goal of the system is to have the benefits of sharing outweigh the costs.
11 Federated Information System
- Data producers maintain their own workspace and resources (data, reports, comments).
- However, part of the resources are shared through a Federated Information System.
- Web-based integration of the shared resources can be across several dimensions
- Data sharing federations
- Open GIS Consortium (GIS data layers)
- NASA SEEDS network (Satellite data)
- NSF Digital Government
- EPA's National Env. Info Exch. Network.
12 Federated Information System
- Data producers maintain their own workspace and resources (data, reports, comments).
- However, part of the resources are shared through a Federated Information System.
- Web-based integration of the shared resources can be across several dimensions
- Data sharing federations
- Open GIS Consortium (GIS data layers)
- NASA SEEDS network (Satellite data)
- NSF Digital Government
- EPA's National Env. Info Exch. Network.
[Diagram labels: Applications; Shared; Private; PM Policy; Regulation; RPO; Mitigation; RPO Federated Data System (Data, Tools, Methods); Other Federations; Unidata Portal; ESIP Portal; Portal; User communities]
Data are to be dispersed to multiple portals. This brings data closer to the user. Each portal can serve a different clientele. The condition is an open architecture, so that the resources can be reconfigured into many different views through the different portals.
13 Smoke Event
[Diagram labels]
Data: Chem, PM25, Vis, Mod, SatTOMS, SatGOES, SatModis
Organizations: EPA, States, NOAA, FAA, NASA, Public
Uses: NAAQS Exc. Events, AQ Warning, Travel Advisories, AQ Forecasting, Flight Advisories, Earth Obs
14 Stovepipe and Federated Usage Architectures Landscape
- Each project/program can be augmented by Federation data and services
15 Applicable to
- Model Validation
- Deliver Information to the Public
- Track Trends, Accountability
- GEOSS
16 Data Acquisition and Usage Activities
Need similar generic pic for analysis
17 Staged Data Integration? Staged portal
Virtual Int. Data
Oodle! CNet
The system integrates forward from the provider to the users, so that users can find and monitor content. Users can also navigate backwards toward the provider.
PoP harvester
18 Agile Information System: Data Access, Processing and Products
Data
Value Adding Processes
Organizing: Document, Structure/Format, Interfacing
19 Agile Information System: Data Access, Processing and Products
Data
Value Adding Processes
Organizing: Document, Structure/Format, Interfacing
Homogenizing: Format profile, Standard access, Data as Service
20 Agile Information System: Data Access, Processing and Products
Data
Value Adding Processes
Organizing: Document, Structure/Format, Interfacing
Homogenizing: Format profile, Standard access, Data as Service
Characterizing: Display/Browse, Compare/Fuse, Characterize
21 Agile Information System: Data Access, Processing and Products
Data
Value Adding Processes
Organizing: Document, Structure/Format, Interfacing
Homogenizing: Format profile, Standard access, Data as Service
Characterizing: Display/Browse, Compare/Fuse, Characterize
Analyzing: Filter/Integrate, Aggregate/Fuse, Custom Analysis
22 Value-Adding Processes
Information Value Chain
Uniform Access
Data Processing Web Service Chain
Products, Reports
[Diagram labels: Data; SciFlo; DataFed; Custom Processing; Other; Forecast; Compli.; Science]
Organizing: Document, Structure/Format, Interfacing
Homogenizing: Format profile, Standard access, Data as Service
Characterizing: Display/Browse, Compare/Fuse, Characterize
Analyzing: Filter/Integrate, Aggregate/Fuse, Custom Analysis
Reporting: Inclusiveness, Iterative/Agile, Dynamic Report
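The idea of a data processing service chain can be illustrated with ordinary functions standing in for web services. This is a minimal sketch under stated assumptions: the stage functions, the record layout and the field names `site`, `value` and `val` are all hypothetical, and real stages would be remote services rather than local calls.

```python
from functools import reduce


def homogenize(records):
    """Bring provider-specific field names to one common name ('value')."""
    return [{"site": r["site"], "value": r.get("value", r.get("val"))} for r in records]


def filter_valid(records):
    """Drop missing values and negative fill values."""
    return [r for r in records if r["value"] is not None and r["value"] >= 0]


def aggregate_mean(records):
    """Reduce the remaining records to a single mean value."""
    values = [r["value"] for r in records]
    if not values:
        return []
    return [{"site": "all", "value": sum(values) / len(values)}]


def chain(*stages):
    """Compose stages so the output of each one feeds the next."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


raw = [
    {"site": "A", "value": 10.0},
    {"site": "B", "val": 20.0},      # a provider that uses a different field name
    {"site": "C", "value": -999.0},  # a fill value that should be filtered out
]
pipeline = chain(homogenize, filter_valid, aggregate_mean)
result = pipeline(raw)
print(result)  # prints: [{'site': 'all', 'value': 15.0}]
```

Because each stage only depends on the common record shape, stages can be reordered, replaced or reused in other chains, which is the loose coupling the value chain relies on.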
23 Agile Information System: Data Access, Processing and Products
Control
Data
Control
Seeking Information
Negotiating Market Space
Data
Providing Information
24 System of Systems: Global Earth Observing System of Systems - GEOSS
- Characteristics of System of Systems (SoS)
- Autonomous constituents managed/operated independently
- Independent evolution of each constituent
- SoS displays emergent behavior
- Must recognize, manage, exploit the characteristics
- No stakeholder has complete SoS insight
- Central control is limited; distributed control is essential
- Users must be involved throughout the life of a SoS
25 Let's agree on: Space-Time-Parameter Data Access Query Protocol
26 Interoperability Stack: Key Concept of the Web
Connecting Machines and People
System components have to be interoperable at each layer
Amplify Individuals, Connect Minds
Service Orientation, Open Architecture, Data Standards
IP (Internet Protocol)
27 Loosely Coupled Data Access through Standard Protocols
[Diagram: Client (Front End) and Server (Back End), each behind a Std. Interface, exchange standard messages]
Standard Messaging:
GetCapabilities: "What data do you have?" Reply: Capabilities, Profile
GetData: "Give me this data" (Where? When? What? Which Format?) Reply: Data
Query      GetData          Standards
Where?     BBOX             OGC, ISO
When?      Time             OGC, ISO
What?      Temperature      CF
Format     netCDF, HDF..    CF, EOS, OGC
Standard Data Query Language: Where? When? What? (Space-time query - WMS, WCS)
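The Where/When/What/Format query can be sketched as a URL built from four parameters. This is an illustration only: the endpoint is a placeholder, and the parameter names (`request`, `what`, `bbox`, `time`, `format`) follow the slide's generic GetData wording rather than the exact parameter set of any OGC service.

```python
from urllib.parse import urlencode


def build_getdata_url(base_url, what, bbox, time_range, fmt):
    """Assemble a GetData-style query from the four questions on the slide."""
    params = {
        "request": "GetData",                     # the slide's generic name for the data call
        "what": what,                             # What?  e.g. a CF parameter name
        "bbox": ",".join(str(v) for v in bbox),   # Where? west,south,east,north
        "time": "/".join(time_range),             # When?  start/end
        "format": fmt,                            # Which format?
    }
    # Keep commas, slashes and colons readable instead of percent-encoding them.
    return base_url + "?" + urlencode(params, safe=",/:")


url = build_getdata_url(
    "https://example.org/datasvc",  # placeholder endpoint, not a real server
    what="temperature",
    bbox=(-125, 24, -66, 50),
    time_range=("2006-07-01", "2006-07-02"),
    fmt="netCDF",
)
print(url)
```

The client needs to know only the query vocabulary, not how the server stores its data, which is what keeps the two sides loosely coupled.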
28 Web Services and Workflow for Loose Coupling
Software Mashups
Workflow Software: Dynamic Linking
Software Mashup: Coarse-grain Linking
30 Collaborative Reporting and Dynamic Delivery
Co Writing - Wiki
Collaborative Analysis and Writing: Wiki, Blogs, Group Annotations
Dynamic Content Delivery: GoogleEarth, Screencasting
31 DataFed: 100 Datasets Non-intrusively Federated
Near Real Time Data Integration; Delayed Data Integration
Surface Air Quality
  AIRNOW: O3, PM25
  ASOS_STI: Visibility, 300 sites
  METAR: Visibility, 1200 sites
  VIEWS_OL: 40 Aerosol Parameters
Satellite
  MODIS_AOT: AOT, Idea Project
  GASP: Reflectance, AOT
  TOMS: Absorption Indx, Refl.
  SEAW_US: Reflectance, AOT
Model Output
  NAAPS: Dust, Smoke, Sulfate, AOT
  WRF: Sulfate
Fire Data
  HMS_Fire: Fire Pixels
  MODIS_Fire: Fire Pixels
Surface Meteorology
  RADAR: NEXTRAD
  SURF_MET: Temp, Dewp, Humidity
  SURF_WIND: Wind vectors
  ATAD: Trajectory, VIEWS locs.
- Data are accessed from autonomous, distributed providers
- DataFed wrappers provide uniform geo-time referencing
- Tools allow space/time overlay, comparisons and fusion
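The role of a wrapper that provides uniform geo-time referencing can be sketched as follows. This is not DataFed's actual code: the `Observation` type, both provider record layouts and the matching tolerance are hypothetical, chosen only to show two differently shaped sources being mapped onto one space-time form so that they can be overlaid.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """Common form: where, when, what, how much."""
    lat: float
    lon: float
    time: datetime
    parameter: str
    value: float


def wrap_surface_record(rec: dict) -> Observation:
    """Hypothetical surface provider: a dict with ISO time in UTC."""
    when = datetime.fromisoformat(rec["utc"]).replace(tzinfo=timezone.utc)
    return Observation(rec["latitude"], rec["longitude"], when, "PM25", rec["pm25"])


def wrap_satellite_record(rec: tuple) -> Observation:
    """Hypothetical satellite provider: (epoch seconds, lon, lat, AOT)."""
    epoch, lon, lat, aot = rec
    when = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return Observation(lat, lon, when, "AOT", aot)


def same_place_and_time(a: Observation, b: Observation, max_deg: float = 0.5) -> bool:
    """Overlay test: same time stamp and within max_deg in latitude and longitude."""
    return (a.time == b.time
            and abs(a.lat - b.lat) <= max_deg
            and abs(a.lon - b.lon) <= max_deg)


surface = wrap_surface_record(
    {"latitude": 38.6, "longitude": -90.2, "utc": "2006-07-01T12:00:00", "pm25": 21.5}
)
satellite = wrap_satellite_record((1151755200, -90.0, 38.5, 0.42))
print(same_place_and_time(surface, satellite))
```

The providers keep their own formats; only the thin wrapper functions know about them, which is what makes the federation non-intrusive.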
32 Sample of Federated Datasets
34 A Sample of Datasets Accessible through ESIP Mediation: Near Real Time ( day)
MODIS Reflectance
MODIS AOT
TOMS Index
MODIS Fire Pix
GOES AOT
GOES 1km Reflec
NEXTRAD Radar
NRL MODEL
NWS Surf Wind, Bext
- It has been demonstrated (project FASTNET) that
these and other datasets can be accessed,
repackaged and delivered by AIRNow through
Consoles
35 Summary: Grand Convergence. Will we make use of it?
- Third-party mediation can homogenize distributed ES data
- Agile SOA-based IS can deliver diverse info products to users
- Since 2005, one such IS, DataFed, has been used by EPA and in research
- However, more data need to be federated by the community
Parting thoughts: Think outside the stovepipe. Think networking. Divide and Conquer, NO! Connect and Enable, YES!
Thank you