Title: The EDEN System
1The EDEN System
- Jerry Fowler
- MCC
- Austin, Texas
2Environmental Data Exchange Network (EDEN)
3Outline
- EDEN Project Overview
- InfoSleuth in a microsecond
- The Ontology in InfoSleuth
- Value Mapping and the Environmental Data Registry
- Further thoughts and work
Live Demonstration
4Outline
- EDEN Project Overview
5Environmental Data Exchange Network
- The challenge
- Acquisition, use and dissemination of environmental information is of increasing strategic importance to EPA, DOD, DOE, and EEA
- EDEN is an application of MCC's InfoSleuth technology
- Employs intelligent agent technology through the Internet to conduct concept-based searches of heterogeneous, distributed information
- The EDEN Project demonstrates how organizations can save time and money
- Provides easy access over intranet or the Internet
- Enables users to access information from multiple sources
- Simplifies the exchange and sharing of data
- Reduces the reporting burden
- Brings information together for presentation and analysis
6Sponsors' Common Requirements
- Reduce the reporting burden imposed by the parties on each other
- Share best available and most timely information
- Enable users to access information from multiple sources
- Coordinate only the common vocabulary, not the end use of information resources: focus on the inputs, with each participant individually interpreting and communicating outputs
7EDEN Access to Distributed Databases
- Geographically distributed data resources
- Differing database software
- Differing logical schemas
- Not always available
8Outline
- InfoSleuth in a microsecond
9InfoSleuth
- Consortial project in the use of agent software for distributed information management
- Commercial sponsors
- General Dynamics Information Systems
- Rafael
- Raytheon
- SAIC
- Schlumberger
- Texas Instruments
- TRW
10InfoSleuth
- System of competent agents for dynamic, scalable (SQL-based) access to heterogeneous distributed information sources
- Ontology-based information management
- Advertise-discover paradigm supported by brokering over semantic constraints
11InfoSleuth System
- Java-based agents
- Knowledge Query and Manipulation Language (KQML) message layer provides speech-act agent interface
- Agent conversation shell provides structure for KQML messages
- Open Knowledge Base Connectivity (OKBC) language provides semantic communication layer
- Brokering reasoning provided by Logical Data Language, LDL (going away)
- XML in the future
12Main InfoSleuth Agents
- Broker
- Matches agents based on semantic constraints
- Resource Agents
- Translate between application domain ontology and database schemata
- Multi-Resource Query Agent
- Supports query decomposition and result recomposition
- Value Mapper
- Translates to/from canonical value domains
- Portal Agent
- Provides user context and interface
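The Broker's role (matching agents on semantic constraints) can be illustrated with a small sketch. This is not InfoSleuth code: the `Advertisement` structure, the agent names, and the set-overlap matching rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Advertisement:
    """What an agent tells the broker about itself (hypothetical structure)."""
    agent: str
    ontology_class: str
    # Slot values the agent can serve, e.g. {"state": {"TX", "LA"}}.
    # A slot that is not listed is treated as unconstrained.
    constraints: dict = field(default_factory=dict)


class Broker:
    def __init__(self):
        self._ads = []

    def advertise(self, ad):
        self._ads.append(ad)

    def discover(self, ontology_class, wanted):
        """Return agents whose advertised constraints overlap the request."""
        matches = []
        for ad in self._ads:
            if ad.ontology_class != ontology_class:
                continue
            compatible = all(
                slot not in ad.constraints or ad.constraints[slot] & values
                for slot, values in wanted.items()
            )
            if compatible:
                matches.append(ad.agent)
        return matches


broker = Broker()
# Agent names below are invented for the example.
broker.advertise(Advertisement("cerclis_agent", "Site_Contamination", {"state": {"TX", "LA"}}))
broker.advertise(Advertisement("eea_agent", "Site_Contamination", {"state": {"DK"}}))
print(broker.discover("Site_Contamination", {"state": {"TX"}}))
```

Only agents whose advertised state values intersect the requested ones are returned; a real broker would reason over richer constraints than set overlap.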
13More about InfoSleuth Agents
- JDBC Resource agents translate between application domain ontology and database schemata
- Multi-resource query agent uses either Oracle or native Java to support query decomposition and result synthesis
- Value mapper translates to/from canonical value domains using EDR
- Text agent supports ontology-based query
- Control agent manages CLIPS rule base for task planning and execution
- Sentinel and Deviation Detection agents cooperate to detect complex event patterns
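A much-simplified sketch of what result synthesis across several resources might look like. It only fans one filter out to every resource and merges the rows; real query decomposition is more involved. The function names and sample rows are invented, and tagging each row with `source_db` simply follows the ontology attribute of that name.

```python
def query_all(resources, predicate):
    """Fan a filter out to every resource and merge the matching rows."""
    merged, unavailable = [], []
    for name, fetch in resources.items():
        try:
            rows = fetch()
        except ConnectionError:
            # Resources are not always available; skip and report them.
            unavailable.append(name)
            continue
        merged.extend({**row, "source_db": name} for row in rows if predicate(row))
    return merged, unavailable


# Stand-ins for resource agents (hypothetical data).
def cerclis_rows():
    return [
        {"site_id": "A1", "medium": "Soil"},
        {"site_id": "A2", "medium": "Groundwater"},
    ]


def other_rows():
    return [{"site_id": "B7", "medium": "Soil"}]


def offline_rows():
    raise ConnectionError("resource agent not reachable")


resources = {"cerclis": cerclis_rows, "other": other_rows, "offline": offline_rows}
rows, missing = query_all(resources, lambda r: r["medium"] == "Soil")
print(rows)
print(missing)
```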
14Basic InfoSleuth Application Recipe
- 6 cups ontology
- 3 cups resource agent configuration
- 1-3 cups user interface development
- Lightly brown the multi-resource query agent
- Pour in other agents out of the box
- Stir and Serve...
- add or remove resource agents as desired
- add other functionality with more configuration
effort
15Outline
- The Ontology in InfoSleuth
16Purpose of The Ontology in InfoSleuth
- To describe the domain with minimal ambiguity
- the structure defines the domain
- documentation strings
- To be the integration hub for DB schemas
- query relaxation through the taxonomy
- vertical fragmentation
- multi-resource path expressions
- To define the preferred value domains of attributes used in communications between agents
- Value mapping may be necessary for translation of queries and results
17Expressing the Ontology
- OKBC (Open Knowledge Base Connectivity): a standard for Knowledge Representation
- Classes, Slots, Facets
- (class Site_Contamination)
- (template-slot-of averaging_method Site_Contamination)
- (template-facet-value VALUE-TYPE averaging_method Site_Contamination STRING)
- (template-slot-of site Site_Contamination)
- (template-facet-value VALUE-TYPE site Site_Contamination Eden_Site)
- Subclass Linkage
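To make the class/slot/facet structure concrete, the assertions above can be mirrored in a plain nested dictionary. This is a sketch only: the helper functions below are named after the OKBC-style assertions on the slide but are not the OKBC API.

```python
# ontology[class][slot][facet] = value
ontology = {}


def define_class(name):
    ontology.setdefault(name, {})


def template_slot_of(slot, cls):
    ontology[cls].setdefault(slot, {})


def template_facet_value(facet, slot, cls, value):
    ontology[cls][slot][facet] = value


# The same assertions as on the slide, in the same argument order.
define_class("Site_Contamination")
template_slot_of("averaging_method", "Site_Contamination")
template_facet_value("VALUE-TYPE", "averaging_method", "Site_Contamination", "STRING")
template_slot_of("site", "Site_Contamination")
template_facet_value("VALUE-TYPE", "site", "Site_Contamination", "Eden_Site")

print(ontology["Site_Contamination"])
```

A slot whose VALUE-TYPE is another class (here `Eden_Site`) is what links classes together.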
18Ontology Definitions
19Ontological Concept
Site contamination has attributes: average_concentration_unit, site_id, source_db, recording_date, contaminant, average_concentration, medium, averaging_method
20Concept mapping (CERCLIS3)
SELECT (each expression maps to the ontology attribute shown after the arrow)
- ref_media.rmedia_desc -> medium
- constituent_contaminant.cc_avg_conc_value_nmbr -> average_concentration
- ref_concentration_units.rconc_units_desc -> average_concentration_unit
- ref_hazardous_substance.rhs_nmbr -> contaminant
- constituent_contaminant.last_updated_date -> recording_date
- site.site_epa_id -> site_id
- 'Reported in CERCLIS3' -> averaging_method
- 'cerclis' -> source_db
FROM
- site, ...
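A resource agent's concept mapping can be thought of as a table from ontology attributes to database expressions. The sketch below builds a SELECT list from such a table. It assumes the CERCLIS3 expressions pair with the ontology attributes in the order they are listed on the slide, and the `select_list` helper is invented for illustration. The FROM clause is left out because the slide elides it.

```python
# (ontology attribute, CERCLIS3 expression), paired by position on the slide.
CERCLIS3_MAPPING = [
    ("medium", "ref_media.rmedia_desc"),
    ("average_concentration", "constituent_contaminant.cc_avg_conc_value_nmbr"),
    ("average_concentration_unit", "ref_concentration_units.rconc_units_desc"),
    ("contaminant", "ref_hazardous_substance.rhs_nmbr"),
    ("recording_date", "constituent_contaminant.last_updated_date"),
    ("site_id", "site.site_epa_id"),
    ("averaging_method", "'Reported in CERCLIS3'"),
    ("source_db", "'cerclis'"),
]


def select_list(mapping, wanted_attributes):
    """Render the SELECT list for the requested ontology attributes."""
    expressions = dict(mapping)
    return ", ".join(f"{expressions[attr]} AS {attr}" for attr in wanted_attributes)


print(select_list(CERCLIS3_MAPPING, ["site_id", "contaminant", "source_db"]))
```

Constant expressions such as `'cerclis'` let a resource supply attributes that the underlying database does not store.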
21Outline
Live Demonstration
22Sample Query
23Sample Query Results
24Locations of the Agents
25Outline
- Value Mapping and the Environmental Data Registry
26Value mapping requirements
- Translate terms in queries
- Allow users to choose a coding scheme for querying
- Query each database in terms of its own coding scheme
- Translate results of queries
- Facilitate merging of data from different sources
- Display results according to user preference
27The Value Mapping Model
28Value mapping and the ontology
- A class has one or more slots
- Each slot has a conceptual domain
- Each slot has a preferred value domain
- Resource Agents must advertise in the preferred value domain
- possibly translating to/from a different value domain
- Users may query and view data in a different value domain
- User Agent handles translation to/from preferred value domain
29Value Mapping Capability
Query translation and interpretation
SELECT * FROM site WHERE state = 'TX'
Translated to
SELECT * FROM site WHERE state = 'Texas'
Results translation
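The two directions of value mapping can be sketched as a pair of dictionary lookups: one applied to query terms before they reach a resource, the other applied to results on the way back. The mapping contents and function names here are assumptions for illustration; in EDEN the mappings come from the EDR.

```python
# User coding scheme -> resource coding scheme (illustrative entries only).
STATE_CODE_TO_NAME = {"TX": "Texas", "LA": "Louisiana"}


def translate_term(value, mapping):
    """Translate a query term; unknown values pass through unchanged."""
    return mapping.get(value, value)


def translate_results(rows, column, mapping):
    """Translate one column of the result rows back to the user's scheme."""
    reverse = {resource: user for user, resource in mapping.items()}
    return [{**row, column: reverse.get(row[column], row[column])} for row in rows]


# Query side: the user asks for 'TX', the resource stores 'Texas'.
# Only the parameter value is translated; the SQL text uses a placeholder.
sql = "SELECT * FROM site WHERE state = ?"
params = (translate_term("TX", STATE_CODE_TO_NAME),)

# Result side: rows come back in the resource's scheme.
rows = [{"site_id": "A1", "state": "Texas"}]
print(sql, params)
print(translate_results(rows, "state", STATE_CODE_TO_NAME))
```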
30Query Processing with Value Mapping
31EDR Structure
A specialized resource agent (map agent)
accesses a view of the EDR
32Linking EDR to EDEN ontology
33View of the EDR
CREATE VIEW edr_map (conceptual_domain, cd_id, value_domain, vd_id, preferred_domain, pd_id) AS
SELECT emc.conceptual_domain, emc.value_domain, pref.pv_nm, act.pv_nm
FROM edr_map_class emc, cd_vm_assoc a, permissible_value pref, permissible_value act
WHERE a.cd_id = emc.cd_id
AND a.vm_id = act.vm_id
AND a.vm_id = pref.vm_id
AND emc.vd_id = act.vd_id
AND emc.pd_id = pref.vd_id
34EDR lookup
SELECT preferred_value FROM edr_map
WHERE actual_value = 'Benzene'
AND coding_scheme = 'chemical_name'
AND conceptual_domain = 'chemical_substance'
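The lookup above can be tried against a toy stand-in for the EDR view. The sketch uses an in-memory SQLite table with the four columns the lookup references. The rows are invented, and treating the CAS number as the preferred value for a chemical name is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edr_map ("
    "conceptual_domain TEXT, coding_scheme TEXT, "
    "actual_value TEXT, preferred_value TEXT)"
)
# Illustrative rows only; 71-43-2 and 108-88-3 are the CAS numbers
# of benzene and toluene.
conn.executemany(
    "INSERT INTO edr_map VALUES (?, ?, ?, ?)",
    [
        ("chemical_substance", "chemical_name", "Benzene", "71-43-2"),
        ("chemical_substance", "chemical_name", "Toluene", "108-88-3"),
    ],
)


def lookup_preferred(conn, conceptual_domain, coding_scheme, actual_value):
    """Return the preferred value, or None when the EDR has no match."""
    row = conn.execute(
        "SELECT preferred_value FROM edr_map "
        "WHERE actual_value = ? AND coding_scheme = ? AND conceptual_domain = ?",
        (actual_value, coding_scheme, conceptual_domain),
    ).fetchone()
    return row[0] if row else None


print(lookup_preferred(conn, "chemical_substance", "chemical_name", "Benzene"))
```

Returning `None` for a missing value makes the "no match in EDR" case explicit to the caller.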
35Modifications to EDR
- Downloaded files of permissible values for CAS number and Chemical name (Merck index) from EPA site
- Assigned value meanings
- Created value domains for CAS code, CAS padded, ycode; loaded permissible values
- Added 3 extra chemical names because Merck index file was incomplete
- Built data-driven value-map for environmental media
- De-normalized data for faster retrieval
36Value Mapping Enhancements
- Functional maps
- e.g., case sensitivity (ST LOUIS vs. St Louis)
- One-to-many maps
- e.g., Environmental media mapped
- Soil vs. Topsoil, Subsoil, Soil - unspecified, SO, S
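Both enhancements can be sketched briefly. A functional map is computed rather than looked up (here, case and whitespace folding), while a one-to-many map expands one preferred value into every resource value it covers. The function names are invented; the soil values are the ones listed on the slide.

```python
def fold(value):
    """Functional map: ignore case and repeated whitespace."""
    return " ".join(value.split()).casefold()


# One preferred value covers several resource-level values.
ONE_TO_MANY = {
    "Soil": ["Topsoil", "Subsoil", "Soil - unspecified", "SO", "S"],
}

# Reverse index, built once, keyed by the folded resource value.
TO_PREFERRED = {
    fold(actual): preferred
    for preferred, actuals in ONE_TO_MANY.items()
    for actual in actuals
}


def expand(preferred):
    """Query side: which resource values should a query for `preferred` match?"""
    return ONE_TO_MANY.get(preferred, [preferred])


def to_preferred(actual):
    """Result side: map a resource value back to its preferred value, if known."""
    return TO_PREFERRED.get(fold(actual))


print(fold("ST LOUIS") == fold("St Louis"))
print(expand("Soil"))
print(to_preferred("topsoil"))
```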
37Outline
Live Demonstration
38Outline
- Further thoughts and work
39Outstanding mapping issues
- No match in EDR for database value
- differences in case (Texas, TEXAS)
- CAS number format (dashes, leading zeros)
- word order (n-Propyl benzene; Benzene, n-Propyl)
- bad data
- Improved functional mapping
- artificial intelligence can be used in functional value mapping
- ontology-dependent heuristics...
- Approximate string matching
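Some of these mismatches lend themselves to simple normalization before the EDR lookup. The sketch below shows one possible heuristic for each: CAS number formatting, inverted chemical names, and approximate matching with the standard library's `difflib`. The function names and the 0.8 cutoff are assumptions, not part of EDEN.

```python
import difflib
import re


def normalize_cas(raw):
    """Strip punctuation and leading zeros, then re-insert the two dashes."""
    digits = re.sub(r"\D", "", raw).lstrip("0")
    if len(digits) < 5:
        raise ValueError(f"not a CAS number: {raw!r}")
    # Layout: leading digits, two digits, one check digit.
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def normalize_name(name):
    """Undo 'Benzene, n-Propyl' style inversion and fold case."""
    if "," in name:
        head, tail = name.split(",", 1)
        name = f"{tail.strip()} {head.strip()}"
    return " ".join(name.split()).casefold()


def closest(name, known_names, cutoff=0.8):
    """Approximate match against already-normalized names, or None."""
    matches = difflib.get_close_matches(
        normalize_name(name), known_names, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


print(normalize_cas("000071-43-2"))
print(normalize_name("Benzene, n-Propyl"))
print(closest("Benzen", ["benzene", "toluene"]))
```

Approximate matching can return a wrong but similar name, so a match found this way would need review before being trusted.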
40Further adventures in Ontology
- Incorporate Terminology Reference System/GEMET into EDEN
- Enable Value Map Agent to configure itself dynamically directly from EDR
- Expand EDEN ontology to encompass water quality for European 5th Framework EDEN-IW project
41Use of XML
- InfoSleuth 5.5 will use
- XML data transport
- XML semantic advertisement
- InfoSleuth should use
- XML ontology representation
- XML browser configuration
- XML transport layer
- Benefits
- One parser, not home-grown
- Easier incorporation of data, metadata
- Better expressivity
- Better interoperability
42DENIX EDEN Portal
43Summary
- The EDEN pilot system shows that InfoSleuth can integrate existing databases
- Value mapping (hence EDR) is crucial
- EDEN may be useful to its sponsor agencies in identifying data quality issues and data gaps
- EDEN has stimulated collaboration on metadata among agencies
- EDEN has showcased the utility of the EDR
- More work will lead to a better, broader system
44Shakespeare on Internet Agent Research
- I can call spirits from the vasty deep!
- Aye, and so can I, and so can any man,
- but will they come when you do call for them?