The EDEN System - PowerPoint PPT Presentation

MCC (Microelectronics and Computer Technology Corporation). Metadata Open Forum, ...

1
The EDEN System
  • Jerry Fowler
  • MCC
  • Austin, Texas

2
Environmental Data Exchange Network (EDEN)
3
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work

Live Demonstration
4
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work
  • EDEN Project Overview

5
Environmental Data Exchange Network
  • The challenge
  • Acquisition, use and dissemination of
    environmental information is of increasing
    strategic importance to EPA, DOD, DOE, and EEA
  • EDEN is an application of MCC's InfoSleuth
    technology
  • Employs intelligent agent technology through the
    Internet to conduct concept-based searches of
    heterogeneous, distributed information
  • The EDEN Project demonstrates how organizations
    can save time and money
  • Provides easy access over intranet or the
    Internet
  • Enables users to access information from multiple
    sources
  • Simplifies the exchange and sharing of data
  • Reduces the reporting burden
  • Brings information together for presentation and
    analysis

6
Sponsors' Common Requirements
  • Reduce the reporting burden imposed by the
    parties on each other
  • Share best available and most timely information
  • Enable users to access information from multiple
    sources
  • Coordinate only the common vocabulary, not the
    end use of information resources: focus on the
    inputs, with each participant individually
    interpreting and communicating outputs

7
EDEN Access to Distributed Databases
  • Geographically distributed data resources
  • Differing database software
  • Differing logical schemas
  • Not always available

8
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work
  • InfoSleuth in a microsecond

9
InfoSleuth
  • Consortial project in the use of agent software
    for distributed information management
  • Commercial sponsors
  • General Dynamics Information Systems
  • Rafael
  • Raytheon
  • SAIC
  • Schlumberger
  • Texas Instruments
  • TRW

10
InfoSleuth
  • System of competent agents for dynamic,
    scalable (SQL-based) access to heterogeneous
    distributed information sources
  • Ontology-based information management
  • Advertise-discover paradigm supported by
    brokering over semantic constraints
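
The advertise-discover paradigm can be illustrated with a minimal sketch. This is not InfoSleuth code: the `Advertisement` and `Broker` classes, the agent names, and the `state` constraint are all hypothetical, and real brokering reasons over a much richer constraint language than set membership.

```python
from dataclasses import dataclass, field


@dataclass
class Advertisement:
    """What a (hypothetical) agent tells the broker about itself."""
    agent: str
    ontology_class: str
    # Slot name -> set of values the agent can answer for (a semantic constraint).
    constraints: dict = field(default_factory=dict)


class Broker:
    def __init__(self):
        self._ads = []

    def advertise(self, ad):
        self._ads.append(ad)

    def discover(self, ontology_class, **wanted):
        """Return names of agents whose constraints admit every wanted value."""
        matches = []
        for ad in self._ads:
            if ad.ontology_class != ontology_class:
                continue
            # A slot the agent did not constrain matches anything; a
            # constrained slot must contain the requested value.
            if all(v in ad.constraints.get(slot, {v}) for slot, v in wanted.items()):
                matches.append(ad.agent)
        return matches


broker = Broker()
broker.advertise(Advertisement("texas_agent", "Site_Contamination", {"state": {"Texas"}}))
broker.advertise(Advertisement("national_agent", "Site_Contamination"))
broker.advertise(Advertisement("site_agent", "Eden_Site", {"state": {"Texas", "Ohio"}}))

print(broker.discover("Site_Contamination", state="Texas"))
print(broker.discover("Site_Contamination", state="Ohio"))
```

An agent that advertises no constraint on a slot is treated here as able to answer for any value of it, which is one possible reading of "brokering over semantic constraints".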

11
InfoSleuth System
  • Java-based agents
  • Knowledge Query and Manipulation Language (KQML)
    message layer provides speech-act agent interface
  • Agent conversation shell provides structure for
    KQML messages
  • Open Knowledge Base Connectivity (OKBC) language
    provides semantic communication layer
  • Brokering reasoning provided by Logical Data
    Language, LDL (going away)
  • XML in the future

12
Main InfoSleuth Agents
  • Broker
  • Matches agents based on semantic constraints
  • Resource Agents
  • Translate between application domain ontology and
    database schemata
  • Multi-Resource Query Agent
  • Supports query decomposition and result
    recomposition
  • Value Mapper
  • Translates to/from canonical value domains
  • Portal Agent
  • Provides user context and interface

13
More about InfoSleuth Agents
  • JDBC Resource agents translate between
    application domain ontology and database schemata
  • Multi-resource query agent uses either Oracle or
    native Java to support query decomposition and
    result synthesis
  • Value mapper translates to/from canonical value
    domains using EDR
  • Text agent supports ontology-based query
  • Control agent manages CLIPS rule base for task
    planning and execution
  • Sentinel and Deviation Detection agents cooperate
    to detect complex event patterns

14
Basic InfoSleuth Application Recipe
  • 6 cups ontology
  • 3 cups resource agent configuration
  • 1-3 cups user interface development
  • Lightly brown the multi-resource query agent
  • Pour in other agents out of the box
  • Stir and Serve...
  • add or remove resource agents as desired
  • add other functionality with more configuration
    effort

15
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work
  • The Ontology in InfoSleuth

16
Purpose of The Ontology in InfoSleuth
  • To describe the domain with minimal ambiguity
  • the structure defines the domain
  • documentation strings
  • To be the integration hub for DB schemas
  • query relaxation through the taxonomy
  • vertical fragmentation
  • multi-resource path expressions
  • To define the preferred value domains of
    attributes used in communications between agents
  • Value mapping may be necessary for translation of
    queries and results

17
Expressing the Ontology
  • OKBC (Open Knowledge Base Connectivity), a
    standard for Knowledge Representation
  • Classes, Slots, Facets
  • (class Site_Contamination)
  • (template-slot-of averaging_method Site_Contamination)
  • (template-facet-value VALUE-TYPE averaging_method Site_Contamination STRING)
  • (template-slot-of site Site_Contamination)
  • (template-facet-value VALUE-TYPE site Site_Contamination Eden_Site)
  • Subclass Linkage
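
To make the class/slot/facet structure concrete, here is a small sketch that reads the statements from this slide into a nested dictionary. It is a toy reader for these three flat statement forms only, not an OKBC implementation; the `load` function and the dictionary layout are assumptions made for illustration.

```python
import re

# The statements from the slide, one per line.
STATEMENTS = """
(class Site_Contamination)
(template-slot-of averaging_method Site_Contamination)
(template-facet-value VALUE-TYPE averaging_method Site_Contamination STRING)
(template-slot-of site Site_Contamination)
(template-facet-value VALUE-TYPE site Site_Contamination Eden_Site)
"""


def load(text):
    """Build {class: {slot: {facet: value}}} from the three statement forms above."""
    classes = {}
    for body in re.findall(r"\(([^()]*)\)", text):
        op, *args = body.split()
        if op == "class":
            classes.setdefault(args[0], {})
        elif op == "template-slot-of":
            slot, cls = args
            classes.setdefault(cls, {}).setdefault(slot, {})
        elif op == "template-facet-value":
            facet, slot, cls, value = args
            classes.setdefault(cls, {}).setdefault(slot, {})[facet] = value
        else:
            raise ValueError(f"unsupported statement: {op}")
    return classes


ontology = load(STATEMENTS)
print(ontology["Site_Contamination"]["site"]["VALUE-TYPE"])  # Eden_Site
```

The `site` slot's value type is another class (`Eden_Site`), which is how one concept links to another in the ontology.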

18
Ontology Definitions
19
Ontological Concept
Site contamination
Has attributes
  • average_concentration_unit
  • site_id
  • source_db
  • recording_date
  • contaminant
  • average_concentration
  • medium
  • averaging_method
20
Concept mapping (CERCLIS3)
  • SELECT (each item paired with the ontology slot it maps to)
  • ref_media.rmedia_desc -> medium
  • constituent_contaminant.cc_avg_conc_value_nmbr -> average_concentration
  • ref_concentration_units.rconc_units_desc -> average_concentration_unit
  • ref_hazardous_substance.rhs_nmbr -> contaminant
  • constituent_contaminant.last_updated_date -> recording_date
  • site.site_epa_id -> site_id
  • 'Reported in CERCLIS3' -> averaging_method
  • 'cerclis' -> source_db
  • FROM
  • site, ...
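
A resource agent's concept mapping can be thought of as a lookup from ontology slots to database columns. The sketch below assumes the eight SELECT items on this slide pair off, in order, with the eight ontology slots listed after them. The `select_list` helper and the `AS` aliases are illustrative additions, and the FROM clause is left out because the slide elides it.

```python
# Ontology slot -> CERCLIS3 column or literal, taken from the slide.
SLOT_TO_COLUMN = {
    "medium": "ref_media.rmedia_desc",
    "average_concentration": "constituent_contaminant.cc_avg_conc_value_nmbr",
    "average_concentration_unit": "ref_concentration_units.rconc_units_desc",
    "contaminant": "ref_hazardous_substance.rhs_nmbr",
    "recording_date": "constituent_contaminant.last_updated_date",
    "site_id": "site.site_epa_id",
    "averaging_method": "'Reported in CERCLIS3'",
    "source_db": "'cerclis'",
}


def select_list(slots):
    """Render the SELECT list for the requested ontology slots."""
    unknown = [s for s in slots if s not in SLOT_TO_COLUMN]
    if unknown:
        raise KeyError(f"slots not mapped for this resource: {unknown}")
    return ",\n       ".join(f"{SLOT_TO_COLUMN[s]} AS {s}" for s in slots)


print("SELECT " + select_list(["site_id", "contaminant", "source_db"]))
```

Constant slots such as `source_db` map to SQL literals, so every row returned by this resource carries the same value for them.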

21
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work

Live Demonstration
22
Sample Query
23
Sample Query Results
24
Locations of the Agents
25
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work
  • Value Mapping and the Environmental Data Registry

26
Value mapping requirements
  • Translate terms in queries
  • Allow users to choose a coding scheme for
    querying
  • Query each database in terms of its own coding
    scheme
  • Translate results of queries
  • Facilitate merging of data from different sources
  • Display results according to user preference

27
The Value Mapping Model
28
Value mapping and the ontology
  • A class has one or more slots
  • Each slot has a conceptual domain
  • Each slot has a preferred value domain
  • Resource Agents must advertise in the preferred
    value domain
  • possibly translating to/from a different value
    domain
  • Users may query and view data in a different
    value domain
  • User Agent handles translation to/from preferred
    value domain

29
Value Mapping Capability
Query translation and interpretation
SELECT * FROM site WHERE state = 'TX'
Translated to
SELECT * FROM site WHERE state = 'Texas'
Results translation
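
The two directions of translation on this slide can be sketched as follows. The `state_abbreviation` scheme name, the map contents beyond TX/Texas, and the sample rows are made up for illustration; the query text assumes the slide's query was `SELECT *` with an equality test.

```python
# Hypothetical value map for one conceptual domain (US states):
# coding scheme -> {value in that scheme: preferred value}.
TO_PREFERRED = {
    "state_abbreviation": {"TX": "Texas", "OH": "Ohio"},
}


def to_preferred(scheme, value):
    """Query direction: user's coding scheme -> preferred value domain."""
    return TO_PREFERRED[scheme][value]


def from_preferred(scheme, preferred):
    """Result direction: preferred value domain -> user's coding scheme."""
    for value, pref in TO_PREFERRED[scheme].items():
        if pref == preferred:
            return value
    raise KeyError(preferred)


def translate_query(template, scheme, value):
    """Pair a parameterised query with the preferred-domain constant."""
    return template, (to_preferred(scheme, value),)


sql, params = translate_query(
    "SELECT * FROM site WHERE state = ?", "state_abbreviation", "TX"
)
print(sql, params)

rows = [("Texas",), ("Texas",)]  # made-up rows standing in for query results
print([from_preferred("state_abbreviation", r[0]) for r in rows])
```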
30
Query Processing with Value Mapping
31
EDR Structure
A specialized resource agent (map agent)
accesses a view of the EDR
32
Linking EDR to EDEN ontology
33
View of the EDR
CREATE VIEW edr_map (conceptual_domain, cd_id, value_domain, vd_id, preferred_domain, pd_id) AS
SELECT emc.conceptual_domain, emc.value_domain, pref.pv_nm, act.pv_nm
FROM edr_map_class emc, cd_vm_assoc a, permissible_value pref, permissible_value act
WHERE a.cd_id = emc.cd_id
  AND a.vm_id = act.vm_id
  AND a.vm_id = pref.vm_id
  AND emc.vd_id = act.vd_id
  AND emc.pd_id = pref.vd_id
34
EDR lookup
SELECT preferred_value FROM edr_map
WHERE actual_value = 'Benzene'
  AND coding_scheme = 'chemical_name'
  AND conceptual_domain = 'chemical_substance'
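
The lookup can be tried against an in-memory SQLite database. In this sketch a plain table stands in for the `edr_map` view, with only the four columns the lookup names. The rows, and the choice of CAS number as the preferred value domain, are assumptions for illustration rather than actual EDR content.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the edr_map view; the rows below are illustrative only.
conn.execute(
    """CREATE TABLE edr_map (
           conceptual_domain TEXT,
           coding_scheme     TEXT,
           actual_value      TEXT,
           preferred_value   TEXT
       )"""
)
conn.executemany(
    "INSERT INTO edr_map VALUES (?, ?, ?, ?)",
    [
        ("chemical_substance", "chemical_name", "Benzene", "71-43-2"),
        ("chemical_substance", "chemical_name", "Toluene", "108-88-3"),
    ],
)


def edr_lookup(actual_value, coding_scheme, conceptual_domain):
    """Return the preferred value, or None when the table has no match."""
    row = conn.execute(
        "SELECT preferred_value FROM edr_map "
        "WHERE actual_value = ? AND coding_scheme = ? AND conceptual_domain = ?",
        (actual_value, coding_scheme, conceptual_domain),
    ).fetchone()
    return row[0] if row else None


print(edr_lookup("Benzene", "chemical_name", "chemical_substance"))
print(edr_lookup("BENZENE", "chemical_name", "chemical_substance"))
```

The second call finds nothing because the comparison is case-sensitive, which is the kind of miss that exact-match lookup produces when source values differ only in case.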
35
Modifications to EDR
  • Downloaded files of permissible values for CAS
    number and Chemical name (Merck index) from EPA
    site
  • Assigned value meanings
  • Created value domains for CAS code, CAS padded,
    ycode; loaded permissible values
  • Added 3 extra chemical names because Merck index
    file was incomplete
  • Built data-driven value-map for environmental
    media
  • De-normalized data for faster retrieval

36
Value Mapping Enhancements
  • Functional maps
  • e.g., case sensitivity (ST LOUIS vs. St
    Louis)
  • One-to-many maps
  • e.g., Environmental media mapped
  • Soil vs. Topsoil, Subsoil, Soil - unspecified,
    SO, S
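
A rough sketch of the two enhancements, using the values named on this slide. The function names are hypothetical, and grouping all five source values under a preferred value of "Soil" is an assumption about how the one-to-many map is organised.

```python
# Functional map: computed rather than looked up. Here, whitespace and case
# folding so that "ST LOUIS" and "St Louis" compare equal.
def fold_case(value):
    return " ".join(value.split()).casefold()


# One-to-many map for environmental media, using the values on the slide.
MEDIA = {
    "Soil": ["Topsoil", "Subsoil", "Soil - unspecified", "SO", "S"],
}


def expand(preferred):
    """Preferred value -> every source value a query should match."""
    return MEDIA.get(preferred, [preferred])


def collapse(actual):
    """Source value -> preferred value, compared case-insensitively."""
    for preferred, actuals in MEDIA.items():
        if fold_case(actual) in {fold_case(a) for a in actuals + [preferred]}:
            return preferred
    return actual


print(fold_case("ST LOUIS") == fold_case("St Louis"))
print(expand("Soil"))
print(collapse("topsoil"))
```

Queries use `expand` (one preferred term becomes several source terms) and results use `collapse` (several source terms become one preferred term).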

37
Outline
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work

Live Demonstration
38
Outline
  • Further thoughts and work
  • EDEN Project Overview
  • InfoSleuth in a microsecond
  • The Ontology in InfoSleuth
  • Value Mapping and the Environmental Data Registry
  • Further thoughts and work

39
Outstanding mapping issues
  • No match in EDR for database value
  • differences in case (Texas, TEXAS)
  • CAS number format (dashes, leading zeros)
  • word order ("n-Propyl benzene" vs.
    "Benzene, n-Propyl")
  • bad data
  • Improved functional mapping
  • artificial intelligence can be used in
    functional value mapping
  • ontology-dependent heuristics...
  • Approximate string matching
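
Several of these issues lend themselves to small functional maps. The sketch below is one possible approach, not the EDEN implementation: it normalises CAS number formatting (checking the standard CAS check digit), un-inverts a comma-inverted chemical name, and falls back to approximate matching with Python's standard `difflib`. All function names and the 0.8 cutoff are assumptions.

```python
import difflib
import re


def normalize_cas(raw):
    """Return a CAS number as N-NN-N without leading zeros, or None if invalid."""
    if not re.fullmatch(r"[\d\s-]+", raw):
        return None
    digits = re.sub(r"\D", "", raw).lstrip("0")
    if len(digits) < 5:
        return None
    body, check = digits[:-1], int(digits[-1])
    # CAS check digit: weight the other digits 1, 2, 3, ... from the right,
    # sum the products, and take the result modulo 10.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    if total % 10 != check:
        return None
    return f"{body[:-2]}-{body[-2:]}-{check}"


def uninvert(name):
    """Turn an inverted name 'Benzene, n-Propyl' into 'n-Propyl benzene'."""
    head, sep, modifier = name.partition(", ")
    return f"{modifier} {head.lower()}" if sep else name


def closest(value, known, cutoff=0.8):
    """Case-insensitive approximate match against a list of known values."""
    folded = {k.casefold(): k for k in known}
    hits = difflib.get_close_matches(value.casefold(), list(folded), n=1, cutoff=cutoff)
    return folded[hits[0]] if hits else None


known = ["Texas", "n-Propyl benzene"]
print(normalize_cas("000071-43-2"))                    # 71-43-2
print(closest("TEXAS", known))                         # Texas
print(closest(uninvert("Benzene, n-Propyl"), known))   # n-Propyl benzene
```

Approximate matching can also produce wrong matches, so a cutoff and some review of "bad data" cases would still be needed.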

40
Further adventures in Ontology
  • Incorporate Terminology Reference System/GEMET
    into EDEN
  • Enable Value Map Agent to configure itself
    dynamically directly from EDR
  • Expand EDEN ontology to encompass water quality
    for European 5th Framework EDEN-IW project

41
Use of XML
  • InfoSleuth 5.5 will use
  • XML data transport
  • XML semantic advertisement
  • InfoSleuth should use
  • XML ontology representation
  • XML browser configuration
  • XML transport layer
  • Benefits
  • One parser, not home-grown
  • Easier incorporation of data, metadata
  • Better expressivity
  • Better interoperability

42
DENIX EDEN Portal
43
Summary
  • The EDEN pilot system shows that InfoSleuth can
    integrate existing databases
  • Value mapping (hence EDR) is crucial
  • EDEN may be useful to its sponsor agencies in
    identifying data quality issues and data gaps
  • EDEN has stimulated collaboration on metadata
    among agencies
  • EDEN has showcased the utility of the EDR
  • More work will lead to a better, broader system

44
Shakespeare on Internet Agent Research
  • I can call spirits from the vasty deep!
  • Aye, and so can I, and so can any man,
  • but
  • will they come when you do call for them?