Title: Three perspectives to GlobIS
2. Three perspectives to GlobIS
3. Evolving targets and approaches in integrating data and information (a personal perspective)
Infocosm
4.
- Data recognized as a corporate resource: leverage it!
- Data predominantly in structured databases with different data models; transitioning from network and hierarchical to relational DBMSs
- Heterogeneity (system, modeling and schematic), as well as the need to support autonomy, posed the main challenges; major issues were data access and connectivity
- Information integration through a federated architecture
- Support for corporate IS applications as the primary objective; update often required, data integrity important
5. Generation I (heterogeneity in FDBMSs)
6. Generation I (Federated Database Systems: Schema Architecture)
- Dimensions for interoperability and integration: distribution, autonomy and heterogeneity
- Model heterogeneity: common/canonical data model, schema translation
- Information sharing while preserving autonomy
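
The idea of translating local schemas into a common/canonical data model can be sketched as follows. This is only an illustration under stated assumptions: the `CanonicalEmployee` model, both source layouts, and all field names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalEmployee:
    """Assumed canonical (common) data model for this sketch."""
    name: str
    annual_salary: float


def from_relational(row: dict) -> CanonicalEmployee:
    # Hypothetical relational source: EMP(ename, sal), salary stored per year.
    return CanonicalEmployee(name=row["ename"], annual_salary=float(row["sal"]))


def from_hierarchical(record: dict) -> CanonicalEmployee:
    # Hypothetical hierarchical source: nested record, pay stored per month.
    person = record["person"]
    return CanonicalEmployee(
        name=person["full_name"],
        annual_salary=float(person["pay"]["monthly"]) * 12,
    )


def federated_view(relational_rows, hierarchical_records):
    """Union of both sources, expressed in the canonical model."""
    translated = [from_relational(r) for r in relational_rows]
    translated += [from_hierarchical(h) for h in hierarchical_records]
    return translated
```

Each source keeps its own schema (autonomy); only the translation functions know about the local structure and units.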
7. Generation I (characterization of schematic conflicts in multidatabase systems)
Sheth & Kashyap; Kim & Seo
8. Generation I (observations and lessons learnt)
- Tightly coupled vs. loosely coupled debate: we were not able to develop global-schema-based systems
- "Good common data model" debate: we were not able to pick the best data model
- Can we have a metadata standard for a domain?
  - only for a limited purpose
- Must learn to live with multiple data types, multiple metadata models/standards, and multiple ontologies
9.
- Significant improvements in computing and connectivity (standardization of protocols, public network, Internet/Web); remote data access as a given
- Increasing diversity in data formats, with a focus on a variety of textual data and semi-structured documents
- Many more data sources, heterogeneous information sources, but not necessarily better understanding of data
- Use of data beyond traditional business applications: mining, warehousing, marketing, e-commerce
- Web search engines for keyword-based querying against HTML pages; attribute-based querying available in a few search systems
- Use of metadata for information access; early work on ontology support; distribution applied to metadata in some cases
- Mediator architecture for information management
10. Generation II (limited types of metadata, extractors, mappers, wrappers)
Example query: "Find Marketing Manager positions in a company that is within 15 miles of San Francisco and whose stock price has been growing at a rate of at least 25% per year over the last three years" (Junglee, SIGMOD Record, Dec. 1997)
[Figure labels: Extractors, Metadata]
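
A query like the one above spans several sources, which is what a mediator with per-source wrappers is meant to handle. The sketch below is a minimal illustration, not Junglee's implementation: the three wrapper functions, the company names, and all data values are made up, distances are assumed to be precomputed, and "growing at 25% per year" is interpreted here as a compound annual growth rate.

```python
def jobs_wrapper():
    # Hypothetical extractor output for a job-listing site.
    return [
        {"company": "Acme", "title": "Marketing Manager"},
        {"company": "Globex", "title": "Marketing Manager"},
        {"company": "Initech", "title": "Engineer"},
    ]


def location_wrapper():
    # Hypothetical source: miles from San Francisco, precomputed per company.
    return {"Acme": 10.0, "Globex": 40.0, "Initech": 5.0}


def stock_wrapper():
    # Hypothetical source: year-end prices, oldest first (four values = three years).
    return {
        "Acme": [10.0, 13.0, 17.0, 22.0],
        "Globex": [10.0, 14.0, 19.0, 26.0],
        "Initech": [10.0, 11.0, 12.0, 13.0],
    }


def annual_growth(prices):
    """Compound annual growth rate between the first and last price."""
    years = len(prices) - 1
    return (prices[-1] / prices[0]) ** (1 / years) - 1


def mediator(title, max_miles, min_growth):
    """Combine the three wrapped sources to answer the integrated query."""
    miles = location_wrapper()
    prices = stock_wrapper()
    return [
        job["company"]
        for job in jobs_wrapper()
        if job["title"] == title
        and miles[job["company"]] <= max_miles
        and annual_growth(prices[job["company"]]) >= min_growth
    ]
```

Calling `mediator("Marketing Manager", 15, 0.25)` on this toy data keeps only the company that satisfies all three conditions.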
11. Generation II (a metadata classification: the information pyramid)
- Metadata standards
  - General purpose
    - Dublin Core, MCF
  - Domain/industry specific
    - Geographic (FGDC, UDK, ...)
    - Library (MARC, ...)
12. VisualHarness: an example
13. What's next (after comprehensive use of metadata)?
Query processing and information requests
14. GIS Data Representation Example
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
Kansas State
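
One simple way to cope with different tag names for the same data is a per-model mapping onto a shared vocabulary. The sketch below assumes two hypothetical metadata models, `model_a` and `model_b`; the tag names and values are invented for illustration and are not taken from any real standard.

```python
# Hypothetical mappings from each model's tag names to shared names.
TAG_MAPPINGS = {
    "model_a": {"west_bounding_coordinate": "west", "theme_keyword": "keyword"},
    "model_b": {"bbox_west": "west", "subject": "keyword"},
}


def normalize(record: dict, model: str) -> dict:
    """Rename known tags to the shared vocabulary; unmapped tags are dropped."""
    mapping = TAG_MAPPINGS[model]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}
```

After normalization, records described under either model can be compared or queried with the same tag names.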
15.
- Increasing information overload and a broader variety of information content (video content, audio clips, etc.), with an increasing amount of visual information and scientific/engineering data
- Continued standardization related to the Web for representational and metadata issues (MCF, RDF, XML)
- Changes in Web architecture; distributed computing (CORBA, Java)
- Users demand simplicity, but complexities continue to rise
- Web is no longer just another information source, but a means of decision support through data mining and information discovery, information fusion, information dissemination, knowledge creation and management; information management complemented by cooperation between the information system and humans
- Information Brokering Architecture proposed for information management
16. Information Brokering: An Enabler for the Infocosm
INFORMATION/DATA OVERLOAD
17. Information Brokering: Three Dimensions
Objective: Reduce the problem of knowing the structure and semantics of data in the huge number of information sources on a global scale to understanding and navigating a significantly smaller number of domain ontologies
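
The objective above can be illustrated with a toy broker in which users navigate ontology terms and the broker resolves them to sources. This is a sketch under assumptions: the ontology, the term names, and the source identifiers are all hypothetical.

```python
# Hypothetical domain ontology: term -> narrower terms.
ONTOLOGY = {"publication": ["article", "book"], "article": [], "book": []}

# Hypothetical registry: term -> sources whose content is described by that term.
SOURCES_BY_TERM = {
    "publication": ["source_4"],
    "article": ["source_1", "source_2"],
    "book": ["source_3"],
}


def expand(term):
    """Return the term followed by all of its narrower terms, depth first."""
    terms = [term]
    for narrower in ONTOLOGY.get(term, []):
        terms.extend(expand(narrower))
    return terms


def relevant_sources(term):
    """Resolve an ontology term to the sources registered under it or below it."""
    found = []
    for t in expand(term):
        for source in SOURCES_BY_TERM.get(t, []):
            if source not in found:
                found.append(source)
    return found
```

The user only needs to know the terms of the ontology; which sources exist and how they are structured stays behind the broker.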
18. What else can Information Brokering do?
19. Concepts, tools and techniques to support semantics
semantic proximity
context
inter-ontological relations
media-independent information correlations
ontologies (esp. domain-specific)
profiles
domain-specific metadata
20. Tools to support semantics
- Context, context, context
- Media-independent information correlations
- Multiple ontologies
- Semantic proximity (relationships between concepts within and across ontologies) using domain, context, modeling/abstraction/representation, state
- Characterizing loss of information incurred due to differences in vocabulary
BIG challenge: identifying relationship or similarity between objects of different media, developed and managed by different persons and systems
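
The slide does not say how loss of information is characterized. One option, assumed here rather than taken from the source, is to compare the set of objects a term denotes with the set denoted by the term substituted for it from another vocabulary, and to report precision and recall. The function name and the example sets are hypothetical.

```python
def information_loss(original: set, substitute: set):
    """Return (precision, recall) of answering with `substitute` in place of `original`.

    Both arguments are the sets of objects the respective terms denote.
    By convention in this sketch, an empty denominator yields 1.0.
    """
    overlap = original & substitute
    precision = len(overlap) / len(substitute) if substitute else 1.0
    recall = len(overlap) / len(original) if original else 1.0
    return precision, recall
```

For example, replacing a term with a broader one from another ontology keeps every wanted object (recall 1.0) but also admits unwanted ones (precision below 1.0).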
21. Information Brokering over Heterogeneous Digital Data: A Metadata-based Approach
- Systems heterogeneity: information system heterogeneity (DBMSs, concurrency control); platform heterogeneity (operating systems, hardware)
- Syntactic heterogeneity: different formats and storage for digital media; machine-readable aspects of data representation
- Structural heterogeneity: heterogeneity in data model constructs; schematic/representational heterogeneity
- Semantic heterogeneity: terminological/vocabulary heterogeneity; contextual heterogeneity
- Information Resource Discovery
  - which/where are the relevant information sources?
- Modeling of Information Content
  - increasing number of modeling possibilities
- Querying of Information Content
  - Information Focusing
  - Information Correlation
  - combinatorial number of ways to combine/subset information
22. Heterogeneity... is a Babel Tower!!