Harmonizing Semantics in E-Government

1
Harmonizing Semantics in E-Government
  • Presentation to the Ontolog-Forum
    (http://ontolog.cim3.net)
  • Brand L. Niemann
  • U.S. Environmental Protection Agency Enterprise
    Architecture Team
  • CIO Council's Architecture and Infrastructure
    Committee (AIC)
  • Co-Chair, Semantic Interoperability Community of
    Practice (SICoP)
  • CIO Council's Best Practices Committee (Knowledge
    Management Working Group)
  • April 22, 2004

2
A Little History
  • Led a Team That Won the Special Award for
    Innovation with XML and VoiceXML Web Services
    from Mark Forman and the Quad Council at FOSE,
    March 2002.
  • Led the CIO Council XML Web Services Working
    Group from August 2002 to October 2003.
  • TopQuadrant led the Semantic Technologies for
    eGovernment Pilot.
  • TopQuadrant helped organize the very successful
    Semantic Technologies for eGovernment Conference
    at the White House Conference Center, September
    8, 2003.
  • The TopQuadrant Pilot and the CIO Council's
    Knowledge Management Working Group (Best
    Practices Committee) helped start the new
    Semantic Interoperability Community of Practice
    (SICoP).
  • The XML Web Services Working Group pilots
    supported the development of:
  • The Federal Enterprise Architecture's (FEA) Data
    and Information Reference Model and its Data
    Management Strategy.
  • The Government Enterprise Architecture Framework
    (GEAF) of the CIO Council's Architecture and
    Infrastructure Committee (AIC) Governance
    Subcommittee.

3
Organizational Relationships
[Organization chart. Box labels: Industry Advisory
Council (IAC); U.S. CIO Council; OMB - FEAPMO;
Enterprise Architecture Special Interest Group;
Architecture Infrastructure Committee; IT
Workforce Connections; Best Practices Committee;
WGs and CoPs; Subcommittees (Governance,
Components, Emerging Technologies); Semantic
Interoperability Community of Practice; Chief
Architects Forum.]
4
Some Upcoming Events
  • Collaboration Expedition Workshop 31, April
    28th, National Science Foundation, Ballston,
    Virginia
  • Joint Workshop with SICoP on Multiple Taxonomies
  • See http://ua-exp.gov
  • Collaboration Expedition Workshop 32, May 11th,
    National Science Foundation, Ballston, Virginia
  • Workshop on Emerging Technology Innovations in
    Software Component Development, Reuse, and
    Management: Applications to Government
    Enterprise Architecture (e.g., the new Chief
    Architects Forum CoP)
  • See http://ua-exp.gov
  • SICoP Monthly Meeting 2, May 19th, MITRE,
    McLean, Virginia
  • Progress Reports on White Paper Modules (3),
    Collaboration Tools, Discussion of Common Upper
    Ontologies, etc.
  • See http://web-services.gov and http://km.gov
  • Fourth Quarterly Emerging Technology Components
    Conference, June 3rd, MITRE, McLean, Virginia
  • Populating the Service Grid with Service
    Components
  • See http://Componettechnology.org

5
An Upcoming Event
  • Joint Workshop with SICoP on Multiple Taxonomies,
    April 28th
  • Welcome
  • Organizer: Michel Biezunski, Coolheads Consulting
  • The Semantic Web-What Is This Really About?
  • Renee Lewis, Pensare Group
  • Increased Knowledge Sharing and Mission Success:
    Implementing Taxonomies for NASA
  • Jayne Dutra, Jet Propulsion Laboratory
  • Master and Relational Taxonomies
  • Kevin Hannon, Independent Consultant
  • Clustering of Search Results With and Without
    Taxonomies
  • Raul Valdez-Perez, Vivisimo, Inc.

6
An Upcoming Event
  • Joint Workshop with SICoP on Multiple Taxonomies,
    April 28th (continued)
  • Semantics, Ontologies, and the Semantic Web
  • Leo Obrst, The MITRE Corporation
  • How to Create Many Taxonomies That Integrate Into
    a Single Enterprise-Wide Taxonomy
  • Denise Bedford, The World Bank
  • Ontology Overview
  • Adam Pease, Independent Consultant
  • Issues in Negotiating Multiple Semantic Models
  • LeeEllen Friedland, The MITRE Corporation
  • Accessibility, Usability, and Preservation of
    Government Information
  • Eliot Christian, USGS and Chair, Categorization
    of Government Information Working Group of the
    Interagency Committee on Government Information
  • Open Dialogue
  • Steven Newcomb, Coolheads Consulting

7
A Past Event
  • SICoP Monthly Meeting 1, April 14th, Army CIO's
    Office, Crystal City, Virginia
  • Part 1: Community Business
  • Old Business
  • Minutes and Charter
  • Emerging Products
  • White Paper on Implementing the Semantic Web
  • Module 1: Harnessing the Power of Information
    Semantics (Jie-Hong Morrison, State Department)
  • Module 2: Exploring the Business Value of
    Semantic Interoperability (Irene Polikoff,
    TopQuadrant)
  • Module 3: Roadmap for Operationalizing the
    Semantic Web (Michael Daconta, Smart Data
    Associates) (Slides 13-14)
  • Army Knowledge Management Conference, August
    31-September 2nd, Semantic Web Track (need
    speakers).

Posted at http://web-services.gov, Past Meetings
and Presentations, April 14th
8
A Past Event
  • SICoP Monthly Meeting 1, April 14th, Army CIO's
    Office, Crystal City, Virginia (continued)
  • Part 2: Building Shared Understanding
  • Ontologies for Semantically Interoperable
    Systems, Leo Obrst, The MITRE Corporation
    (deferred to the next meeting) (Slides 9-12)
  • A Data and Information Reference Model (DRM)
    Registry and Repository Pilot, Brand Niemann, US
    EPA (deferred to the next meeting)
  • Common Upper Ontology for Cross-Domain Semantic
    Interoperability, Jim Schoening, The U.S. Army
    Communications Electronics Command
  • Part 3: Launching/Building the Supported Community
    of Practice
  • Proposed CoP Development Process, Rick Morris, US
    Army CIO Office
  • Facilitated Discussion, Rick Morris and Brand
    Niemann, Co-Chairs

9
Tightness of Coupling and Semantic Explicitness
[Figure: technologies arranged by looseness of
coupling (from "Implicit, TIGHT" and "Local" to
"Explicit, Loose" and "Far") and by semantic
explicitness. Annotations: "Performance = k /
Integration_Flexibility" and "From Synchronous
Interaction to Asynchronous Communication".
Labels, roughly from the loose end to the tight
end: Modal Policies; Internet; Semantic Mappings;
Semantic Brokers; OWL-S; Agent Programming; RDF/S,
OWL; Peer-to-peer; Web Services (UDDI, WSDL,
SOAP); Community; Applets; XML, XML Schema; Data;
Application; N-Tier Architecture; EAI; Workflow
Ontologies; Same Intranet; Conceptual Models;
Enterprise; Middleware; Web; Data Marts; Same Wide
Area Network; Client-Server; Data Warehouses; Same
Local Area Network; Federated DBs; Distributed
Systems; OOP; Systems of Systems; Same OS; Same
DBMS; Same Address Space; Same CPU; Linking; Same
Programming Language; Compiling; Same Process
Space; 1 System: Small Set of Developers.]
10
Dimensions of Interoperability and Integration
[Figure: 6 Levels of Interoperability and 3 Kinds
of Integration plotted on an Interoperability
Scale from 0 to 100. Labels: Data, Object,
Component, Application, System, Enterprise,
Community; Syntactic, Structural, Semantic.
Callout: "Our interest lies here".]
11
Ontology Spectrum: One View
[Figure: spectrum axis running from weak semantics
to strong semantics.]
12
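Ontology Spectrum: One View
[Figure: spectrum running from strong semantics
(top) to weak semantics (bottom).]
  • Stages and their defining relations, strongest
    first: Logical Theory ("Is Disjoint Subclass of
    with transitivity property"), Conceptual Model
    ("Is Subclass of"), Thesaurus ("Has Narrower
    Meaning Than"), Taxonomy ("Is Sub-Classification
    of").
  • Representations placed along the spectrum,
    strongest first: Modal Logic, First Order Logic,
    Description Logic, DAML+OIL, OWL, UML, RDF/S,
    XTM, Extended ER, ER, DB Schemas, XML Schema,
    Relational Model, XML.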
13
The Smart Data Enterprise
Figure 2. Developer's Perspective on Data. To the
application developer, the data evolution
timeline is viewed through the correlation of
programming paradigms with the relation of data
and code. From "Designing the Smart-Data
Enterprise: Get prepared for the 10 ways that
semantic computing will impact enterprise IT," by
Michael C. Daconta, posted November 28, 2003,
Enterprise Architect Magazine.
14
The Smart Data Enterprise
Figure 3. The Smart Data Continuum. Data has
progressed through four stages of increasing
intelligence. (Reprinted with permission from The
Semantic Web: A Guide to the Future of XML, Web
Services, and Knowledge Management, John Wiley &
Sons, 2003.) From "Designing the Smart-Data
Enterprise: Get prepared for the 10 ways that
semantic computing will impact enterprise IT," by
Michael C. Daconta, posted November 28, 2003,
Enterprise Architect Magazine.
15
Abstract
  • The history and broader context of this work.
  • See Section 1.
  • The eGov Act of 2002 has two sections (207 and 212)
    which require more structure and interoperability
    for government data and information and work has
    begun in several committees and communities of
    practice.
  • See Section 2 (just a few highlights).
  • The new Semantic Web standards and technologies
    provide a way to accomplish the purposes of the
    eGov Act of 2002 and the FEA Data and Information
    Reference Model and its Data Management Strategy.
  • See Section 3 (will skip over for this group).
  • The work on repurposing the Statistical Abstract
    of the United States, 2003, into a DRM Registry
    and Repository illustrates how a number of
    objectives can be accomplished at the same time,
    including the highest priority of the CIO
    Council's Architecture and Infrastructure
    Committee, namely intergovernmental exchange of
    data and information.
  • See Section 4 (just a few highlights).
  • The additional pilots underway are outlined.
  • See Section 5.

16
Overview
  • 1. Introduction (slides 17-19).
  • 2. eGovernment Drivers: The eGov Act of 2002 and
    the FEA Data and Information Reference Model
    (DRM) (slides 20-32).
  • 3. Semantic Technologies for eGovernment (slides
    33-49).
  • 4. Repurposing the Statistical Abstract of the
    United States, 2003, Into a DRM Registry and
    Repository (slides 50-72).
  • 5. Additional Pilots (slides 73-74).

17
1. Introduction
  • Repurposing of large documents with mixed content
    (text, tables, graphics, etc.) into XML content
    collections began with The Statistical Abstract
    of the United States (1999 Edition) as part of
    the FedStats.Net project to build a distributed
    network of statistical data and information using
    new XML standards and technology.
  • The Statistical Abstract of the United States was
    considered to be one of the best examples of
    "manual aggregation of government information"
    (from some 200 programs across about 70 agencies)
    that would benefit from a distributed XML-based
    content network that would leave the content in
    the hands of its originators and create a more
    "living document".
  • This work was recognized by OMB Associate
    Director for Information Technology and
    E-Government, Mark Forman, and the Quad Council
    with a Special Award for Innovation in the 2002
    CIO Showcase of Excellence for the use of XML in
    a distributed content network (renamed FedGov)
    and use of VoiceXML in providing universal access
    to emergency response information.

18
1. Introduction
  • More recently, the eGov Act of 2002's provisions
    for an Interagency Committee on Government
    Information (ICGI) and Data Integration Pilots,
    the Federal Enterprise Architecture's Data and
    Information Reference Model (DRM) and its Data
    Management Strategy, and the focus in the CIO
    Council's Architecture and Infrastructure
    Committee on Intergovernmental Data Exchange
    have all been tied together in a new pilot that
    simultaneously accomplishes multiple objectives
    (see next slide).
  • This Smart Data Enterprise approach came from
    the Semantic Technologies for eGov Conference,
    September 8, 2003, at the White House Conference
    Center (in which the EPA CIO and her staff
    participated), and continues in the new CIO
    Council's Semantic Interoperability (Web
    Services) Community of Practice (SICoP) (see
    subsequent slides).

19
1. Introduction
  • (1) Repurpose government data and information
    into structured documents using new XML-based
    standards and technologies that facilitate reuse
    and exchange.
  • (2) Repurpose the data and information so that it
    can be readily decomposed into XML fragments (for
    text and tables) and RDF metadata (for graphics)
    that can be stored and referenced in a database
    and can be in turn repurposed into new documents
    that provide additional user-defined views of the
    data and information.
  • (3) Organize and categorize the repurposed data
    and information using taxonomies and even
    ontologies in semantic registries and
    repositories.
  • (4) Use "XML data islands", and RDF and OWL to
    add metadata, interoperability and semantic
    meaning to data and information to be reused and
    exchanged.
  • (5) Standardize the data element and XML tag
    names in a DRM registry and repository.
  • (6) Share these results with others that are
    working on Semantic Web and Technology
    Applications for eGovernment.
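
Objective (2) above can be illustrated with a minimal sketch, assuming a hypothetical document layout. The element names (Document, Section, Text, Table, Row, Cell), the ids, and the cell values are placeholders, not part of the pilot; an in-memory dictionary stands in for the database.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured document; names and values are placeholders.
DOCUMENT = """
<Document title="Sample Report">
  <Section id="s1">
    <Text id="t1">Placeholder paragraph of narrative text.</Text>
    <Table id="tab1"><Row><Cell>A</Cell><Cell>1</Cell></Row></Table>
  </Section>
</Document>
"""

def decompose(xml_text):
    """Return a dict mapping fragment id -> serialized XML fragment."""
    root = ET.fromstring(xml_text)
    store = {}
    for element in root.iter():
        if element.tag in ("Text", "Table"):
            element.tail = None  # drop whitespace that follows the fragment
            store[element.get("id")] = ET.tostring(element, encoding="unicode")
    return store

def recompose(store, fragment_ids, title):
    """Build a new document (a user-defined view) from stored fragments."""
    view = ET.Element("Document", {"title": title})
    for fragment_id in fragment_ids:
        view.append(ET.fromstring(store[fragment_id]))
    return view

fragments = decompose(DOCUMENT)
table_view = recompose(fragments, ["tab1"], "Tables Only")
print(ET.tostring(table_view, encoding="unicode"))
```

Each fragment is stored under its own id, so a new view can reference the same text or table without copying the source document.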

20
2. eGovernment Drivers
  • The eGov Act of 2002
  • SEC. 207. ACCESSIBILITY, USABILITY, AND
    PRESERVATION OF GOVERNMENT INFORMATION.
  • (a) PURPOSE. The purpose of this section is to
    improve the methods by which Government
    information, including information on the
    Internet, is organized, preserved, and made
    accessible to the public.
  • (b) DEFINITIONS. In this section, the term:
  • (1) "Committee" means the Interagency Committee
    on Government Information established under
    subsection (c); and
  • (2) "directory" means a taxonomy of subjects
    linked to websites that:
  • (A) organizes Government information on the
    Internet according to subject matter; and
  • (B) may be created with the participation of
    human editors.

21
2. eGovernment Drivers
  • The eGov Act of 2002 (continued)
  • SEC. 207. ACCESSIBILITY, USABILITY, AND
    PRESERVATION OF GOVERNMENT INFORMATION.
  • (d) CATEGORIZING OF INFORMATION.
  • (1) COMMITTEE FUNCTIONS. Not later than 2 years
    after the date of enactment of this Act, the
    Committee shall submit recommendations to the
    Director on:
  • (A) the adoption of standards, which are open to
    the maximum extent feasible, to enable the
    organization and categorization of Government
    information:
  • (i) in a way that is searchable electronically,
    including by searchable identifiers; and
  • (ii) in ways that are interoperable across
    agencies;
  • (B) the definition of categories of Government
    information which should be classified under the
    standards; and
  • (C) determining priorities and developing
    schedules for the initial implementation of the
    standards by agencies.

Note: Received the 2002 CIO Council Showcase of
Excellence Special Innovation Award for XML Web
Services (VoiceXML and the FedGov Content
Network) in March 2002.
22
2. eGovernment Drivers
  • The eGov Act of 2002 (continued)
  • SEC. 212. INTEGRATED REPORTING STUDY AND PILOT
    PROJECTS.
  • (a) PURPOSES. The purposes of this section are
    to:
  • (1) enhance the interoperability of Federal
    information systems;
  • (2) assist the public, including the regulated
    community, in electronically submitting
    information to agencies under Federal
    requirements, by reducing the burden of duplicate
    collection and ensuring the accuracy of submitted
    information; and
  • (3) enable any person to integrate and obtain
    similar information held by 1 or more agencies
    under 1 or more Federal requirements without
    violating the privacy rights of an individual.

23
2. eGovernment Drivers
  • The eGov Act of 2002 (continued)
  • SEC. 212. INTEGRATED REPORTING STUDY AND PILOT
    PROJECTS.
  • (d) PILOT PROJECTS TO ENCOURAGE INTEGRATED
    COLLECTION AND MANAGEMENT OF DATA AND
    INTEROPERABILITY OF FEDERAL INFORMATION SYSTEMS.
  • (1) IN GENERAL. In order to provide input to the
    study under subsection (c), the Director shall
    designate, in consultation with agencies, a
    series of no more than 5 pilot projects that
    integrate data elements. The Director shall
    consult with agencies, the regulated community,
    public interest organizations, and the public on
    the implementation of the pilot projects.
  • (2) GOALS OF PILOT PROJECTS.
  • (A) IN GENERAL. Each goal described under
    subparagraph (B) shall be addressed by at least
    1 pilot project each.
  • (B) GOALS. The goals under this paragraph are to:
  • (i) reduce information collection burdens by
    eliminating duplicative data elements within 2 or
    more reporting requirements;
  • (ii) create interoperability between or among
    public databases managed by 2 or more agencies
    using technologies and techniques that facilitate
    public access; and
  • (iii) develop, or enable the development of,
    software to reduce errors in electronically
    submitted information.

24
2. eGovernment Drivers
  • The Federal Enterprise Architecture (FEA) Data
    and Information Reference Model (DRM)
  • Volume 1: Bob Haycock, OMB Chief Architect, will
    soon release it with guidance to the agencies.
  • The E-Government Act of 2002, Section 207,
    Interagency Committee on Government Information,
    will use the top two layers of the DRM structure
    for categorization of government information (see
    next slide).
  • The E-Government Act of 2002, Section 212, calls
    for a series of no more than 5 pilot projects that
    integrate data elements to encourage integrated
    collection and management of data and
    interoperability of Federal information systems.
  • Data Management Strategy: in process, with a
    draft to be released soon.
  • There are several critiques of ISO 11179 aimed at
    improving the DRM Model, including the suggested
    use of the Meta Object Facility (MOF) from the
    Object Management Group (OMG) by MetaMatrix (see
    slide 16).
  • Volumes 2-4: to be released by Fall 2004 (see
    slides 17-19).
  • DRM business context, DRM information exchange,
    and DRM data elements.

25
The Current DRM Model
  • A model for discovery of information
  • Context and classification.
  • To determine available packages and elements.
  • A model for exchange of information
  • Information packages, built from common data
    elements.
  • Sharing mechanism.
  • A model for representation of information
  • Data elements defined in standard way.

ISO 11179
26
Expanding the DRM Model
DRM Model
MetaMatrix Model
  • MetaMatrix vision
  • Generic classification to tag metadata with
    context
  • vs. 2-level context.
  • Packages built from complex datatypes and
    deployable for exchange or data access
  • vs. exchange-only packaging of ISO 11179 data
    elements.
  • Formal datatype model
  • vs. more conceptual ISO 11179 model.
  • Formal reference information to add semantic
    value to data definitions
  • vs. nothing.

[Figure: the DRM Model and the MetaMatrix Model
side by side. Layer headings: Business Context,
Classification, Business Data Flow, Package,
Instance, Data Element, Type, Reference. Box
labels: Subject Area, Context, Super Type,
Category, Virtual Database, Exchange Package,
Info Exch Package, Virtual, Transform, Physical,
Schema/Association, ISO 11179, Complex Datatype,
Data Object, Abstract Datatype, Data Property,
Simple Datatype, Data Representation, Glossary,
Thesaurus, Bibliography.]
27
2. eGovernment Drivers
28
2. eGovernment Drivers
29
2. eGovernment Drivers
30
2. eGovernment Drivers
  • The FEA DRM Data Management Strategy, Business
    Driver 4: Resolve Data Semantics Issues That
    Impede Community of Practice Work, Brand Niemann
    and Ken Gill
  • Introduction to Data Semantics.
  • Domain Data Harmonization Strategy.
  • Data Harmonization Guiding Principles (10).
  • Global Justice Information Sharing Initiative
    (Global) Example.
  • Increased Collaboration by Means of and with
    "Smart Data" (Daconta's Declaration of Data
    Independence).
  • Recommendations.

Note: See http://web-services.gov for details.
31
2. eGovernment Drivers
  • The FEA DRM has been and currently is the object
    of a series of pilot projects and collaborative
    work within the Communities of Practice
  • Open GIS Consortium (OGC)
  • Information Communities and Semantics WG (ICS
    WG)
  • http://www.opengis.org/groups/?iid=50
  • Sustainable Intergovernmental Network Exchange
    (Global-Justice, Environmental Information-EPA,
    and Health IT Sharing (Health)) (SINE)
  • Collaborative Work Environment
  • http://sine.cim3.net/
  • Intelligence Community Metadata Working Group (IC
    MWG)
  • http://www.xml.saic.com/icml/
  • CIO Council's (Best Practices Committee)
    Knowledge Management Working Group (KM.GOV)
  • Semantic (Web Services) Interoperability
    Community of Practice (SICoP)
  • See http://km.gov and http://web-services.gov/
32
2. eGovernment Drivers
  • The FEA DRM has been and currently is the object
    of a series of pilot projects and collaborative
    work within the Communities of Practice
    (continued)
  • E-Gov SmartServices
  • To join the group, send an email to
    eGov_SmartServices-subscribe@yahoogroups.com with
    empty Subject and Body. You will then receive an
    email with a web link where you can select the
    subscription option.
  • Open International Forum on Business Ontology
  • ONTOLOG - collaborative work environment
  • http://ontolog.cim3.net/ (April 22nd
    presentation)
  • Semantic Technologies for E-Government, September
    8, 2003, White House Conference Center
  • http://www.topquadrant.com/conferences/tq_proceedings.htm
  • 2nd Semantic Technologies for E-Government,
    September 8, 2004 (tentative).
  • University of Maryland MINDLab (Professor Jim
    Hendler) and TopQuadrant (Ralph Hodgson)
  • http://owl.mindswap.org/ and
    http://www.topquadrant.com/
  • TopMIND Tutorials with Government Data Examples,
    March 22-25, 2004
  • http://www.topquadrant.com/seminars/topmind.htm
33
3. Semantic Technologies for eGovernment
  • Web-Enabled Government 2004 Conference and
    Exhibition, Session 2-4, February 4th, 2004:
    "Understanding Semantic Web Technology" by
    Professor Jim Hendler and Brand Niemann
  • (1) Tree of Knowledge Technologies and The
    Semantic Technology Layer Cake
  • (2) Where We Are
  • (3) Emerging Vendors Landscape: Semantic
    Integration
  • (4) Semantic Technologies and Web Services
  • (5) The First Site on the Semantic Web
  • (6) Taxonomy
  • (7) Topic Maps
  • (8) RDF and Ontology Components
  • (9) RDF Syntax and Validator
  • (10) OWL Syntax and Functionality
  • (11) Some Educational Resources

Note: Based on TopMIND Tutorials, November 3-4
and December 3-4, 2003
34
3. Semantic Technologies for eGovernment
  • Jim Hendler is a Professor at the University of
    Maryland and the Director of Semantic Web and
    Agent Technology at the Maryland Information and
    Network Dynamics Laboratory. He holds joint
    appointments in the Department of Computer
    Science, the Institute for Advanced Computer
    Studies and the Institute for Systems Research,
    and he is also an affiliate of the Electrical
    Engineering Department. He has authored close to
    150 technical papers in the areas of artificial
    intelligence, robotics, agent-based computing and
    high performance processing.
  • Hendler was the recipient of a 1995 Fulbright
    Foundation Fellowship, is a member of the US Air
    Force Science Advisory Board, and is a Fellow of
    the American Association for Artificial
    Intelligence. As Chief Scientist and Program
    Manager at DARPA for the DAML program, he has
    been one of the major drivers in the creation of
    the Semantic Web, and continues to be a prominent
    player in the W3C's Semantic Web Activity.

35
(1) Tree of Knowledge Technologies
Content Management Languages
Semantic Technology Languages
Process Knowledge Languages
AI Knowledge Representation
Software Modeling Languages
36
(1) The Semantic Technology Layer Cake
Source: Dieter Fensel
37
(2) Where We Are
We Are Here
38
(3) Emerging Vendors Landscape: Semantic
Integration
[Figure: vendors positioned by Expressivity and
Semantic Power (XML, RDF, OWL) against Enterprise
Support (Data and Schema Management, Validation,
Run-time Engine, Integration and Orchestration).
Legend, Current Support / Primary Strength: S =
structured information, U = unstructured
information, SU = supports both. Vendors shown:
Ontoprise, Network Inference, enLeague, Unicorn,
Ontology Works, Miosoft, Celcorp, Modulant,
Contivo, Vitria, MetaMatrix, SchemaLogic, IGS.]
Source: Irene Polikoff, TopQuadrant, "Positioning
Semantic Technologies: The Emerging Vendor
Landscape," September 8, 2003.
39
(4) Semantic Technologies and Web Services
[Figure: two-by-two grid. Static Resources: WWW
(Interoperable Syntax) and Semantic Web
(Interoperable Semantics). Dynamic Resources: Web
Services (Interoperable Syntax) and Semantic Web
Services (Interoperable Semantics). Additional
label: Enterprise Ontology and Web Services
Registry.]
Source: Derived in part from two separate
presentations at the Web Services One Conference
2002 by Dieter Fensel and Dragan Sretenovic.
40
(5) The First Site on the Semantic Web
http://owl.mindswap.org
PhotoStuff Image Annotation Tool with OWL
41
(6) Taxonomy
Goals for enterprise taxonomies
Regardless of end goals, look to a future where
taxonomies interoperate (domains connect).
Expect new stakeholders to take an interest
but have their own viewpoints.
Technology Recommendation: RDF(S)
From Tim Berners-Lee, ISWC 2003
42
(6) What is a Taxonomy?
  • A taxonomy is a model of knowledge organized as a
    hierarchical arrangement (tree structure) of
    concepts
  • parent nodes denote more general ideas than their
    children.

[Figure: tree diagram with nodes labeled A and B.]
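
The definition above (a tree in which parent nodes denote more general ideas than their children) can be sketched as a small data structure. This is an illustrative sketch only; the class name Concept and the three sample concepts are hypothetical and do not come from the presentation.

```python
class Concept:
    """One node in a taxonomy tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # the more general concept, or None at the root
        self.children = []        # the more specific concepts

    def add_child(self, name):
        child = Concept(name, parent=self)
        self.children.append(child)
        return child

    def ancestors(self):
        """Names of progressively more general concepts, nearest first."""
        names = []
        node = self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names

# Hypothetical three-level taxonomy.
root = Concept("Government information")
environment = root.add_child("Environment")
air_quality = environment.add_child("Air quality")

print(air_quality.ancestors())
```

Walking the parent links from any node yields its broader concepts, which is the property that makes a taxonomy usable for categorization and roll-up.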
43
(6) Types of Taxonomy
  • A taxonomy can be:
  • A classification hierarchy, e.g., Natural Taxonomy
  • Unique Beginner (plant) -> Life-Form (bush) ->
    Generic (rose) -> Specific (hybrid tea) ->
    Varietal (Peace)
  • A part hierarchy (Meronomy)
  • A category hierarchy
  • Taxonomies can intersect; intersection means
    there are different relationships at work

Reference: D.A. Cruse, Lexical Semantics,
Cambridge University Press, 1986
44
(6) topSAIL/tdf Taxonomy Development
Framework: A five-step method for taxonomy
development
  • 1. Focus
  • What is the taxonomy for?
  • What business challenges will it overcome?
  • What results will it achieve?
  • How to measure stakeholder benefit?
  • 2. Analysis
  • What is the context for the taxonomy?
  • What are the types and sources of knowledge?
  • How does knowledge map to processes?
  • 3. Design
  • What types of taxonomy concepts are needed?
  • What to do first?
  • What system capabilities are needed?
  • What will be the impact?
  • Is the taxonomy design correct, complete and
    consistent?
  • 4. Construct
  • Have we enough content mapped?
  • How to connect taxonomies to content?
  • How to integrate with IT systems?
  • 5. Deploy
  • How do we ensure there will be feedback for
    assessment?
  • Have we accomplished set objectives?
  • What should be done next?

45
(7) Topic Maps
The TAO of Topic Maps
  • Topic
  • The entry in a topic map that refers to a subject
    in the real world.
  • Topic Maps make a Plato-distinction between
    Things in the Real World (Subjects) and Things in
    the Topic Map world (Topics).
  • Association
  • Linkages between Topics.
  • Example: Tosca was written by Puccini.
  • Occurrence
  • Topics occur in resources.
  • Resources are indicated by, e.g., URLs.
  • Types of Occurrence: mention, illustration,
    article, etc.

Note: See http://www.giuseppeverdi.it/verdi. Also
see http://www.coolheads.com/egov for merging of
topic maps.
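
The three TAO parts (Topic, Association, Occurrence) can be sketched with plain data classes. This is a simplified illustration, not an implementation of the Topic Maps standard; the class layout, the helper associated_with, and the example.org URL are assumptions made for the example. Only the Tosca and Puccini association comes from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Occurrence:
    kind: str       # type of occurrence, e.g. "mention", "illustration", "article"
    resource: str   # where the topic occurs, e.g. a URL

@dataclass
class Topic:
    name: str
    occurrences: List[Occurrence] = field(default_factory=list)

@dataclass
class Association:
    kind: str                      # the nature of the linkage
    members: Tuple[Topic, Topic]   # the two linked topics

def associated_with(topic, associations):
    """Return (association kind, other topic name) pairs for one topic."""
    pairs = []
    for association in associations:
        first, second = association.members
        if first is topic:
            pairs.append((association.kind, second.name))
        elif second is topic:
            pairs.append((association.kind, first.name))
    return pairs

tosca = Topic("Tosca")
puccini = Topic("Puccini")
associations = [Association("written by", (tosca, puccini))]

# Hypothetical resource; example.org is a placeholder address.
tosca.occurrences.append(Occurrence("article", "http://www.example.org/tosca"))

print(associated_with(tosca, associations))
```

Keeping occurrences on the topic and associations in a separate list mirrors the split between what a topic is linked to and where it is mentioned.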
46
(8) RDF and Ontology Components
[Figure, two panels. "RDF Triple Components": a
triple consists of a Subject (a Resource,
identified by a URI), a Predicate, and an Object
(a Resource or a Literal), illustrated with the
sentence "The company sells batteries." and
captioned "Resource Description Framework". "Key
Ontology Components": classes such as Person
(birthdate: date, Gender: char), Leader,
Organization, and Image, linked by a Property or
Association such as knows, works for, leads,
depiction, published, and is-A.]
Source: The Semantic Web: A Guide to the Future
of XML, Web Services, and Knowledge Management,
Wiley Technology Publishing, June 2003.
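
The subject, predicate, object structure of an RDF triple can be sketched with plain tuples. This is a minimal illustration rather than an RDF library; the namespace EX, the property names, and the literal "Example Battery Co." are hypothetical and chosen only to echo the slide's sentence "The company sells batteries."

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str                     # a resource, identified by a URI
    predicate: str                   # a property, also identified by a URI
    obj: str                         # a resource URI or a literal value
    object_is_literal: bool = False

EX = "http://www.example.org/"       # hypothetical namespace

graph = [
    Triple(EX + "company", EX + "sells", EX + "batteries"),
    Triple(EX + "company", EX + "name", "Example Battery Co.", True),
]

def objects_of(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]

print(objects_of(graph, EX + "company", EX + "sells"))
```

The first triple links two resources, while the second attaches a literal to a resource, which are the two object forms named in the figure.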
47
(9) RDF Syntax and Validator
Graph of the Data Model
Syntax
    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns="http://www.example.org">
      <rdf:Description rdf:ID="Jen">
        <rdf:type>
          <rdf:Description rdf:about="#Person"/>
        </rdf:type>
        <hasName>Jen Golbeck</hasName>
        <hasJob>
          <rdf:Description rdf:about="#Job1">
            <employer>George Washington University</employer>
            <position>Adjunct Professor</position>
            <hired>July 2001</hired>
            <salary>1</salary>
            <hoursPerWeek>15</hoursPerWeek>
          </rdf:Description>
        </hasJob>
    ..

http://www.w3.org/RDF/Validator
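
To show how the nested syntax on this slide maps to the graph of the data model, here is a minimal sketch that reads a shortened, completed version of the fragment and lists its statements. The closing tags and the reduced property set are assumptions, since the slide elides the end of the document, and the reader below handles only this nested rdf:Description pattern. It is not a general RDF/XML parser and does not replace the W3C validator.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened version of the slide's fragment, closed so that it parses.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.example.org">
  <rdf:Description rdf:ID="Jen">
    <rdf:type>
      <rdf:Description rdf:about="#Person"/>
    </rdf:type>
    <hasName>Jen Golbeck</hasName>
    <hasJob>
      <rdf:Description rdf:about="#Job1">
        <employer>George Washington University</employer>
        <position>Adjunct Professor</position>
      </rdf:Description>
    </hasJob>
  </rdf:Description>
</rdf:RDF>
"""

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def node_id(description):
    """Read rdf:ID or rdf:about and drop any leading '#'."""
    identifier = description.get("{%s}ID" % RDF_NS)
    if identifier is None:
        identifier = description.get("{%s}about" % RDF_NS, "")
    return identifier.lstrip("#")

def triples(description):
    """Yield (subject, predicate, object) for one rdf:Description node."""
    subject = node_id(description)
    for prop in description:
        nested = prop.find("{%s}Description" % RDF_NS)
        if nested is None:
            yield (subject, local_name(prop.tag), (prop.text or "").strip())
        else:
            yield (subject, local_name(prop.tag), node_id(nested))
            yield from triples(nested)

root = ET.fromstring(RDF_XML)
statements = []
for description in root.findall("{%s}Description" % RDF_NS):
    statements.extend(triples(description))

for statement in statements:
    print(statement)
```

Each nested rdf:Description becomes the object of the enclosing property and the subject of its own properties, which is how one XML tree flattens into several graph edges.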
48
(10) OWL Syntax and Functionality
    <?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
        xmlns:owl="http://www.w3.org/2002/07/owl#"
        xmlns="http://www.example.org"
        xml:base="http://www.example.org/">
      <owl:Class rdf:ID="Person"/>
      <owl:Class rdf:ID="Employee">
        <rdfs:subClassOf rdf:resource="#Person"/>
        <rdfs:label>Our Cool Employee Class</rdfs:label>
      </owl:Class>
      <owl:Class rdf:ID="Civil_Servant">
        <rdfs:subClassOf rdf:resource="#Employee"/>
        <rdfs:label>Our Cool Civil Servant Class</rdfs:label>
      </owl:Class>
    </rdf:RDF>
  • Applications for OWL
  • Markup for web pages and other web-based media.
  • Raw Data Sharing.
  • Web Services.
  • Media Markup
  • Google and other keyword searches are excellent
    because they can work with text.
  • Not likely to be much improved by semantic web.
  • Image searches are much worse than text searches.
  • No way to know what is happening in an image,
    what is in it, in what context it was taken, or
    who is doing what.
  • MP3 searches.
  • "I want that song that was in the Mitsubishi
    commercial."
  • Video search.
  • Challenges
  • Trust and Provenance.
  • Visualization.
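
The class declarations at the top of this slide chain two rdfs:subClassOf statements (Civil_Servant under Employee, Employee under Person). Because rdfs:subClassOf is transitive, a reasoner can conclude that every Civil_Servant is also a Person. The sketch below shows that inference on a plain dictionary; it assumes each class has at most one declared superclass, which holds for the slide's example but not for OWL in general.

```python
# Declared rdfs:subClassOf statements from the slide: subclass -> superclass.
SUBCLASS_OF = {
    "Employee": "Person",
    "Civil_Servant": "Employee",
}

def superclasses(cls, subclass_of):
    """Return all direct and inferred superclasses, nearest first."""
    result = []
    seen = set()
    while cls in subclass_of and cls not in seen:
        seen.add(cls)             # guard against accidental cycles
        cls = subclass_of[cls]
        result.append(cls)
    return result

def is_subclass(cls, candidate, subclass_of):
    """True if cls is a (possibly indirect) subclass of candidate."""
    return candidate in superclasses(cls, subclass_of)

print(superclasses("Civil_Servant", SUBCLASS_OF))
print(is_subclass("Civil_Servant", "Person", SUBCLASS_OF))
```

The second call succeeds even though no statement links Civil_Servant to Person directly, which is the kind of added functionality OWL and RDF Schema bring beyond plain XML markup.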

49
(11) Some Educational Resources
  • Dieter Fensel, Ontologies: A Silver Bullet for
    Knowledge Management and Electronic Commerce,
    Springer-Verlag, 2001
  • Johan Hjelm, Creating the Semantic Web with
    RDF, John Wiley, 2001
  • John Davies, Dieter Fensel, and Frank van
    Harmelen, Towards the Semantic Web:
    Ontology-Driven Knowledge Management, John
    Wiley, 2002
  • Dieter Fensel, Wolfgang Wahlster, Henry
    Lieberman, and James Hendler (Eds.), Spinning the
    Semantic Web: Bringing the World Wide Web to Its
    Full Potential, MIT Press, 2002
  • Michael C. Daconta, Leo J. Obrst, and Kevin T.
    Smith, The Semantic Web: A Guide to the Future of
    XML, Web Services, and Knowledge Management, John
    Wiley, 2003
  • Vladimir Geroimenko and Chaomei Chen (Eds.),
    Visualizing the Semantic Web, Springer-Verlag,
    2003
  • M. Klein and B. Omelayenko (Eds.), Knowledge
    Transformation for the Semantic Web, Vol. 95,
    Frontiers in Artificial Intelligence and
    Applications, IOS Press, 2003
  • Shelley Powers, Practical RDF, O'Reilly, 2003
50
4. Repurposing the Statistical Abstract of the
United States, 2003, Into a DRM Registry and
Repository
  • Overview
  • Steps in Repurposing the Data Tables
  • (1) Table in Adobe Reader 6.0.
  • (2) Define Basic XML Tags in XMLSPY 2004.
  • (3) Define XML Tags for Data Element Names in
    XMLSPY 2004.
  • (4) Markup the Table in XMLSPY 2004.
  • (5) Grid View in XMLSPY 2004.
  • (6) XML Table Database in Excel 2002.
  • (7) Create the HTML Interface.
  • (8) HTML Interface in Browser.
  • (9) XML Table Database in Browser.
  • Some Features of the DRM Registry and Repository
  • Note that it is embedded in the document itself,
    not separate!

51
4. Repurposing the Statistical Abstract of the
United States, 2003, Into a DRM Registry and
Repository
  • Overview
  • The methodology for repurposing the Statistical
    Abstract, 2003, documents (45 PDF files/14.2 MB)
    into a structured XML content collection was
    presented previously
  • See Past Meetings and Presentations at
    http://web-services.gov, November 18-19, 2003,
    Website Content Management for Government
    Conference, Invited Presentation on November 19th
    on "Repurposing Documents Into Semantic Web
    Services and Networks" (EPA Enterprise
    Integration Portal/Data Exchange Network Pilot),
    Doubletree Hotel, Arlington, VA. Also see
    Folio-to-XML Conversion and Webinar.
  • Current plans call for the completion of the
    repurposing of this document and continued work
    on state of the environment and national and
    community indicator reports.

52
Step 1. Table in Adobe Reader 6.0
Text Select Tool: Highlight Table, Edit > Copy,
Edit > Paste to XMLSPY 2004
53
Step 2. Define Basic XML Tags in XMLSPY 2004
  • <TableTitle>
  • <TableHeadNote>
  • <TableBody>
  • <TableFootnote>
  • <TableSource>
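As a rough illustration of Step 2, the five basic tags can be assembled into an empty table skeleton. This is a minimal sketch in Python rather than XMLSPY; the wrapping `Table` root element and the function name `make_table_skeleton` are assumptions for illustration and do not appear in the presentation.

```python
import xml.etree.ElementTree as ET

# The five basic tags named on the slide, in slide order.
BASIC_TAGS = [
    "TableTitle",
    "TableHeadNote",
    "TableBody",
    "TableFootnote",
    "TableSource",
]


def make_table_skeleton():
    """Return a root element holding one empty child per basic tag."""
    table = ET.Element("Table")  # hypothetical root name, not from the slides
    for tag in BASIC_TAGS:
        ET.SubElement(table, tag)
    return table


skeleton = make_table_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

Each copied table would then be pasted into the matching element, with the data rows going under `TableBody`.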

54
Step 3. Define XML Tags for Data Element Names in
XMLSPY 2004
  • Census Date (Year, Month Day)
  • Resident Population (Number)
  • Resident Population (Number Per Square Mile of
    Land Area)
  • Resident Population Increase Over Preceding
    Census (Number)
  • Resident Population Increase Over Preceding
    Census (Percent)
  • Area (Square Miles) Total
  • Area (Square Miles) Land
  • Area (Square Miles) Water
  • CensusDateYearMonthDay
  • ResidentPopulationNumber
  • ResidentPopulationPerSquareMileofLandArea
  • ResidentPopulationIncreaseOverPrecedingCensusNumber
  • ResidentPopulationIncreaseOverPrecedingCensusPercent
  • AreaSquareMilesTotal
  • AreaSquareMilesLand
  • AreaSquareMilesWater

The heart of the DRM Registry and Repository
for reuse!
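The mapping from data element names to XML tag names on this slide mostly follows one rule: drop spaces, parentheses, and commas, and keep the capitalization. A minimal sketch of that rule in Python follows; the function name `to_tag_name` is hypothetical, and the slide's names appear to have been written by hand, so this is not the author's procedure.

```python
import re


def to_tag_name(data_element_name: str) -> str:
    """Remove every character that is not a letter or digit, keeping case."""
    return re.sub(r"[^A-Za-z0-9]", "", data_element_name)


# Data element names taken from the slide.
examples = [
    "Census Date (Year, Month Day)",
    "Resident Population (Number)",
    "Area (Square Miles) Total",
]

for name in examples:
    print(f"{name} -> {to_tag_name(name)}")
```

The rule does not reproduce every tag on the slide. For "Resident Population (Number Per Square Mile of Land Area)" the slide's tag omits "Number", so a hand-reviewed mapping is still needed for harmonized names.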
55
Step 4. Mark Up the Table in XMLSPY 2004
Text View in XMLSPY 2004
56
Step 4. Mark Up the Table in XMLSPY 2004
(continued)
Text View in XMLSPY 2004
57
Step 5. Grid View in XMLSPY 2004 (like a
spreadsheet!)
Highlight Grid Table, Edit > Copy as Structured
Text, and Paste to Excel.
58
Step 6. XML Table Database in Excel 2002
Highlight Table, Format > Column > AutoFit
Selection. Spreadsheet-like data tables can also
be pasted into XMLSPY 2004.
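Steps 5 and 6 move the marked-up table from XML into a spreadsheet by copy and paste. The same transfer can be sketched in code as a conversion from repeated row elements to tab-delimited text, which Excel can open. This is a minimal sketch under stated assumptions: the `Row` wrapper element and the cell values are placeholders invented for illustration, not the actual structure or data of the Statistical Abstract file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample. <Row> and the values are placeholders only.
SAMPLE_XML = """
<TableBody>
  <Row>
    <CensusDateYearMonthDay>Example date 1</CensusDateYearMonthDay>
    <ResidentPopulationNumber>100</ResidentPopulationNumber>
  </Row>
  <Row>
    <CensusDateYearMonthDay>Example date 2</CensusDateYearMonthDay>
    <ResidentPopulationNumber>200</ResidentPopulationNumber>
  </Row>
</TableBody>
"""


def rows_to_tsv(xml_text: str) -> str:
    """Flatten <Row> elements into tab-delimited text with a header line."""
    body = ET.fromstring(xml_text.strip())
    rows = body.findall("Row")

    # Column order follows the first appearance of each tag name.
    columns = []
    for row in rows:
        for cell in row:
            if cell.tag not in columns:
                columns.append(cell.tag)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row.findtext(col, default="") for col in columns])
    return out.getvalue()


tsv = rows_to_tsv(SAMPLE_XML)
print(tsv)
```

Because the header line is built from the XML tag names, the harmonized names from Step 3 carry over into the spreadsheet unchanged.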
59
Step 7. Create the HTML Interface
Navigation Functionality (non-XML)
Note two references to statabs2003no1.xml.
60
Step 7. Create the HTML Interface (continued)
Data Element Names
XML Tag Names
Note this makes the XML table database
independent of the HTML presentation.
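The separation noted here, with XML tag names in the data and data element names in the display, can be sketched as a small rendering function that looks up a human-readable label for each tag. This is an illustrative Python sketch, not the HTML interface shown on the slide; the `Row` element, the sample value, and the names `LABELS` and `render_html_table` are assumptions.

```python
import html
import xml.etree.ElementTree as ET

# XML tag name -> data element name shown to the reader (both from Step 3).
LABELS = {
    "CensusDateYearMonthDay": "Census Date (Year, Month Day)",
    "ResidentPopulationNumber": "Resident Population (Number)",
}

# Hypothetical sample. <Row> and the values are placeholders only.
SAMPLE_XML = (
    "<TableBody><Row>"
    "<CensusDateYearMonthDay>Example date</CensusDateYearMonthDay>"
    "<ResidentPopulationNumber>100</ResidentPopulationNumber>"
    "</Row></TableBody>"
)


def render_html_table(xml_text: str, labels: dict) -> str:
    """Render rows as an HTML table whose headers use the display labels."""
    body = ET.fromstring(xml_text)
    columns = list(labels)  # dicts keep insertion order

    header = "".join(f"<th>{html.escape(labels[tag])}</th>" for tag in columns)
    lines = ["<table>", f"  <tr>{header}</tr>"]
    for row in body.findall("Row"):
        values = [row.findtext(tag, default="") for tag in columns]
        cells = "".join(f"<td>{html.escape(value)}</td>" for value in values)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


page = render_html_table(SAMPLE_XML, LABELS)
print(page)
```

Changing a display label only touches `LABELS`; the XML file and its tag names stay the same, which is the independence the slide points out.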
61
Step 8. HTML Interface in Browser
Link to XML File
Navigation Buttons
Can easily browse through long tables.
62
Step 9. XML Table Database in Browser
Can expand and collapse using + and -.
The heart of the DRM Registry and Repository
for interoperable exchange.
63
Some Features of the DRM Registry and Repository
Taxonomy of Federal Statistical Data and
Information!
64
Some Features of the DRM Registry and Repository
Detailed Table of Contents for Entire Document.
65
Some Features of the DRM Registry and Repository
Detailed Table of Contents for Each Section.
66
Some Features of the DRM Registry and Repository
Graphics can have RDF metadata.
67
Some Features of the DRM Registry and Repository
Tables are structured data (copy to Excel) and
available in XML
68
Some Features of the DRM Registry and Repository
Table copied to Excel from Browser
69
Some Features of the DRM Registry and Repository
Search within just one chapter of the entire
document.
70
Some Features of the DRM Registry and Repository
Better search than from conventional Internet
search engines.
71
Some Features of the DRM Registry and Repository
Appendix III on Limitations of the Data (Data
Quality) for Major Databases!
72
Some Features of the DRM Registry and Repository
Harmonization/Standardization of Data Element and
XML Tag Names
73
5. Additional Pilots
  • Where does the FEA go next?, Bob Haycock, Chief
    Architect, OMB, at the Chief Architects Forum,
    April 5, 2004
  • Complete the DRM.
  • Conduct DRM Community of Practice Pilots.
  • Continue to develop and implement further DRM
    volumes and FEA Data Management Strategy.
  • Etc.

74
5. Additional Pilots
  • Census Bureau/FedStats (Statistical Abstract of
    the US)
  • Lead original Line of Business (Data and
    Statistics), which was exempted, so it became a
    logical selection for a best practice pilot!
  • National Indicator System and the Community
    Statistical System
  • GAO, CEQ, Community Indicator Consortium, etc.
  • Sustainable Intergovernmental Network Exchange
    (SINE)
  • Global Justice, EPA, Health, etc.
  • Intelligence Community Metadata Working Group (IC
    MWG)
  • XML Enablement Strategy and Tool Evaluation.
  • Componenttechnology.Org
  • Proposals from participants in this Community of
    Practice to Populate the Service Grid with
    Services Components.
  • Categorization of Government Information Working
    Group of the Interagency Committee on Government
    Information
  • GSA Office of Intergovernmental Solutions (Susan
    Turnbull) Outreach to Involve State and Local
    Governments.
  • University of Maryland MINDLab (Professor Jim
    Hendler) and TopQuadrant (Ralph Hodgson)
  • Semantic Markup and Tools for Government Content
    (getting content ready for them!).