Web Content Authoring and Management Tools with XML

1
Web Content Authoring and Management Tools with
XML
  • FedWeb 2002 Tutorial, May 20, 2002
  • Brand Niemann and the Team
  • US Environmental Protection Agency
  • Office of Environmental Information

2
Overview
  • Preface
  • This tutorial is being videotaped by Susan
    Turnbull (GSA) for use in the Instant Access
    software at the Universal Access Collaboration
    Workshop on July 16th.
  • Part 1 (9 a.m.-12 noon)
  • XML (eXtensible Markup Language): 9:00-10:15.
  • VoiceXML (XML for the telephone): 10:30-11:15.
  • GML (XML for geospatial databases): 11:15-11:45.
  • Questions and Answers: 11:45-12:00.
  • Part 2 (1-4 p.m.)
  • Web Content Management Strategies, Tools and Best
    Practices (Howard McQueen).

3
Part 1: The Team
  • XML
  • Brand L. Niemann (Sr.), US EPA.
  • Brand K. Niemann (Jr.), Tax Analysts.
  • VoiceXML
  • Art Clarke, Tellme Networks.
  • Janina Sajka and Katie Haritos-Shea, American
    Foundation for the Blind.
  • Simon Chung and Craig Brown, NextPage.
  • GML
  • Chris Tucker, Ionic Enterprise.
  • Videotape and Indexing
  • Susan Turnbull, GSA.
  • Antoinette Purdon, Instant Index.

4
Part 1: Training Materials
  • XML
  • This presentation and others at
    http://130.11.44.140.
  • XML Web Services: EPA-State Content Network
  • Web Publishing on DVD: Repurposing Federal Data
    in XML.
  • VoiceXML
  • XML Web Services: VoiceXML and Phone Directories
    and others by the Team members.
  • GML
  • GML and Open Web Services and others at
    http://www.ionicenterprise.com/.

5
Contents
  • 1. Background
  • 2. Creating Even Better Content from Good
    Content
  • 3. Managing Content as Collections and a Network
  • 4. Web Content Management Tools
  • 5. Contact Information

6
1. Background
  • 1.1 What have I done with XML?
  • 1.2 Why XML?
  • 1.3 What is XML?
  • 1.3.1 General.
  • 1.3.2 Parts of a Well-Formed XML Document.
  • 1.3.3 Supports Data Binding to HTML.
  • 1.3.4 Separates Content from Presentation.
  • 1.3.5 Supports Multi-Channel Dissemination.
  • 1.3.6 Supports Web Content Management and XML Web
    Services.
  • 1.3.7 Exchange Nodes and Content Networks.
  • 1.3.8 XML Training Resources.

7
1.1 What have I done with XML?
  • Federal CIO Working Group on XML, January 2001
  • XML Project Centralizes Agency Stats, Federal
    Computer Week, January 8, 2001.
  • FedWeb, March 2001
  • The Future of Portals: Case Study of FedStats.Net
    as a Model for Collaboration and Data
    Integration.
  • Portals: How e-Government is Transforming
    Communication with Citizens: From Portals to Peer
    Space.
  • GSA Office of Intergovernmental Solutions
    Newsletter, XML Applications in Government,
    February 2002
  • Building Peer-to-Peer XML Content Networks of Web
    Services for Federal Scientific and Statistical
    Data and Information: FedStats.Net and Beyond.
  • GAO Report to Congress on XML, April 2002
  • Challenges to Effective Adoption of the
    Extensible Markup Language (contributor).

8
1.2 Why XML?
  • The eXtensible Markup Language became a World
    Wide Web Consortium (W3C) standard in 1998 as the
    universal format for structured documents and
    data on the Web (http://www.w3.org/XML/).
  • The CIO Council created the XML Working Group in
    2000 to facilitate the efficient and effective
    use of XML through cooperative efforts among
    government agencies, including partnerships with
    commercial and industrial organizations
    (http://xml.gov/).
  • GAO report to Congress urges government to adopt
    XML (http://www.gao.gov/new.items/d02327.pdf).

9
1.3.1 What is XML? General
  • XML is a standard for preserving and
    communicating information (encoding, tagging,
    and internationalizing) that will be everywhere.
  • Web Services provide communication between
    applications running on different Web servers
    that will bring the Internet to its new level.
  • XML Web Services are applications running on
    different devices that communicate XML data using
    XML messages.
  • Web Services can and should be interoperable
    across multiple vendor tools and platforms in the
    enterprise (see
    http://www.ws-i.org/Community.aspx).

10
1.3.2 What is XML? Parts of a Well-Formed XML
Document
  • XML Declaration
  • Comment
  • White Space
  • href="Inventory01.css"?> (Processing Instruction)
  • End of Prolog
  • White Space
  • The Adventures of Huckleberry Finn
  • Mark Twain
  • mass market paperback
  • 298
  • <PRICE>$5.49</PRICE>
  • </BOOK> (Document Element, or Root Element)
  • <BOOK>
  • The Turn of the Screw
  • Henry James
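
The parts listed on this slide can be illustrated with a small, hedged reconstruction. Only the stylesheet reference, the book details, and the BOOK and PRICE tags survive in the slide text; the INVENTORY, TITLE, AUTHOR, BINDING, and PAGES names, the comment wording, and the type attribute are assumptions made for illustration. The sketch uses Python's standard xml.etree.ElementTree parser, which rejects any document that is not well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the slide's document. BOOK, PRICE, the book
# details, and href="Inventory01.css" come from the slide; every other tag
# name, the comment text, and type="text/css" are assumptions.
DOCUMENT = """<?xml version="1.0"?>
<!-- Comment in the prolog (assumed wording) -->

<?xml-stylesheet type="text/css" href="Inventory01.css"?>

<INVENTORY>
  <BOOK>
    <TITLE>The Adventures of Huckleberry Finn</TITLE>
    <AUTHOR>Mark Twain</AUTHOR>
    <BINDING>mass market paperback</BINDING>
    <PAGES>298</PAGES>
    <PRICE>$5.49</PRICE>
  </BOOK>
  <BOOK>
    <TITLE>The Turn of the Screw</TITLE>
    <AUTHOR>Henry James</AUTHOR>
  </BOOK>
</INVENTORY>
"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The document element (root element) contains every other element.
root = ET.fromstring(DOCUMENT)
titles = [book.findtext("TITLE") for book in root.iter("BOOK")]

# Dropping the root element's end tag breaks well-formedness.
broken = DOCUMENT.replace("</INVENTORY>", "")
print(root.tag, titles, is_well_formed(DOCUMENT), is_well_formed(broken))
```

Everything before the root element's start tag (declaration, comment, white space, processing instruction) is the prolog; the parser skips it and returns the document element.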

11
1.3.3 What is XML? Supports Data Binding to HTML
Link an XML document to an HTML page and then
bind standard HTML elements to individual XML
elements, which saves time and money when delivering
small Web databases. The XML file has many other
uses (e.g., Section 508 accessibility, roundtrip
to Excel, etc.) and future-proofs your data
against periodic technology changes.
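
The binding idea can be sketched outside the browser. The slide refers to binding HTML elements to XML elements inside the page itself; the snippet below is a swapped-in, simplified technique that produces the same element-to-cell mapping by generating an HTML table from the XML with Python's standard library. The function name, the INVENTORY and TITLE tag names, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical small "Web database" kept as XML.
XML_SOURCE = """<INVENTORY>
  <BOOK><TITLE>The Adventures of Huckleberry Finn</TITLE><PRICE>$5.49</PRICE></BOOK>
  <BOOK><TITLE>The Turn of the Screw</TITLE></BOOK>
</INVENTORY>"""


def bind_to_table(xml_text, record_tag, fields):
    """Map each record element to a table row and each field to a cell."""
    root = ET.fromstring(xml_text)
    header = "".join(f"<th>{escape(name)}</th>" for name in fields)
    rows = [f"<tr>{header}</tr>"]
    for record in root.iter(record_tag):
        # A missing field becomes an empty cell rather than an error.
        cells = "".join(
            f"<td>{escape(record.findtext(name, default=''))}</td>"
            for name in fields
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


table_html = bind_to_table(XML_SOURCE, "BOOK", ["TITLE", "PRICE"])
print(table_html)
```

Because the data stays in the XML file, the same source can feed other outputs without touching the HTML template.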
12
1.3.4 What is XML? Separates Content from
Presentation
[Figure: two-by-two matrix of Content (centralized
vs. distributed) against Presentation (centralized
vs. distributed). Cell labels: Traditional;
Personalization, Customer Relationship Management;
Content Network, Integrated Portal; Content
Network, Uber Portal.]
13
1.3.5 What is XML? Supports Multi-Channel
Dissemination
  • Web
  • Print
  • CD/DVD
  • XML Web Service
  • Telephone
  • Digital Talking Books
  • Other

14
1.3.6 What is XML? Supports Web Content
Management and XML Web Services
  • XML indexing of PDF document collections.
  • Re-purposing PDF and Web documents to XML content
    collections.
  • Extracting and creating XML data tables from PDF
    and other Web documents.
  • Converting relational databases to XML and XML
    Web Services.
  • Delivering selected content to other channels
    like the telephone.
  • Converting spatial data to GML (Geography Markup
    Language) and integrating it with non-spatial XML
    content.
  • Centralized and distributed.
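
One of the conversions listed above, relational database to XML, can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in the tutorial: it uses an in-memory SQLite database, and the table name, column names, sample rows, and the table_to_xml helper are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table, root_tag="table", row_tag="row"):
    """Serialize every row of a table as one XML element per column."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag, {"name": table})
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for column, value in zip(columns, record):
            ET.SubElement(row, column).text = "" if value is None else str(value)
    return root


# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (site_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sites VALUES (?, ?)",
    [(1, "Example Site A"), (2, "Example Site B")],
)

xml_text = ET.tostring(table_to_xml(conn, "sites"), encoding="unicode")
print(xml_text)
```

Column names become element names here, which only works when they are valid XML names; a production converter would need a mapping step for names that are not.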

15
1.3.7 What is XML? Exchange Nodes and Content
Networks (http://www.epa.gov/neengprg/)
16
1.3.7 What is XML? Exchange Nodes and Content
Networks (http://fedgov.nextpage.com)
  • NXT 3 Interface: Search, Personalization,
    Document Management, Metadata, etc.
  • Content Network: Hierarchical Folders. Each can
    be a portal on a different Web server!
  • Portal(s) and Portlets
17
1.3.8 What is XML? XML Training Resources
  • Video
  • E.g., Introduction to XML Video (see next slide).
  • Commercial
  • E.g., Microsoft Visual Studio .NET, etc.
  • Online (free and cost)
  • E.g., xml.gov, xml.org, and xml.com
  • Develop in-house capability
  • E.g., EPA: http://130.11.44.140.

18
1.3.8 What is XML? Introduction to XML Video
  • Chapter 1: XML in Business (20 minutes)
  • Chapter 2: History of XML (27 minutes)
  • Chapter 3: Theory of Markup (7 minutes)
  • Chapter 4: Introduction to XML Syntax (14
    minutes)
  • Chapter 5: XML in the Real World (6 minutes)
  • Chapter 6: Information Stewardship (4 minutes)
  • More Information (1 minute)
  • Purchase: http://www.synthbank.com/xmlvideo.htm

19
1.3.8 What is XML? Key questions answered by
video
  • What is XML?
  • Who developed XML?
  • How is XML different from HTML?
  • Why is XML important to my business?
  • Can I begin to use XML today?
  • What tools and companies support XML?

20
2. Creating Even Better Content from Good Content
  • 2.1 CIA
  • World Fact Book Country Profiles.
  • Repurposing HTML to XML and creating new content
    presentations (XML HitList Table).
  • 2.2 Census Bureau
  • 2.2.1 Statistical Abstract of the United States.
  • Repurposing PDF and Excel to XML.
  • 2.2.2 USA County Spatial Statistics
  • Content Adapter for Relational Database to XML
    and Custom Search Form in XML.
  • 2.3 Comparison of Navigation and Searching.
  • 2.3.1 Census Bureau.
  • 2.3.2 NXT 3 Content Collection.

21
2.1 World Fact Book Country Profiles
22
2.1 World Fact Book Country Profiles
23
2.1 World Fact Book Country Profiles
24
2.1 World Fact Book Country Profiles
25
2.2.1 Statistical Abstract of the United States
26
2.2.1 Statistical Abstract of the United States
27
2.2.2 USA County Spatial Statistics
28
2.2.2 USA County Spatial Statistics
29
2.3 Comparison of Navigation and Searching
  • 2.3.1 Census Bureau
  • 2.3.1.1 42 Separate PDF files.
  • 2.3.1.2 No Search.
  • 2.3.1.3 1500 Excel Tables on a Separate CD-ROM.
  • 2.3.2 NXT 3 Content Collection
  • 2.3.2.1 Hierarchical Structure.
  • 2.3.2.2 Integration of Text and Tables.
  • 2.3.2.3 Excel Table Within the Document.
  • 2.3.2.4 Custom Search Form.
  • 2.3.2.5 Relevance Ranked Hit List.
  • 2.3.2.6 Hits Highlighted in Document and Tables.
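
The last two features in the list, a relevance-ranked hit list and highlighted hits, can be sketched in a few lines. This is not how NXT 3 ranks results, which the slides do not describe; it is a simple term-frequency stand-in, and the function names and sample documents are hypothetical.

```python
import re


def rank_hits(documents, query):
    """Return (name, score) pairs sorted by descending term frequency."""
    terms = [term.lower() for term in query.split()]
    hits = []
    for name, text in documents.items():
        words = re.findall(r"\w+", text.lower())
        score = sum(words.count(term) for term in terms)
        if score:
            hits.append((name, score))
    # Highest score first; ties are broken alphabetically by name.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


def highlight(text, query, mark="**"):
    """Wrap each whole-word occurrence of a query term in markers."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: f"{mark}{match.group(0)}{mark}", text)


# Hypothetical mix of text and table content in one collection.
documents = {
    "table-1": "Population by state and county",
    "chapter-2": "State population estimates; population density by state",
    "chapter-3": "Energy production",
}

hit_list = rank_hits(documents, "state population")
marked = highlight(documents["table-1"], "state population")
print(hit_list, marked)
```

Treating table text and body text as one searchable pool is what lets a single query return hits from both, as the slide's comparison emphasizes.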

30
2.3.1.1 42 Separate PDF files
31
2.3.1.2 No Search
32
2.3.1.3 1500 Excel Tables on a Separate CD-ROM
33
2.3.2.1 Hierarchical Structure
34
2.3.2.2 Integration of Text and Tables
35
2.3.2.3 Excel Table Within the Document
36
2.3.2.4 Custom Search Form
37
2.3.2.5 Relevance Ranked Hit List
38
2.3.2.6 Hits Highlighted in Document and Tables
39
2.3.2.6 Hits Highlighted in Document and Tables
40
3. Managing Content as Collections and a Network
  • 3.1 Definitions.
  • 3.2 Concepts
  • 3.2.1 The Uberportal and the NXT 3 Interface.
  • 3.2.2 Integration of Portals.
  • 3.3 Examples
  • 3.3.1 Housing and Urban Development.
  • 3.3.2 US Geological Survey.
  • 3.3.3 Environmental Protection Agency.
  • 3.4 FirstGov Content Management Survey
  • 3.4.1 General Questions (XML 1 of 12)
  • 3.4.2 Author Questions (XML 0 of 20)
  • 3.4.3 Advanced Questions (XML 4 of 22)
  • Uberportal: a portal that sits on top of the
    portals (The Gartner Group, Emerging Internet
    Technologies, Local Briefing, June 27, 2001,
    page 19).

41
3.1 Definitions
  • A content collection is a collection of one or
    more documents. Content collections may contain
    any type of content (documents, databases,
    applications, and other digital media), as well
    as considerable internal folder structure.
  • A site contains one or more content collections
    organized into hierarchies. When users access a
    site, they are presented with a site table of
    contents that allows them to browse the content
    collections stored on the site. They may also
    search across all content collections or a subset
    of the content collections. The result of a
    search is a list of matches that are linked to
    corresponding documents.
  • Note: Special tools are used to build content
    collections or to create an empty content
    collection and check documents into this new
    collection (see Section 4).
  • Based on NextPage NXT 3. See NextPage NXT 3
    Quick Start.
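
The definitions above can be modeled as a small data structure. This is a sketch under stated assumptions, not NextPage's implementation: the class names, the substring search, and the sample collections are hypothetical, and only documents with text are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str


@dataclass
class Collection:
    name: str
    documents: list = field(default_factory=list)
    children: list = field(default_factory=list)  # internal folder structure

    def walk(self, depth=0):
        """Yield (depth, collection) for this collection and its descendants."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


@dataclass
class Site:
    collections: list

    def table_of_contents(self):
        """Indented list of collection names for browsing."""
        return [
            "  " * depth + collection.name
            for top in self.collections
            for depth, collection in top.walk()
        ]

    def search(self, term, within=None):
        """Search all top-level collections, or only those named in `within`."""
        matches = []
        for top in self.collections:
            if within is not None and top.name not in within:
                continue
            for _, collection in top.walk():
                for document in collection.documents:
                    if term.lower() in document.text.lower():
                        matches.append((collection.name, document.title))
        return matches


# Hypothetical site with two content collections.
site = Site([
    Collection("Internal", children=[
        Collection("Policies", documents=[
            Document("Travel policy", "Travel requests need approval"),
        ]),
    ]),
    Collection("External", documents=[
        Document("Publisher guide", "Approval workflow for purchased content"),
    ]),
])

toc = site.table_of_contents()
all_hits = site.search("approval")
subset_hits = site.search("approval", within={"External"})
```

Each match carries the collection name and document title, standing in for the link from a hit to its document.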

42
3.2.1 The Uberportal and the NXT 3 Interface
  • NXT 3 Interface: Search, Personalization,
    Document Management, Metadata, etc.
  • Content Network: Hierarchical Folders. Each can
    be a portal on a different Web server!
  • Portal(s) and Portlets
43
3.2.2 Integration of Portals
  • Web search engine-based technology and efforts
    help find and organize content for content
    networks using NXT 3 as follows:
  • 1. Use the Web Content Service to crawl and index
    the contents of external Web sites to integrate
    their content.
  • 2. Use the Content Network Link to connect to
    other Web servers running NXT 3 to syndicate
    their content (Server P2P).
  • 3. Replicate the content of a Web server on a
    central Web server because of agency security
    constraints or other mitigating circumstances.
  • 4. Re-purpose or re-publish key content to
    improve its usability in a content network.
  • 5. XML-ize proprietary search engine indices.
  • 6. Use distributed content generation
    technologies to feed the content network from the
    grassroots level (Desktop P2P).

44
3.3.1 Housing and Urban Development: Structured
documents
45
3.3.1 Housing and Urban Development: Real data
from relational databases
46
3.3.1 Housing and Urban Development: XML query
templates (housing by state)
47
3.3.1 Housing and Urban Development: Real data
XML hitlist query results
48
3.3.2 US Geological Survey: Agency Publications
49
3.3.2 US Geological Survey: General Interest
Publications
50
3.3.2 US Geological Survey: Special Interest
Publications
51
3.3.2 US Geological Survey: Search across all
publications
52
3.3.3 Environmental Protection Agency: Background
  • Requests from multiple EPA offices for help with
    XML training and pilots (financial, public
    relations, environmental information, Superfund,
    research and development, and water).
  • Selected the very best content for each office to
    be XML-ized and integrated into a content
    network using the best technology.
  • Registered the best content with its metadata in
    the content network, which is both centralized and
    distributed.
  • The content network supports new agency
    initiatives such as the Environmental Indicators
    Initiative and State of the Environment Report,
    the Environmental Health Tracking Network (EHTN),
    and the Situation Room.
  • The content network supports the agency goals of
    (1) creating the building blocks of an exchange
    network, (2) enabling integration of environmental
    data, and (3) providing vital services to EPA and
    the public.

53
3.3.3 Environmental Protection Agency: EPA-State
Content Network
54
3.3.3 Environmental Protection Agency: National
Coastal Condition Report
  • The Problem
  • Large PDF files (14) totaling 114.6 MB!
  • Files range in size from 0.1 to 17.2 MB.
  • Pages slow to render and print (200 pages)
    because of multi-colored backgrounds, graphics,
    and photographs.
  • Lots of data graphics, but few data tables.
  • Neither a PDF file with a structured table of
    contents nor in Tagged format for export to XML.
  • The Solution
  • NXT 3 makes search and display across the entire
    collection of files very efficient and fast
    because of XML.
  • http://www.epa.gov/owow/oceans/nccr/index.html

55
3.3.3 Environmental Protection Agency: National
Coastal Condition Report
56
3.3.3 Environmental Protection Agency: National
Coastal Condition Report
57
3.4 FirstGov Content Management Survey
  • 3.4.1 General Questions (XML 1 of 12)
  • 9. Intended use of XML: How important is it for
    you to employ XML in the acquisition, management,
    and/or delivery of content?
  • 3.4.2 Author Questions (XML 0 of 20)
  • 3.4.3 Advanced Questions (XML 4 of 22)
  • 1. Need for XML tools: How important is it for
    you for the system to have native XML processing
    tools and functions built in?

58
3.4 FirstGov Content Management Survey (continued)
  • 3.4.3 Advanced Questions (XML 4 of 22)
  • 2. Need for XML Standards Support: How important
    is it for you to support XML-based standards such
    as RSS, ICE, ebXML, and the Web Services family
    (e.g., SOAP)?
  • 3. Existing XML Usage: Have you already
    developed DTDs or Schemas to validate your XML
    content?
  • 4. Current Usage of XML Stylesheets: Have you
    already developed XSL Stylesheets to transform
    your XML documents?

59
4. Web Content Management Tools
  • 4.1 Recent Evaluations
  • 4.2 NXT 3 e-Content P2P Platform
  • 4.2.1 Concepts.
  • 4.2.2 Architecture and Services.
  • 4.2.3 Content Network Manager.
  • 4.2.4 Manage Content (Interface for Document
    Management and Building Metadata).
  • 4.3 NextPage Apps and Tools
  • 4.3.1 Matrix (Virtual Collaborative Peer to Peer
    Space).
  • See Washington Post, January 3, 2002, page E01,
    Deals Become Online Models For Learning.
  • 4.3.2 Solo (Offline Access to the Content
    Network).
  • 4.3.3 RapidApps (Interwoven Integration).

60
4.1 Recent Evaluations
  • NextPage NXT 3 P2P Platform
  • Andy Warzecha, The META Group, 3/12/2002
  • If companies want to do cross-enterprise content
    management, NextPage has the solution
  • "Content networks provide a way for users to
    simultaneously access Internet sites, databases,
    intranets and other formal or informal content
    resources as if the content existed in a single
    location."
  • "The advantage of this approach is that new
    content sources can be added quickly ... This
    puts power in the hands of business users to
    quickly tie in or disconnect the various content
    sources they require access to." (see next slide)
  • Peer-to-peer: Every device connected to the
    network is both a server and consumer of content.

61
4.1 Recent Evaluations: META Group Content
Network with Portal
62
4.1 Recent Evaluations
  • NextPage NXT 3 P2P Platform
  • Esther Dyson's Release 1.0, 1/22/2002
  • NextPage is unique in the content-management
    market in its distributed approach.
  • NextPage's platform, NXT 3, virtually connects
    the distributed information sources and makes
    them appear integrated to the user. Unlike
    syndication, in which content is copied and
    integrated with other content locally, NextPage
    keeps objects where they are.
  • NextPage uses the standard Simple Object Access
    Protocol (SOAP) to exchange and normalize
    information between local content directories,
    assembling meta-indexes so that users can search
    or manipulate content transparently, regardless
    of physical location.
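
To make the SOAP exchange concrete, here is a minimal sketch of a SOAP 1.1 message that carries a search request between two content directories. The envelope namespace is the standard SOAP 1.1 one; the application namespace, the SearchRequest and Query element names, and both helper functions are hypothetical and are not taken from NextPage's actual message format.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:content-search"  # hypothetical application namespace

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("cs", APP_NS)


def build_search_request(query):
    """Wrap a query string in a SOAP envelope and return it as text."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}SearchRequest")
    ET.SubElement(request, f"{{{APP_NS}}}Query").text = query
    return ET.tostring(envelope, encoding="unicode")


def read_query(message):
    """Extract the query string on the receiving side."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}SearchRequest/{{{APP_NS}}}Query"
    return root.findtext(path)


message = build_search_request("coastal condition")
received = read_query(message)
print(message)
```

Only the small request and response messages travel between servers in this pattern; the content itself stays where it is, which matches the distinction from syndication drawn above.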

63
4.2.1 Concepts
  • Folders can contain files, databases, and Web
    resources.
  • Folders can/should be on different Web servers,
    but look and function as though they are on the
    same Web server.
  • This is accomplished by two new XML-based
    standards that send lean XML messages between the
    Web servers:
  • Content Network Protocol (CNP)
  • eXtensible Indexing Language (XIL)
  • Distributed folders and nodes can be managed both
    centrally and locally by the Content Network
    Manager and the Manage Content Administration
    Tools.

64
4.2.1 Concepts
65
4.2.2 Architecture and Services
66
4.2.2 Architecture and Services
67
4.2.3 Content Network Manager
  • The Content Network Manager is the graphical user
    interface utility for managing NXT 3 servers. In
    addition to providing a GUI for all server
    configuration information and the INI files
    associated with the NXT 3 product, Content
    Network Manager allows you to modify a site and
    server configuration without shutting down the
    server. It also allows you to log in and manage
    networks of sites on the local host or on a
    remote server.

68
4.2.3 Content Network Manager
69
4.2.3 Content Network Manager
  • As an NXT 3 site administrator, you can control
    the organization of your site and what content to
    display. Others may provide content for you, but
    you determine how the content is integrated with
    your site. Typically, related items are gathered
    into collections or sub-collections to keep
    themes together. For example, you might use two
    primary collections on your site: one for
    internally created information and one for
    externally created information (purchased from a
    content publisher). Each of these general
    collections could then contain sub-collections.
    For example, your internal collection might
    contain accounting standards, policies and
    procedures, human resource information, and ISO
    9000 documents. Your external collection would
    probably be organized by the publisher of the
    information.

70
4.2.4 Manage Content: Interface for Document
Management and Building Metadata
  • Properties
  • A document stored in a content collection has a
    set of properties. NXT 3 supports these document
    element properties:
  • ID
  • Name
  • Title
  • Hidden
  • Version
  • Content Type
  • Encoding
  • Compression
  • First-child-content
  • Index
  • Location
  • DSE
  • Indexsheet

71
4.2.4 Manage Content: Interface for Document
Management and Building Metadata
  • Metadata
  • NextPage recommends using the Resource
    Description Framework (RDF) for metadata. The
    metadata used by NXT 3 is specifically metadata
    for searching or resource discovery. Metadata
    support covers defining, creating, storing,
    indexing, searching, and retrieving metadata.
    Metadata can exist within the resource
    that it describes (internal metadata), or it
    can exist in a separate file (external metadata)
    that is associated with the content file. By
    default, NextPage uses the Dublin Core rules as a
    foundation for processing external metadata, as
    in Manage Content.
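
An external metadata record of the kind described above might look like the output of the following sketch, which writes Dublin Core elements inside an RDF description. The RDF and Dublin Core namespaces are the standard ones; the build_metadata helper, the chosen fields, and their values are assumptions for illustration, and the exact layout NXT 3 expects is not given in the slides.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def build_metadata(resource_url, fields):
    """Describe one resource with Dublin Core elements in RDF/XML."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(rdf, encoding="unicode")


# Illustrative values; a real record would come from the document's owner.
record = build_metadata(
    "http://www.epa.gov/owow/oceans/nccr/index.html",
    {
        "title": "National Coastal Condition Report",
        "format": "application/pdf",
        "language": "en",
    },
)
print(record)
```

Keeping the record in a separate file leaves the content file untouched, which is useful for formats such as PDF that are awkward to edit.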

72
4.2.4 Manage Content: Interface for Document
Management and Building Metadata
73
4.3.1 Matrix: Virtual Collaborative Peer to Peer
Space
74
4.3.2 Solo: Offline Access to the Content Network
75
4.3.3 RapidApps: Interwoven Integration
76
5. Contact Information
  • Brand Niemann, Ph.D.
  • USEPA Headquarters, EPA West, Room 6143D
  • Office of Environmental Information, MC 2822T
  • 1200 Pennsylvania Avenue, NW, Washington, DC
    20460
  • 202-566-1657
  • niemann.brand_at_epa.gov
  • EPA: http://161.80.70.167
  • Outside EPA: http://130.11.44.140

77
Part 1: VoiceXML
  • Demonstration of the EPA VoiceXML application.
  • Brand Niemann, US EPA
  • VoiceXML Using the Tellme Studio and Tellme
    Networks Infrastructure
  • Art Clarke of Tellme Networks.
  • VoiceXML for Digital Talking Books
  • Janina Sajka and Katie Haritos-Shea of the
    American Foundation for the Blind and Art Clarke.
  • XML for the Las Vegas Blue Pages Pilot
  • Simon Chung and Craig Brown of NextPage.
  • XML to VoiceXML for the Las Vegas Blue Pages
    Pilot
  • Art Clarke and Simon Chung.
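
The XML-to-VoiceXML step named in the last item can be sketched as a small transformation. This is not the pilot's actual code: the directory format, the entry, name, and phone element names, the sample entry, and the to_voicexml helper are hypothetical, and the output assumes the W3C VoiceXML 2.0 vocabulary (vxml, form, block, prompt), which may differ from the dialect used at the time.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # W3C VoiceXML 2.0 namespace

# Hypothetical phone directory source.
DIRECTORY = """<directory>
  <entry><name>Example Office</name><phone>555-0100</phone></entry>
</directory>"""

# Serialize VoiceXML elements in the default namespace (no prefix).
ET.register_namespace("", VXML_NS)


def to_voicexml(directory_xml):
    """Turn each directory entry into a spoken prompt."""
    source = ET.fromstring(directory_xml)
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, f"{{{VXML_NS}}}form")
    block = ET.SubElement(form, f"{{{VXML_NS}}}block")
    for entry in source.iter("entry"):
        prompt = ET.SubElement(block, f"{{{VXML_NS}}}prompt")
        prompt.text = f"{entry.findtext('name')}: {entry.findtext('phone')}"
    return ET.tostring(vxml, encoding="unicode")


vxml_text = to_voicexml(DIRECTORY)
print(vxml_text)
```

The same directory XML could feed a Web page or a printed listing, which is the multi-channel point made in Section 1.3.5.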