XML Web Services: Electronic Records and Document Management System (ERDMS)

1
XML Web Services Electronic Records and Document
Management System (ERDMS)
  • Brand Niemann
  • (from XML Web Services Evangelist to Solutions
    Architect)
  • Office of Environmental Information
  • US EPA
  • July 1, 2002

2
Definition of Records
  • The United States government defines records as
    any books, papers, maps, photographs, machine
    readable materials, or other documentary
    materials, regardless of physical form or
    characteristics, made or received by an agency of
    the United States Government under Federal law or
    in connection with the transaction of public
    business and preserved or appropriate for
    preservation by that agency or its legitimate
    successor as evidence of the organization,
    functions, policies, decisions, procedures,
    operations, or other activities of the Government
    or because of the informational value of data in
    them. Library and museum material made or
    acquired and preserved solely for reference or
    exhibition purposes, extra copies of documents
    preserved only for convenience of reference, and
    stocks of publications and of processed documents
    are not included.
  • Source: Section 3301 of Title 44, United States
    Code, Definition of Records

3
Overview
  • 1. EPA ERDMS Strategy
  • 2. Multiple Standards and Requirements
  • 3. Authoring and Conversion of Documents
  • 4. Work Flow and Content Networking
  • 5. XForms
  • 6. Recommendations
  • 7. Contact Information

4
1. EPA ERDMS Strategy
  • June 2002
  • The current islands of information need to be
    replaced by a single, searchable information
    repository accessible to all employees with
    collaboration and workflow tools.
  • The core Electronic Records and Document
    Management System (ERDMS) would be composed of
    both a Records Management Application (RMA) and a
    Document Management System (DMS).
  • The agency's standard desktop business
    applications would feed into the ERDMS.
  • The legacy DMS would feed into the RMA.
  • The legacy information systems would feed into
    the ERDMS.
  • See next slide for the ERDMS Target Architecture.

5
1. EPA ERDMS Strategy
[Diagram: ERDMS Target Architecture. Labels: Core ERDMS (RMA and DMS); Desktop Applications; Word Processing; E-mail; Legacy Information Systems; HR, Finance, etc.; Legacy DMS; Integrated Legacy DMSs]
6
1. EPA ERDMS Strategy
  • Implementation
  • Phase 1. Initial System: Installation of the ERDMS
    so documents will be saved to the repository
    instead of local hard drives.
  • Phase 2. Advanced System: Information systems
    outside the ERDMS will be linked with the ERDMS.
  • In general, all Agency information systems will
    either integrate (link to the legacy system) or
    migrate (copy documents and metadata from a
    legacy system and discontinue use).
  • Website content will be archived within the ERDMS.

7
2. Multiple Standards and Requirements
  • 2.1 NARA and GAO
  • 2.2 CIO XML Working Group
  • 2.2.1 Comments
  • 2.2.2 Records Management Metadata
  • 2.2.3 XML Web Services
  • 2.3 Accessibility Guidelines
  • 2.4 E-Gov and FirstGov
  • 2.5 Interoperability with EPA-State and Other
    Networks

8
2.1 NARA and GAO
  • Preservation and Migration of Electronic Records:
    The State of the Issue, Kenneth Thibodeau, June
    27, 2002
  • The quantity of records is increasing
    exponentially and the types of information
    objects are also changing.
  • The experience of archives is largely limited to
    relatively simple technical formats (e.g. flat
    files).
  • Strictly speaking, it is not possible to preserve
    electronic records; it is only possible to
    maintain the ability to reproduce electronic
    records (e.g., a music score).
  • An archival system should be built in such a way
    that it is possible to replace any component of
    the hardware or software with minimal impact on
    the system and with no impact on its contents
    (called "future proofing" in the XML world).
  • A record is recorded information produced or
    received in an institutional or individual
    activity and that comprises content, context and
    structure sufficient to provide evidence of the
    activity regardless of the form or medium.
    This definition is extended to include the
    appearance of the record and a feature unique to
    electronic records: hyperlinks.

9
2.1 NARA and GAO
  • Preservation and Migration of Electronic Records:
    The State of the Issue, Kenneth Thibodeau, June
    27, 2002
  • Persistent Object Preservation expresses the
    structure of records using eXtensible Markup
    Language (XML) Document Type Definitions (DTDs),
    which are the content model (metadata) for
    individual records entering archival fonds.
  • Multi-Valent Documents (MVD) technology, which
    captures and maintains a bitmapped image of the
    document, and the eXtensible Stylesheet Language
    (XSL) from the XML family of standards are being
    considered for preserving the appearance of the
    record.
  • Demonstrations of these methods involve
    re-purposing a variety of collections of records
    (cf. my XML Registry and Repository and Digital
    Library of the State of the Environment).

10
2.1 NARA and GAO
Source: XML Spy white paper, Document Frameworks:
Unifying XML Content Management and Database
Systems for the Internet.
11
2.1 NARA and GAO
  • GAO Report on Challenges in Managing and
    Preserving Electronic Records, June 2002, page
    24
  • NARA's long-term strategic initiative is to
    develop an advanced electronic records archive.
    The agency's goals for this system are to
    preserve and provide access to any kind of
    electronic record, free from dependency on any
    specific hardware or software, so that the agency
    can carry out its mission into the future.
  • Although the new archival system is not yet
    formally defined, agency documents, public
    presentations, and interviews with agency
    officials and staff indicate, in broad outline,
    how they envision this system. It will probably
    be a distributed system, allowing the storage and
    management of massive record collections at a
    variety of installations, with accessibility
    provided via the Internet. It may be based on
    persistent object preservation, an advanced form
    of file format conversion and encapsulation
    (described in Appendix II) that is the subject of
    research sponsored by NARA and other
    organizations. A leading candidate for performing
    this encapsulation and capturing the necessary
    information is the Extensible Markup Language
    (XML), which provides a means for tagging
    (annotating) information in a meaningful fashion
    that can be readily interpreted by disparate
    computer systems (XML is further discussed in
    Appendix II).

12
2.1 NARA and GAO
  • GAO Report on Challenges in Managing and
    Preserving Electronic Records, June 2002, page
    30
  • The importance of enterprise architecture
    development, implementation, and maintenance is a
    basic tenet of effective IT management. Used in
    concert with other IT management controls, an
    enterprise architecture can greatly increase the
    chances for optimal mission performance. We have
    found that attempting to modernize operations and
    systems without an enterprise architecture leads
    to operational and systems duplication, lack of
    integration, and unnecessary expense.
  • Over the past several years, NARA has taken
    action to develop an enterprise architecture.
    NARA has drafted a current architecture and is
    working on a target architecture, but this work
    is incomplete. However, the process to develop
    the electronic archival system is well under way.
    Without an enterprise architecture to guide its
    development, NARA increases the risk that the
    planned electronic archival system will be
    incompatible with existing and future operations
    and systems, thus wasting resources and requiring
    that unnecessary interfaces be built to achieve
    integration.

13
2.2.1 Comments
  • Owen Ambur, Co-chair XML Working Group, FedWeb
    Conference, September 2001
  • All business-quality records would be created and
    managed throughout their full life-cycles in
    Web-based, XML-enabled, DoD-certified E-records
    management systems.
  • DoD Standard 5015.2 is not perfect or complete,
    but it is a good basic set of requirements.
    XML-related enhancements to the standard will be
    considered by the XML Working Group.
  • The key to making FirstGov more effective is for
    agencies to use Web-based, XML-enabled,
    DoD-certified electronic records management
    systems to manage and embed the appropriate
    metatags in all of their records.

14
2.2.2 Records Management Metadata
  • An XML Schema for Electronic Records
    Management, DRAFT, Version 0.1, by Leila Naslavsky
    and Dorrit Gordon
  • To be used as the basis for a records management
    application (RMA) compliant with the United
    States Department of Defense's (DoD) standards
    for RMAs, DoD 5015.2. Figure 1. Record Life Cycle
    (see next slide).
  • http://www.cse.ucsc.edu/dgordon/ERM/ERMSchemaPaper.html
  • XML Schema in XML Spy 4.4 (see slide after next
    slide)
  • record.xsd
  • dispositioninstruction.xsd
  • file.xsd
  • RMA.xsd
  • RMAexample.xsd

15
(No Transcript)
16
2.2.2 Records Management Metadata
17
2.2.3 XML Web Services
  • Why XML?
  • The eXtensible Markup Language became a World
    Wide Web Consortium (W3C) standard in 1998 as the
    universal format for structured documents and
    data on the Web (http://www.w3.org/XML/).
  • The CIO Council created the XML Working Group in
    2000 to facilitate the efficient and effective
    use of XML through cooperative efforts among
    government agencies, including partnerships with
    commercial and industrial organizations
    (http://xml.gov/).
  • A GAO report to Congress urges the government to
    adopt XML and urges Federal agencies to address
    XML in their enterprise architectures
    (http://www.gao.gov/new.items/d02327.pdf).
  • XML Web Services are what OMB's Mark Forman is
    encouraging in the E-Gov Initiatives, especially
    for the "collect once, use many" knowledge
    management projects like the Geospatial
    Information One-Stop (http://egov.gov).

18
2.2.3 XML Web Services
  • What is XML?
  • XML is a standard for preserving and
    communicating information (encoding, tagging,
    and internationalizing) that will be everywhere.
  • Web Services provide communication between
    applications running on different Web servers
    that will bring the Internet to its new level.
  • XML Web Services are applications running on
    different devices that communicate XML data using
    XML messages.
  • XML Web Services for geospatial data use the
    OpenGIS Consortium's GML (Geography Markup
    Language) and OWS (Open Web Services) standards
    and specifications.
  • Web Services can and should be interoperable
    across multiple vendor tools and platforms in the
    enterprise (see
    http://www.ws-i.org/Community.aspx).
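
To make "applications ... that communicate XML data using XML messages" concrete, the following is a minimal Python sketch that builds and then reads a SOAP 1.1-style envelope using only the standard library. The operation name GetRecord, the namespace urn:example:erdms, and the record identifier are hypothetical and chosen for illustration; no actual EPA service or interface is implied.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:erdms"  # hypothetical application namespace (assumption)


def build_request(record_id: str) -> bytes:
    """Wrap a hypothetical GetRecord call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetRecord")
    ET.SubElement(call, f"{{{APP_NS}}}RecordId").text = record_id
    return ET.tostring(envelope, encoding="utf-8")


def read_record_id(message: bytes) -> str:
    """Extract the record identifier on the receiving side."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetRecord/{{{APP_NS}}}RecordId"
    node = root.find(path)
    if node is None or node.text is None:
        raise ValueError("message does not contain a RecordId")
    return node.text


message = build_request("EPA-2002-0001")  # made-up identifier
print(read_record_id(message))
```

Because both sides agree only on the XML message, the sender and receiver could run on different platforms, which is the interoperability point made in the last bullet.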

19
2.2.3 XML Web Services
20
2.2.3 XML Web Services: The Web Services
Standards Stack
  • Commonly used by the major vendors
  • Work Flow (WFDL-Work Flow Description Language).
  • Publication and Discovery (UDDI-Universal
    Description, Discovery, and Integration).
  • Service Description (WSDL-Web Services
    Description Language).
  • Messaging (XMLP-XML Protocol from SOAP-Simple
    Object Access Protocol).
  • Content (XML-Extensible Markup Language).
  • Transport (HTTP-Hypertext Transfer Protocol).

21
2.3 Accessibility Guidelines
  • We applaud the efforts that Adobe has made to
    embrace XML technologies that provide open
    source, non-proprietary formats. We call on Adobe
    and other developers to commit to accessible XML
    practices, as defined by the XML Accessibility
    Guidelines (XAG) currently in public draft.
  • Janina Sajka, Director, Technology Research and
    Development, American Foundation for the Blind,
    and Joe Roeder, Senior Access Technology
    Specialist, National Industries for the Blind,
    PDF and Public Documents: A White Paper, Version
    1.1, published April 25, 2002.
  • http://www.afb.org/AboutPDF.asp

22
2.4 E-Gov and FirstGov
  • Portfolios
  • Government to Citizen (G2C) (5)
  • E.g., Recreation One-Stop
  • Government to Business (G2B) (5)
  • E.g., Business Compliance One-Stop
  • Government to Government (G2G) (5)
  • E.g., Geospatial Information One-Stop
  • Internal Effectiveness and Efficiency (IEE) (8)
  • E.g., E-Records Management
  • Cross-cutting
  • e-Authentication
  • Infrastructure
  • Federal Enterprise Architecture
  • Source: http://egov.gov/egovreport-3.htm
  • Contributing to these efforts.

23
2.4 E-Gov and FirstGov
  • FirstGov: What's Coming in September
  • Content Management System
  • Does your agency want to partner (need buy-in)?
  • Some mandatory functionality like tools for
    Section 508 compliance, a visual tool to lay out
    or model a Web site, documented Application
    Programming Interfaces, security services that
    meet applicable guidelines, Web services, loosely
    coupled connectors to help facilitate repurposing
    content, among others.
  • Questionnaire (see next slide).

24
2.4 E-Gov and FirstGov
  • FirstGov Content Management Survey
  • General Questions (XML: 1 of 12)
  • 9. Intended use of XML: How important is it for
    you to employ XML in the acquisition, management,
    and/or delivery of content?
  • Author Questions (XML: 0 of 20)
  • Advanced Questions (XML: 4 of 22)
  • 1. Need for XML tools: How important is it for
    you for the system to have native XML processing
    tools and functions built in?

25
2.4 E-Gov and FirstGov
  • FirstGov Content Management Survey (continued)
  • Advanced Questions (XML: 4 of 22)
  • 2. Need for XML Standards Support: How important
    is it for you to support XML-based standards such
    as RSS, ICE, ebXML, and the Web Services family
    (e.g., SOAP)?
  • 3. Existing XML Usage: Have you already
    developed DTDs or Schemas to validate your XML
    content?
  • 4. Current Usage of XML Stylesheets: Have you
    already developed XSL Stylesheets to transform
    your XML documents?

26
2.5 Interoperability with EPA-State and Other
Networks
27
2.5 Interoperability with EPA-State and Other
Networks
28
2.5 Interoperability with EPA-State and Other
Networks: A Content Node for Every EPA Office,
Program, Region, State, and Partner
29
2.5 Interoperability with EPA-State and Other
Networks
[Diagram labels: Web Browser; Mobile/Wireless Devices; IONIC Portrayal Engine (GML-to-SVG, etc.); IONIC Web Services Framework (Proprietary-to-GML Converters); LandView/Cameo CD/DVDs; State of PA PASDA; USGS; EPA Region 3 CBP; EPA WME]
30
3. Authoring and Conversion of Documents
  • 3.1 Document Framework Design: Advanced XML
    Application Development (XML Spy)
  • 3.2 WebDAV
  • 3.3 Office XP
  • 3.4 Adobe FrameMaker 6.0
  • 3.5 Adobe Acrobat 5.0
  • 3.6 Open Format for Office Documents

31
3.1 Document Framework Design: Advanced XML
Application Development
  • Four Steps
  • Schema Modeling: An iterative process which
    involves initial requirements analysis,
    use-cases, as well as examination of existing
    data schemas. Additional refinements are required
    to map all of the elements of your XML Schema to
    the underlying database (relational or XML-based)
    or content management system.
  • Data Flow and Process Modeling: The flow of
    information gathered by a document framework must
    be modeled from content authors (non-technical
    domain experts), through transport to the
    database, and then to content consumers
    (typically customers, partners, etc.).

32
3.1 Document Framework Design: Advanced XML
Application Development
  • Four Steps (continued)
  • Transformation Modeling: XSLT has a critical
    two-fold role, in the input templates used by
    content creators and in the output stylesheets
    required by content consumers. Both must be
    designed to fit the data flow and process model
    determined earlier.
  • Implementation: The business logic and user
    interface of a document framework application
    must be custom developed, but can be easily
    implemented using any of the leading Internet
    application development platforms (e.g., J2EE,
    Microsoft .NET Web Services, Oracle Application
    Server, etc.).

33
3.1 Document Framework Design: Advanced XML
Application Development
34
3.2 WebDAV (http://www.webdav.org)
  • Web-based Distributed Authoring and Versioning
  • It is a standardized set of extensions to the
    HTTP protocol - the core of the World Wide Web -
    which allows users to collaboratively edit and
    manage files on remote web-servers.
  • Using WebDAV, content authors have distributed
    access to virtually any underlying database or
    content management system.
  • See Jim Whitehead, University of California,
    Santa Cruz, WebDAV: Remote Collaborative
    Authoring and Electronic Records Management, in
    XML.Gov Presentations, November 14, 2001.
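
As an illustration of what "extensions to the HTTP protocol" means in practice, WebDAV adds methods such as PROPFIND whose request and response bodies are XML in the "DAV:" namespace. The Python sketch below builds a PROPFIND body and reads resource paths out of a multistatus response without contacting any server. The sample response and the /records/ paths are invented for illustration; a real server's reply would carry more detail.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # XML namespace defined by the WebDAV specification


def propfind_body(*props: str) -> bytes:
    """Build the XML body of a PROPFIND request for the named DAV properties."""
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    for name in props:
        ET.SubElement(prop, f"{{{DAV}}}{name}")
    return ET.tostring(root, encoding="utf-8")


def list_hrefs(multistatus: bytes) -> list:
    """Return the resource paths listed in a multistatus response."""
    root = ET.fromstring(multistatus)
    return [href.text for href in root.iter(f"{{{DAV}}}href")]


# Hypothetical server reply, trimmed to the parts used here.
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:">
  <D:response><D:href>/records/memo.xml</D:href></D:response>
  <D:response><D:href>/records/report.xml</D:href></D:response>
</D:multistatus>"""

print(propfind_body("displayname", "getlastmodified").decode("utf-8"))
print(list_hrefs(SAMPLE_RESPONSE))
```

An authoring tool would send the first body with an HTTP PROPFIND request and parse a reply shaped like the second; the sketch stops short of the network call so that it runs anywhere.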

35
3.2 WebDAV: Web-based Distributed Authoring and
Versioning with XML Spy
36
3.3 Office XP
  • What's New
  • Closer integration with the Web. Each application
    in Office can save files in HTML format,
    streamlining integration with the Internet and
    corporate intranets. Excel and Access are also
    able to read and save in XML format.
  • Excel works with more data types, including
    common data sources on the Web. XML is now
    supported as a data interchange format, and
    worksheets can be linked directly to XML data on
    the Web. The new RTD (Real-Time Data) function
    brings real-time data into Excel for analysis.
  • Access now works with more data types, including
    common data sources on the Web. XML data can be
    either created from Microsoft Access format (Jet)
    or SQL Server structures and data, or can be used
    to import data or structure into either Access or
    SQL Server.
  • Discovering Microsoft Office XP Standard and
    Professional, Version 2002.

37
3.3 Office XP
  • Word
  • The three basic ways of producing XML output from
    Word for multiple uses (data exchange over the
    Internet, archiving with NARA, etc.) are:
  • 1. Word Save as XML (e.g., a Visual Basic add-in).
  • 2. A special version of Word like the Wall Street
    Journal uses to produce XML for content
    syndication.
  • 3. A tool like XML Spy Integrated Development
    Suite that automates the conversion of
    Word-to-XML and provides other XML functionality
    needed to do serious XML work.
  • Note: See the XML Spy white paper, Document
    Frameworks: Unifying XML Content Management and
    Database Systems for the Internet.

38
3.3 Office XP
  • Word
  • Menu: Convert, Import Microsoft Word Document;
    select a Word document and click Open.
  • Then use View, Text view and View, Browser view.
  • This command enables the direct import of any
    Word document and conversion into XML format, if
    you have been using paragraph styles in Microsoft
    Word. This option requires Microsoft Word or
    Microsoft Office (Version 97 or 2000). When you
    select this command, the Open dialog box appears.
    Select the Word document you want to import.
  • XML Spy automatically generates an XML document
    with included CSS stylesheet. Each Word paragraph
    generates an XML element, whose name is defined
    as the name of the corresponding paragraph style
    in Microsoft Word.
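
The paragraph-style mapping described above can be sketched in a few lines of Python. This is not XML Spy's implementation: it assumes the paragraphs have already been extracted from Word as (style name, text) pairs, and the rule for turning a style name into a legal XML element name is a hypothetical one chosen for the example.

```python
import re
import xml.etree.ElementTree as ET


def style_to_tag(style: str) -> str:
    """Turn a paragraph style name such as 'Heading 1' into a legal XML name.

    The substitution rule is an assumption made for this sketch.
    """
    tag = re.sub(r"[^A-Za-z0-9_.-]", "_", style.strip())
    if not re.match(r"[A-Za-z_]", tag):
        tag = "_" + tag  # XML names may not start with a digit, '.' or '-'
    return tag


def paragraphs_to_xml(paragraphs, root_name="document") -> str:
    """Emit one XML element per paragraph, named after its style."""
    root = ET.Element(root_name)
    for style, text in paragraphs:
        ET.SubElement(root, style_to_tag(style)).text = text
    return ET.tostring(root, encoding="unicode")


sample = [
    ("Heading 1", "EPA ERDMS Strategy"),
    ("Normal", "Documents are saved to the repository."),
]
print(paragraphs_to_xml(sample))
```

The sketch also shows why the import depends on paragraph styles being used consistently: a document typed entirely in one style would yield a flat run of identically named elements.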

39
3.4 Adobe FrameMaker 6.0
  • When it shipped in 1998, Adobe FrameMaker+SGML
    5.5 was one of the first publishing tools to
    support XML.
  • Standard Generalized Markup Language (SGML) is a
    descriptive markup language that is the precursor
    to XML.
  • FrameMaker+SGML 6.0 now allows the creation of
    content for true interactive documents and
    manuals. FrameMaker 6.0 comes in two versions:
  • Standard FrameMaker, which is used by many
    companies for publishing large documents using
    multiple media (such as print, Web, and
    CD-ROM).
  • FrameMaker+SGML, which adds SGML and advanced XML
    capabilities to FrameMaker software. It also
    supports CSS (Cascading Style Sheets) and XSL
    (eXtensible Stylesheet Language) for display of
    content.

40
3.4 Adobe FrameMaker 6.0
41
3.5 Adobe Acrobat 5.0
  • Repurposing and Extracting
  • Acrobat 5.0 gives you powerful commands for
    repurposing or extracting text and graphics in
    PDF files. You can use the Save As command to save
    all text in a PDF file in Rich Text Format (RTF)
    for import into your favorite authoring
    application. If your PDF files use tagged Adobe
    PDF, you can extract the text without losing the
    formatting. For example, you can save pages of
    tables from a PDF file for import into an
    application such as Adobe FrameMaker or Microsoft
    Word and the table formatting will be preserved.
    Both PDFMaker and Acrobat Web Capture create
    tagged Adobe PDF automatically. (See "About the
    different types of Adobe PDF documents" on the
    next slide.)
  • You can also use the Save As command to
    save each page in a PDF file to an image format.
    You can use the Export command to export all
    images in a PDF file; each image is saved in a
    separate file.
  • In addition, Acrobat provides several tools
    (the text select tool, the column select tool,
    the table/formatted text select tool, and the
    graphics select tool) for copying and pasting
    small amounts of text and graphics from a PDF
    file to your clipboard. You can also
    paste text from a PDF document into a comment or
    bookmark name. While in a PDF document, you
    select the text or graphic and copy it onto the
    clipboard. Once the text or graphic is on the
    clipboard, you can launch the other application
    and paste the text or graphic into a file.

42
3.5 Adobe Acrobat 5.0
  • About the different types of Adobe PDF documents
  • There are three types of Adobe PDF documents:
    unstructured, structured, and tagged. These
    document types differ in what they contain and
    how their contents can be repurposed. In general,
    the more structural information the Adobe PDF
    document contains, the more options you have for
    repurposing its contents.
  • 1. Unstructured Adobe PDF: You can save
    unstructured Adobe PDF files to other formats
    such as RTF with good results. An unstructured
    Adobe PDF file saved to RTF recognizes
    paragraphs, but not basic text formatting, lists,
    or tables. You can't reflow unstructured Adobe
    PDF files into different-sized devices, such as
    eBook reading devices. Unstructured Adobe PDF
    files aren't reliably accessible using a screen
    reader for Windows.
  • 2. Structured Adobe PDF: You can save structured
    Adobe PDF files to other formats such as RTF with
    results that are better than unstructured Adobe
    PDF files but not as good as tagged Adobe PDF
    files. Structured Adobe PDF files saved to RTF
    recognize paragraphs and basic text formatting,
    but not lists or tables. You can't reflow
    structured Adobe PDF files into different-sized
    devices. Structured Adobe PDF files can be
    accessed using a screen reader for Windows, but
    without the reliability of tagged Adobe PDF
    files.
  • 3. Tagged Adobe PDF: You can save tagged Adobe
    PDF files to other formats such as RTF with the
    best results, including the recognition of
    paragraphs, basic text formatting, lists, and
    tables. You can reflow tagged Adobe PDF files so
    that they're readable in different-sized devices.
    Tagged Adobe PDF files have been optimized for
    accessibility, so they can be accessed reliably
    using a screen reader for Windows.

43
3.5 Adobe Acrobat 5.0
  • See Acrobat 5.0 Help
  • Repurposing Adobe PDF Documents (pages 82-90) and
    Working with PDF (pages 103-107)
  • Creating tagged Adobe PDF documents (need to do
    for accessibility anyway).
  • Saving Adobe PDF documents to other formats (RTF
    and XML). See next slides.
  • Viewing Document Metadata, Help, page 192.
  • But still need XML authoring tools and expertise
  • I have done this for lots of EPA documents in my
    XML Web Services training.
  • Need industrial-strength XML tools and software
    platforms for efficient, cost-effective
    electronic document management solutions (cf.):
  • eXtensible Markup Language (XML) Web Services for
    Legacy Document Collections, Brand Niemann and
    David Eng, April 5, 2002, to appear in InfoAccess.

44
3.5 Adobe Acrobat 5.0
45
3.5 Adobe Acrobat 5.0
  • Repurposing PDF to XML
  • Adobe PDF Document as HTML
  • http://access.adobe.com/simple_form.html
  • Save As XML Plug-In for Windows (B2)
  • http://www.adobe.com/support/downloads/detail.jsp?hexID=89a2
  • Install, then choose Help, About Adobe Acrobat
    Plugins and select SaveasXML.
  • Choose File, Save As, then XML-1.00 without
    styling (.xml) or XHTML-1.00 with CSS-1.00 (.htm).
    (Note: Must be a tagged Acrobat PDF.)
  • See SaveAsXML Developer Information for Creating
    and Modifying Mapping Tables (DeveloperInfo.pdf).

46
3.5 Adobe Acrobat 5.0
47
3.5 Adobe Acrobat 5.0 (Re-purposing of Superfund
PDF files with XML)
48
3.5 Adobe Acrobat 5.0 (eXtensible Index Language
for large PDF collections)
49
3.5 Adobe Acrobat 5.0 (XIL hit list from search
of large PDF collections)
50
3.5 Adobe Acrobat 5.0
  • Document Metadata
  • Viewing Document Metadata: In Acrobat 5.0, Adobe
    PDF files contain Document Metadata in XML
    format. This Document Metadata contains (but is
    not limited to) information that is also in the
    Document Properties. Any changes made in the
    Acrobat Document Properties dialog box are
    reflected in the Document Metadata. Because
    Document Metadata is in XML format, it can be
    extended and modified using third-party products.
    You can copy and paste the Document Metadata XML
    source code.
  • To view the Document Metadata:
  • 1 Choose File, Document Properties, Document
    Metadata.
  • 2 The Document Metadata dialog box displays all
    the metadata embedded in the document. (Metadata
    is displayed by schema, that is, in predefined
    groups of related information.) The information
    associated with each schema is visible by
    default; it can be hidden by clicking the
    triangle next to the schema name. If a schema
    doesn't have a recognized name, it is listed as
    Unknown. The XML namespace is contained in
    parentheses after the schema name.
  • 3 To view the XML code, click View Source. You
    can cut, copy, and paste XML code from the
    Metadata Source View dialog box. Click OK to
    return to the Document Metadata dialog box.
  • 4 Click OK to close the Document Metadata dialog
    box, or click Cancel to close the dialog box
    without making any changes.
  • See next slides.

51
3.5 Adobe Acrobat 5.0
52
3.5 Adobe Acrobat 5.0
53
3.5 Adobe Acrobat 5.0
  • <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  • xmlns:iX='http://ns.adobe.com/iX/1.0/'>
  • <rdf:Description about=''
  • xmlns='http://ns.adobe.com/pdf/1.3/'
  • xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
  • <pdf:ModDate>2001-07-30T17:32:38-06:00</pdf:ModDate>
  • <pdf:CreationDate>2001-07-30T17:32:04-06:00</pdf:CreationDate>
  • <pdf:Producer>Acrobat Distiller 4.05 for
    Windows</pdf:Producer>
  • </rdf:Description>
  • <rdf:Description about=''
  • xmlns='http://ns.adobe.com/xap/1.0/'
  • xmlns:xap='http://ns.adobe.com/xap/1.0/'>
  • <xap:ModifyDate>2001-07-30T17:32:38-06:00</xap:ModifyDate>
  • <xap:CreateDate>2001-07-30T17:32:04-06:00</xap:CreateDate>
  • </rdf:Description>
  • </rdf:RDF>
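
Because the Document Metadata is ordinary XML, third-party code can read it with a generic parser. The Python sketch below parses a copy of the listing from this slide and collects each property by namespace and name. The sample string assumes that the angle brackets, colons, and "#" lost in the slide transcript are restored as shown, and the helper name read_properties is made up for this example.

```python
import xml.etree.ElementTree as ET

PDF_NS = "http://ns.adobe.com/pdf/1.3/"
XAP_NS = "http://ns.adobe.com/xap/1.0/"

# Listing from the slide, with transcript-stripped punctuation restored (assumption).
METADATA = """<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
    xmlns:iX='http://ns.adobe.com/iX/1.0/'>
  <rdf:Description about=''
      xmlns='http://ns.adobe.com/pdf/1.3/'
      xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
    <pdf:ModDate>2001-07-30T17:32:38-06:00</pdf:ModDate>
    <pdf:CreationDate>2001-07-30T17:32:04-06:00</pdf:CreationDate>
    <pdf:Producer>Acrobat Distiller 4.05 for Windows</pdf:Producer>
  </rdf:Description>
  <rdf:Description about=''
      xmlns='http://ns.adobe.com/xap/1.0/'
      xmlns:xap='http://ns.adobe.com/xap/1.0/'>
    <xap:ModifyDate>2001-07-30T17:32:38-06:00</xap:ModifyDate>
    <xap:CreateDate>2001-07-30T17:32:04-06:00</xap:CreateDate>
  </rdf:Description>
</rdf:RDF>"""


def read_properties(xml_text: str) -> dict:
    """Map (namespace, property name) to the property's text value."""
    root = ET.fromstring(xml_text)
    props = {}
    for description in root:          # each rdf:Description block
        for child in description:     # each property element inside it
            namespace, _, local = child.tag[1:].partition("}")
            props[(namespace, local)] = child.text
    return props


properties = read_properties(METADATA)
print(properties[(PDF_NS, "Producer")])
print(properties[(XAP_NS, "CreateDate")])
```

Keying on the namespace rather than the prefix matters here, since the same block declares the namespace both as the default and under a prefix.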

54
3.6 Open Format for Office Documents
  • Corel's Vision
  • Proprietary content is rapidly becoming the
    greatest barrier to managing, reusing,
    repurposing, and dynamically generating content.
    Enterprise knowledge workers, including designers
    and developers, interact with a variety of
    desktop content creation products, each designed
    for a single purpose, and each outputting a
    specific file type. Companies quickly amass
    collections of multiple file types.
  • Proprietary files trap corporate knowledge and
    intellectual property in content islands,
    resulting in increased costs and reduced
    productivity.
  • XML can also be seamlessly integrated into any
    workflow or any technology infrastructure.

55
3.6 Open Format for Office Documents
  • OpenOffice.org's Mission
  • Create an open and ubiquitous XML-based file
    format for office documents and provide an open
    reference implementation for this format.
  • Core Requirements
  • Must be capable of being used as an office
    program's native file format and support the full
    capability of a StarOffice/OpenOffice document.
  • Structured content should use XML's structuring
    capabilities and be represented in terms of XML
    elements and attributes.
  • The file format must be fully documented and have
    no secret features.
  • OpenOffice must be the reference implementation
    for this file format.

56
4. Work Flow and Content Networking
  • The Challenge (June 5, 2002)
  • Include all the requirements at the EPA ERDM
    Project Webpage (access restricted to EPA only
    with password).
  • Use several of the general records schedules
    (plus those suggested by the team).
  • Assume you want to create two sets of OIC/OEI
    documents:
  • The first are simply documents that have to be
    retained because they are FOI-able (retrieved
    from a repository).
  • The second are documents that must be retained
    for a period of time and then sent to NARA.

57
4. Work Flow and Content Networking
  • The Challenge (June 5, 2002) (continued)
  • The workflow is:
  • You create, your supervisor edits and approves,
    their supervisor edits and approves, and their
    supervisor approves, which ends the document
    process.
  • The document is then sent to all the regions, the
    AA's office, and several state officials and then
    to NARA in XML format.
  • In summary: create the documents and demonstrate
    the workflow, accessibility at different levels,
    scheduling, distribution, and archiving. Also do
    the same thing with PDF files while retaining
    their appearance.
  • "I am convinced that XML is the only viable
    choice to transfer future documents."
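
The approval chain in this challenge can be modeled as a small ordered state machine. The Python sketch below is only an illustration of that idea and is not how NextPage or any ERDMS product implements workflow; the role labels paraphrase the slide, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

# Steps paraphrased from the slide; the labels themselves are assumptions.
APPROVAL_CHAIN = [
    "author",
    "supervisor",
    "second-level supervisor",
    "third-level supervisor",
]
DISTRIBUTION = ["regions", "AA's office", "state officials", "NARA (XML format)"]


@dataclass
class Document:
    title: str
    step: int = 0  # index of the next required sign-off
    history: list = field(default_factory=list)

    @property
    def approved(self) -> bool:
        return self.step >= len(APPROVAL_CHAIN)

    def sign_off(self, role: str) -> None:
        """Record a completed step and reject out-of-order sign-offs."""
        if self.approved:
            raise ValueError("document process has already ended")
        expected = APPROVAL_CHAIN[self.step]
        if role != expected:
            raise ValueError(f"expected {expected!r}, got {role!r}")
        self.history.append(role)
        self.step += 1

    def distribute(self) -> list:
        """Return the recipients, in order, once approval is complete."""
        if not self.approved:
            raise ValueError("cannot distribute before final approval")
        return list(DISTRIBUTION)


doc = Document("Sample memo")
for role in APPROVAL_CHAIN:
    doc.sign_off(role)
print(doc.history)
print(doc.distribute())
```

Keeping the history alongside the current step is what would let a records system later show who approved a document and in what order.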

58
4. Work Flow and Content Networking
  • Team efforts to date
  • At least 6 months of experience working with XML
    on Superfund documents (see Training Units 18 and
    others).
  • Discussed the challenge and offered suggestions
    and support.
  • NextPage visited Region 6 (Steve Wyman) to learn
    more about EPA requirements.
  • Interpretation of The Challenge (as modified on
    June 25th: "All I was interested in was the
    software that converts the documents and their
    metadata from their proprietary formats to XML")
  • Not required to demonstrate the workflow (use
    NextPage's slides and WebEx files to explain).
  • Focus on the XML aspects (see Sections 3, 4, and
    5).

59
4. Work Flow and Content Networking: NextPage NXT
CM Workflow
  • Automatically routes data to ensure defined
    practices and approvals are followed.
  • Graphical interface and stored templates simplify
    set-up.
  • Due dates, history, and routing tracking.
  • Triggers automate critical processes.
  • Between each step, there is an opportunity to use
    triggers.
  • Triggers are able to execute external scripts,
    programs, perform formatting and translations, or
    call CM functions.
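
The idea of triggers firing between workflow steps can be sketched generically. This is not the NXT CM API: the Workflow class, the on_leave method and the uppercase_title trigger are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

Trigger = Callable[[dict], None]


class Workflow:
    """Toy workflow: ordered steps with optional triggers fired between them."""

    def __init__(self, steps: List[str]) -> None:
        self.steps = steps
        self.triggers: Dict[str, List[Trigger]] = {}

    def on_leave(self, step: str, trigger: Trigger) -> None:
        """Register a trigger that runs after `step` completes."""
        if step not in self.steps:
            raise ValueError(f"unknown step: {step}")
        self.triggers.setdefault(step, []).append(trigger)

    def run(self, item: dict) -> dict:
        """Route the item through every step, firing triggers in between."""
        for step in self.steps:
            item.setdefault("history", []).append(step)
            for trigger in self.triggers.get(step, []):
                trigger(item)
        return item


def uppercase_title(item: dict) -> None:
    """Example formatting trigger; a real one might call an external script."""
    item["title"] = item["title"].upper()


wf = Workflow(["draft", "review", "approve"])
wf.on_leave("review", uppercase_title)
result = wf.run({"title": "site report"})
```

Here the trigger runs once, after the review step and before approval; the history list plays the role of the routing record.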

60
4. Work Flow and Content Networking: NextPage NXT
CM Web Interface
  • Web-based UI.
  • Work on multiple projects at a time.
  • Know what to do and when.
  • Assign work. Provide feedback.

61
4. Work Flow and Content Networking: NextPage NXT 3
Enterprise Content Networking, an end-to-end
solution
62
4. Work Flow and Content Networking: NextPage NXT 3
Distributed Search and Navigation
63
4. Work Flow and Content Networking: NextPage NXT 3
Content Management
  • A Content AND Document Management Engine.
  • Secure and flexible workflow environment.
  • XML metadata support.
  • Optional Interfaces to MS Word, Arbortext EPIC,
    SoftQuad XMetaL, HyperVision WorX and other
    structured and unstructured editorial
    applications.
  • Manages XML Documents and subdocument components,
    multimedia objects and binary files.
  • Manages the relationships between content,
    metadata and collections of information objects.
  • Extensive standards-based API.

64
4. Work Flow and Content Networking (XIL of the
EPA History Web Site on a regular schedule)
Recall Slide 6. Website content will be archived
within the ERDMS.
65
5. XForms
  • FedForms.Gov
    (http://www.fedforms.gov/searchresults.cfm)
  • Eleven forms for EPA (two require Adobe Acrobat
    Reader).
  • XML Spy Document Editor Tutorial
  • XML-based e-forms development tools.
  • eXtensible Forms Description Language (XFDL)
  • Part of the W3C's HTML Activity.
  • Vendor Activity
  • Adobe: re-tooling PDF forms (Approval, Accelio
    Corp. acquisition, and support for online
    transactions).
  • PureEdge is a leading implementor (Air Force
    awards contract to convert 14,000 electronic
    forms into XML).
  • Blue Oxide's XML Design Tool (Beta) (similar to
    XML Spy or Turbo XML) for creation of
    data-oriented documents.

66
5. XForms
  • XML Spy Document Editor Tutorial
  • The aim of this tutorial is to fill in the
    OrgChart template supplied with XML Spy IDE.
  • This will be achieved by:
  • Entering data into the predefined tables.
  • Adding additional persons to the department
    table.
  • Adding a new company and filling in all the
    relevant data.
  • Prerequisites
  • The OrgChart template necessary for this tutorial
    is supplied with XML Spy IDE.
  • Any other templates you want to edit must have
    been created using XSLT Designer and saved there
    (thus creating a .sps file).
  • There is also a Datasheet Template.

67
5. XForms
  • W3C XForms 1.0 (January 18, 2002)
  • http://www.w3.org/TR/xforms/
  • Forms were introduced into HTML in 1993. Since
    then they have become a critical part of the Web.
    The existing mechanisms in HTML for forms are now
    outdated, and the W3C started work on developing
    an effective replacement. This document defines
    "XForms", W3C's name for the next generation of
    web forms.
  • XForms is an XML application that represents the
    next generation of Forms for the Web. By
    splitting traditional XHTML forms into three
    parts - data model, instance data, and user
    interface - it separates presentation from
    content, allows reuse, gives strong typing -
    reducing the number of round-trips to the server,
    as well as offering device independence and a
    reduced need for scripting. XForms is not a
    free-standing document type, but is intended to
    be integrated into other markup languages, such
    as XHTML.
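
The three-part split described above (data model, instance data, user interface) can be made concrete by generating a tiny XForms-in-XHTML document. This is a hedged sketch: it uses the namespace URI of the final XForms 1.0 Recommendation, which differs from the URI in the early 2002 drafts, and the instance element names (record, title, retainYears) are invented for illustration.

```python
import xml.etree.ElementTree as ET

# XHTML is the host language; XF is the XForms namespace of the final
# 1.0 Recommendation (an assumption relative to the 2002 draft cited here).
XHTML = "http://www.w3.org/1999/xhtml"
XF = "http://www.w3.org/2002/xforms"
ET.register_namespace("h", XHTML)
ET.register_namespace("xf", XF)


def build_form():
    """Build an XHTML page whose form is split into model, instance and UI."""
    html = ET.Element(f"{{{XHTML}}}html")
    head = ET.SubElement(html, f"{{{XHTML}}}head")
    body = ET.SubElement(html, f"{{{XHTML}}}body")

    # 1. Data model: declared once in the head, separate from presentation.
    model = ET.SubElement(head, f"{{{XF}}}model")

    # 2. Instance data: a plain XML fragment (element names are made up).
    instance = ET.SubElement(model, f"{{{XF}}}instance")
    record = ET.SubElement(instance, "record")
    ET.SubElement(record, "title")
    ET.SubElement(record, "retainYears")

    # 3. User interface: controls in the body point at instance nodes by XPath.
    fields = (("/record/title", "Title"),
              ("/record/retainYears", "Retain (years)"))
    for ref, label in fields:
        control = ET.SubElement(body, f"{{{XF}}}input", ref=ref)
        ET.SubElement(control, f"{{{XF}}}label").text = label
    return html


form_xml = ET.tostring(build_form(), encoding="unicode")
```

Because the controls only reference instance nodes, the same model could be reused with a different user interface, which is the reuse and device independence the W3C text refers to.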

68
5. XForms
  • Introductions (Micah Dubinko, Cardiff Software,
    Inc.)
  • What are XForms (January 16, 2002)
  • http://xml.com/pub/a/2001/09/05/xforms.html
  • Interactive Web Services with XForms (January 16,
    2002)
  • http://xml.com/pub/a/2001/09/26/xforms.html
  • W3C: The Next Generation of Web Forms
  • http://www.w3.org/MarkUp/Forms/
  • http://www.w3.org/2000/04/xforms-testimonial
  • Some Demos
  • Mozquito Technologies
  • http://www.mozquito.org/html/lang-english/xforms.html
  • X-Smiles
  • http://www.xsmiles.org/

69
6. Recommendations
  • Expand the ERDMS Strategy to include the
    additional standards and requirements and
    integration into the EPA Enterprise Architecture
    efforts that contemplate the addition of an XML
    Web Services Application Layer.
  • Implement an XML-based Document Framework for the
    Agency, including the need to implement the new
    XForms standards for agency forms and
    transactions.
  • Follow my lead in implementing XML Web Services
    standards and multiple requirements in my pilot
    projects with EPA content and in several e-Gov
    initiatives that involve re-purposing a variety
    of records collections.

70
7. Contact Information
  • Brand Niemann, Ph.D.
  • USEPA Headquarters, EPA West, Room 6143D
  • Office of Environmental Information, MC 2822T
  • 1200 Pennsylvania Avenue, NW, Washington, DC
    20460
  • 202-566-1657
  • niemann.brand_at_epa.gov
  • EPA: http://161.80.70.167
  • Outside EPA: http://130.11.44.140