Title: XML Web Services: Toxics Release Inventory
1 XML Web Services: Toxics Release Inventory
- Brand Niemann
- XML Web Services Evangelist
- Data Standards Branch
- January 12, 2002
Disclaimer: Any reference to or depiction of the
commercial product of any vendor is for
illustrative purposes only and does not
constitute an endorsement by EPA or the trainer.
2 Overview
- 1. Background
- 2. National Database to FileMaker XML
- 3. Web Pages and PDF to XML Documents
- 4. Data Tables to XML Data Islands
- 5. Some Future Steps
- 6. Questions and Answers
3 1. Background
- The Toxics Release Inventory (TRI), published by
the U.S. EPA, is a valuable source of information
regarding toxic chemicals that are being used,
manufactured, treated, transported, or released
into the environment.
- Two statutes, Section 313 of the Emergency
Planning and Community Right-To-Know Act (EPCRA)
and Section 6607 of the Pollution Prevention Act
(PPA), mandate that a publicly accessible toxic
chemical database be developed and maintained by
the U.S. EPA. This database, known as the Toxics
Release Inventory (TRI), contains information
concerning waste management activities and the
release of toxic chemicals by facilities that
manufacture, process, or otherwise use said
materials. Using this information, citizens,
businesses, and governments can work together to
protect the quality of their land, air and water.
4 2. National Database to FileMaker XML
- 2.1 FileMaker 5.5
- http://www.filemaker.com
- 2.2 Steps
- Download National.exe (16.7 MB) and extract.
- http://epa.gov/tri/tri99/data/
- Import each of the 4 files into FileMaker 5.5 (164
MB).
- Make the 4 files sharable on the Web.
- Use the FileMaker URL syntax for XML output.
- 2.3 Interface Customization Possibilities.
- http://www.filemaker.com/products/fmu_home.html
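The "FileMaker URL syntax for XML output" step can be sketched as a small URL builder. This is a minimal sketch, not FileMaker's own tooling: the `-db`, `-format`, and `-findall` parameter names are an assumption based on FileMaker 5.x web URL conventions, and the host and database file name are placeholders borrowed from the localhost example later in the deck.

```python
from urllib.parse import quote


def filemaker_xml_url(host: str, database: str, xml_format: str = "-dso_xml") -> str:
    """Build a find-all request URL that asks FileMaker for XML output."""
    # -findall takes no value, so the query string is assembled by hand
    # instead of with urllib.parse.urlencode.
    return (
        f"http://{host}/FMPro"
        f"?-db={quote(database)}"
        f"&-format={xml_format}"
        f"&-findall"
    )


# Placeholder host and database name (assumptions for illustration).
url = filemaker_xml_url("localhost", "tri99table1.fp5")
print(url)
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, should return every record of the shared database as XML, provided the database is shared on the Web as described in the steps above.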
5 2.1 FileMaker 5.5
- From a subsidiary of Apple Computer, with powerful
desktop database functionality that supports
multiple platforms, including the Web.
- The workgroup database of choice within
organizations: more than 65% of the 1.2 million
units shipped in 2000-2001 were volume license
sales (second to Microsoft Access).
- Third-party developer resources
- Macromedia Dreamweaver
- Adobe GoLive
- Allaire ColdFusion
6 2.1 FileMaker 5.5 Database-to-XML
7 2.2 1999 Toxics Release Inventory (TRI) Data Files
- File Type 1: Facility, Chemical, Releases and
Other Waste Management Summary Information. This
file contains facility information (Part I on
Form R and Form A) as well as most chemical
information (Part II on Form R and Form A). Data
elements are reported individually. The
information is also disaggregated based on Waste
Management code (i.e., "M" code), and aggregated
up to On-site Releases, Off-site Releases, Other
On-site Waste Management, and Transfers Off-site
for Further Waste Management categories. (84,079
records)
- File Type 2: Detailed Waste Management and Source
Reduction Activities. This file contains
facility information (Part I on Form R and Form
A) as well as the detailed information regarding
source reduction and recycling activities (Part
II, Section 8 on Form R) and on-site waste
treatment methods (Part II, Section 7 on Form R).
(84,079 records)
- File Type 3A: Details of Transfers Off-site. This
file contains facility information (Part I on
Form R and Form A) as well as details of
individual transfers off-site (Part II, Section
6.2 on Form R). (100,033 records)
- File Type 3B: Details of Transfers to Publicly
Owned Treatment Works (POTW). This file contains
facility information (Part I on Form R and Form
A) as well as a list of POTWs (Part II, Section
6.1.B on Form R). (84,079 records)
8 2.2 TRI National File Type 1 in FileMaker 5.5
9 2.2 TRI National File Type 1 in Web Browser
10 2.2 TRI National File Type 1 in Web Browser
11 2.2 TRI National File Type 1 in IE 6 (XML)
12 2.3 Interface Customization Possibilities
- Change default.htm to your own.
- Use your own stylesheet (XSL). This requires the
Developer version.
- Use HTML and Java to build a Web application or
portal.
- Local Emergency Planning Committee database
- http://www.epa.gov/ceppo/lepclist.htm
- List of Lists database
- http://130.11.53.73/lol/
- Population Estimation from Year 2000 Census
Blocks
- http://198.246.85.108:591/population
13 3. PDF and Web Pages to XML Documents
- 3.1 Content Re-design and Re-publishing.
- 3.2 Repurposing PDF to Excel.
- 3.3 Repurposing PDF to XML.
- 3.4 Repurposing PDF to Folio Views.
- 3.5 NextPage Folio Views, LivePublish, and NXT 3.
- 3.6 Comments.
14 3.1 Content Re-design and Re-publishing
- Background
- backgrd_factors.pdf
- Database
- National.exe
- Press
- 40 PDF files at http://epa.gov/tri/tri99/press/press.htm
- Tri99press.xsl (34 tables)
- Previous
- Tri97.nfo and tri97.xls
- Questions and Answers
- Qa.pdf (file error)
- Report
- 1999pdr.pdf, completereport.pdf,
sfs_introduction.pdf (Tri99.xsl, 23 tables).
15 3.1 Content Re-design and Re-publishing
16 3.2 Repurposing PDF to Excel
- See Adobe Acrobat Help pages 103-109 and 82-84.
- See next two slides for background.
- Do Edit, Preferences, Text/Formatted Text
Preferences, Default Selection Type: Table, Okay.
- Select the Table/Formatted Text Select Tool and
draw a box around the table to be converted.
- Do Edit, Copy (or Ctrl-C).
- In a blank Excel worksheet do Edit, Paste
(Ctrl-V).
- Results: tri99.xls and tri99press.xls.
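The copy-and-paste steps above move a table through the clipboard as plain text. As a rough illustration of what happens to that text, here is a minimal sketch that splits pasted table text into rows and writes it out as CSV. It assumes the copied table arrives as tab-separated lines, which is an assumption about the clipboard format rather than documented Acrobat behavior; the sample column names and values are made up.

```python
import csv
import io


def clipboard_table_to_rows(text: str) -> list[list[str]]:
    """Split pasted table text into rows of trimmed cell strings."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between table rows
        rows.append([cell.strip() for cell in line.split("\t")])
    return rows


def rows_to_csv(rows: list[list[str]]) -> str:
    """Serialize rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for text copied from a PDF table.
copied = "State\tFacilities\nNevada\t1\n"
rows = clipboard_table_to_rows(copied)
print(rows_to_csv(rows))
```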
17 Acrobat 5.0 Repurposing and Extracting
- Acrobat 5.0 gives you powerful commands for
repurposing or extracting text and graphics in
PDF files. You can use the Save As command to save
all text in a PDF file in Rich Text Format (RTF)
for import into your favorite authoring
application. If your PDF files use tagged Adobe
PDF, you can extract the text without losing the
formatting. For example, you can save pages of
tables from a PDF file for import into an
application such as Adobe FrameMaker or Microsoft
Word and the table formatting will be preserved.
Both PDFMaker and Acrobat Web Capture create
tagged Adobe PDF automatically. (See "About the
different types of Adobe PDF documents" on the next
slide.) You can also use the Save As command to
save each page in a PDF file to an image format.
You can use the Export command to export all
images in a PDF file; each image is saved in a
separate file. In addition, Acrobat provides
several tools (the text select tool, the column
select tool, the table/formatted text select
tool, and the graphics select tool) for copying
and pasting small amounts of text and graphics
from a PDF file to your clipboard. You can also
paste text from a PDF document into a comment or
bookmark name. While in a PDF document, you
select the text or graphic and copy it onto the
clipboard. Once the text or graphic is on the
clipboard, you can launch the other application
and paste the text or graphic into a file.
18 About the different types of Adobe PDF documents
- There are three types of Adobe PDF documents:
unstructured, structured, and tagged. These
document types differ in what they contain and
how their contents can be repurposed. In general,
the more structural information the Adobe PDF
document contains, the more options you have for
repurposing its contents.
- 1. Unstructured Adobe PDF: You can save
unstructured Adobe PDF files to other formats
such as RTF with good results. An unstructured
Adobe PDF file saved to RTF recognizes
paragraphs, but not basic text formatting, lists,
or tables. You can't reflow unstructured Adobe PDF
files into different-sized devices, such as eBook
reading devices. Unstructured Adobe PDF files
aren't reliably accessible using a screen reader
for Windows.
- 2. Structured Adobe PDF: You can save structured
Adobe PDF files to other formats such as RTF with
results that are better than unstructured Adobe
PDF files but not as good as tagged Adobe PDF
files. Structured Adobe PDF files saved to RTF
recognize paragraphs and basic text formatting,
but not lists or tables. You can't reflow
structured Adobe PDF files into different-sized
devices. Structured Adobe PDF files can be
accessed using a screen reader for Windows, but
without the reliability of tagged Adobe PDF
files.
- 3. Tagged Adobe PDF: You can save tagged Adobe
PDF files to other formats such as RTF with the
best results, including the recognition of
paragraphs, basic text formatting, lists, and
tables. You can reflow tagged Adobe PDF files so
that they're readable in different-sized
devices. Tagged Adobe PDF files have been
optimized for accessibility, so they can be
accessed reliably using a screen reader for
Windows.
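The three PDF types above amount to a small capability matrix. The sketch below restates that matrix as a Python lookup table so the differences can be queried; it only encodes what the slide says and is not part of any Adobe API. The class, field, and function names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfCapabilities:
    rtf_features: frozenset  # what an RTF export recognizes
    reflowable: bool         # can reflow to different-sized devices
    screen_reader: str       # accessibility with a Windows screen reader


PDF_TYPES = {
    "unstructured": PdfCapabilities(
        frozenset({"paragraphs"}), False, "not reliable"),
    "structured": PdfCapabilities(
        frozenset({"paragraphs", "basic text formatting"}),
        False, "less reliable than tagged"),
    "tagged": PdfCapabilities(
        frozenset({"paragraphs", "basic text formatting", "lists", "tables"}),
        True, "reliable"),
}


def types_preserving(feature: str) -> list[str]:
    """Return the PDF types whose RTF export recognizes the given feature."""
    return [name for name, caps in PDF_TYPES.items()
            if feature in caps.rtf_features]


print(types_preserving("tables"))
```

Querying for "tables" returns only the tagged type, which is why the table-extraction steps in this section depend on tagged PDF.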
19 3.2 Repurposing PDF to Excel
20 3.2 Repurposing PDF to Excel
21 3.3 Repurposing PDF to XML
- Adobe PDF Document as HTML
- http://access.adobe.com/simple_form.html
- Save As XML Plug-In for Windows (B2)
- http://www.adobe.com/support/downloads/detail.jsp?hexID=89a2
- Install, then do Help, About Adobe Acrobat
Plug-ins and select SaveAsXML.
- Do File, Save as, XML-1.00 without styling
(.xml) or XHTML-1.00 with CSS-1.00 (.htm).
(Note: Must be a tagged Acrobat PDF.)
- See SaveAsXML Developer Information for Creating
and Modifying Mapping Tables (DeveloperInfo.pdf).
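Once a tagged PDF has been saved as XML or XHTML, its tables can be read with any XML parser. The following is a minimal sketch using Python's standard library. The element names (`table`, `tr`, `th`, `td`) and the sample content are assumptions for illustration; the actual markup depends on the mapping table the plug-in uses, and real XHTML output would normally carry a namespace that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Made-up, namespace-free stand-in for a saved XHTML table.
SAMPLE = """<html><body><table>
<tr><th>State</th><th>Facilities</th></tr>
<tr><td>Nevada</td><td>1</td></tr>
</table></body></html>"""


def extract_table_rows(xml_text: str) -> list[list[str]]:
    """Collect the text of every header and data cell, row by row."""
    root = ET.fromstring(xml_text)
    rows = []
    for tr in root.iter("tr"):
        cells = [
            "".join(cell.itertext()).strip()
            for cell in tr
            if cell.tag in ("th", "td")
        ]
        rows.append(cells)
    return rows


print(extract_table_rows(SAMPLE))
```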
22 3.3 Repurposing PDF to XML
23 3.3 Repurposing PDF to XML
24 3.3 Repurposing PDF to XML
25 3.3 Repurposing PDF to XML
26 3.4 Repurposing PDF to Folio Views
- Imports major word processing and Web formats.
- Use Adobe Acrobat 5.0.5.
- Not the free Acrobat Reader.
- Do File, Open as Adobe PDF, then File, Save as,
RTF.
- Use Folio Views 4.2.
- Do File, New and give it a name, Open or File,
Import, select RTF, Open.
- Also do File, Import URL for Web formats.
- Apply structure, links, formatting, etc. using
the GUI.
27 3.4 Repurposing PDF to Folio Views
28 3.4 Repurposing PDF to Folio Views
29 3.5 NextPage Folio Views, LivePublish, and NXT 3
- NextPage: http://www.nextpage.com
- Folio Views: SGML-like markup (pre-XML) in a
GUI.
- CD-ROM distribution.
- Web Server (Markup-to-HTML on the fly).
- LivePublish: Basic XML support (uses DTD; see
next slide).
- Site Administrator.
- Personal Edition (Desktop and CD-ROM).
- Web Server (Markup-to-HTML on the fly).
- NXT 3: Advanced support for XML (LivePublish
plus XSL, SOAP, etc.; see later slide).
- Content Network Manager.
- Content Network Server.
30 3.5 NextPage LivePublish
- Uses of XML (see separate handout)
- Serve up native XML.
- Convert XML to HTML using a CSS or XSL at run
time using the Display Filter API.
- Convert XML to HTML at build time.
- Uses an XML-based file to define site look and
feel.
- The build Makefiles are XML files that define the
structure and contents of the information
collections.
- XML-based legacy conversion tools simplify the
conversion of existing content into HTML.
- Indexsheets (XIL) define and control the indexing
of content, just as stylesheets (XSL) define and
control the formatting (see separate handout).
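To make the "convert XML to HTML at build time" idea concrete, here is a minimal sketch in plain Python. It is a stand-in for illustration only: it does not use XSL, CSS, or the LivePublish Display Filter API, and the record structure and field names in the sample are made up.

```python
import xml.etree.ElementTree as ET
from html import escape

# Made-up sample records; real content would come from the site's XML files.
SAMPLE_XML = (
    "<records>"
    "<record><State>Nevada</State><Facilities>1</Facilities></record>"
    "</records>"
)


def xml_to_html_table(xml_text: str) -> str:
    """Render flat XML records as an HTML table, one row per record."""
    records = list(ET.fromstring(xml_text))
    if not records:
        return "<table></table>"
    # Column order is taken from the first record's child elements.
    headers = [child.tag for child in records[0]]
    lines = ["<table>"]
    lines.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in headers) + "</tr>")
    for record in records:
        cells = "".join(
            f"<td>{escape(record.findtext(h, default=''))}</td>" for h in headers
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html_table = xml_to_html_table(SAMPLE_XML)
print(html_table)
```

Running a conversion like this once, when the site is built, trades flexibility for speed: the server then delivers static HTML instead of transforming XML on every request.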
31 3.5 NextPage Folio Views
32 3.5 NextPage LivePublish Site Administrator
33 3.5 NextPage LivePublish Personal Edition
34 3.5 NextPage LivePublish Personal Edition
35 3.5 NextPage LivePublish Web Server
36 3.5 NextPage NXT 3 Content Network
- NextPage Web Services White Paper
- NXT 3 has been delivering XML Web Services since
July 2000, based on early SOAP recommendations
made before SOAP became a standard.
- NextPage is developing full support for the SOAP,
WSDL, and UDDI standards and for conforming Web
service frameworks such as .NET and Sun ONE
(Java).
- Basic XML Web services provide low-level
communication, and NXT 3 provides high-level data
coordination when intelligent evaluation of
distributed content and collaborative
capabilities in the context of business processes
are needed (just-released Matrix).
37 3.5 NextPage NXT 3 Content Network Manager
38 3.5 NextPage NXT 3 Content Network Web Server
39 3.6 Comments
- Previous work
- 1995 Folio Views Infobase and Excel files.
- TRI 1997 CD-ROM Users Guide Infobase.
- Could add Year 2000 easily to Year 1999.
- Organized files by folders for indexing with the
NXT 3 File Service (recall section 3.1 screen
capture and see next slide).
- Can/should create tagged PDF files when you use
Acrobat PDFMaker 5.0 to create PDF files from
within Microsoft Office 2000 applications.
40 3.6 Discussion
41 4. Excel Data Tables to XML Data Islands
- 4.1 Excel-to-XML and XML-to-Excel Round-tripping.
- 4.2 XML Spy 4.2.
- 4.3 Application of XML Step by Step, Second
Edition, Data Binding.
- 4.4 Comments.
42 4.1 Excel-to-HTML (XML) and HTML (XML)-to-Excel
Round-tripping
- In Excel do File, Save as Web Page, select
Republish Sheet, Publish, Open in Browser,
Publish.
- In IE 5 or 6 do View Source and explore the
XML-like markup.
- In Excel do File, Open, Files of type: Web pages.
43 4.2 Data Tables to XML Data Islands
- XML Spy 4.2 (see Tutorial)
- Copying XML data to and from third-party
products
- XML Spy allows you to easily copy data to and
from third-party products.
- The "Copy as Structured Text" command copies
elements to the clipboard as they appear on
screen. This command is useful for copying
table-like data from the Enhanced Grid View as
well as the integrated Database/Table View.
- The copied data can be used within XML Spy as
well as third-party products, enabling you to
transfer XML data to spreadsheet-like
applications (e.g. Microsoft Excel).
44 4.3 Application of XML Step by Step, Second
Edition, Data Binding
- Re-format Excel worksheet with appropriate field
names (Upper Camel Case). (See next slide)
- Import to FileMaker 5.5 using field names.
- Query FileMaker on the Web for XML output
- http://localhost/FMPro?-db=tri99table1.fp5&-format=-dso_xml&-findall
- Add the XML output as a data island in the HTML
file and display in IE5-6.
- See tri1999table1.xml and tri1999table1.htm
45 4.3 Application of XML Step by Step, Second
Edition, Data Binding
46 4.3 Application of XML Step by Step, Second
Edition, Data Binding
47 4.3 Application of XML Step by Step, Second
Edition, Data Binding
- XML output, first record (element markup not
recoverable): Nevada, 1, 1529022, 1868475, 136431,
2797, 1.1647E09, 1.1682E09, 212998, 1.1684E09, ..
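The final step of this section embeds the XML output in an HTML page as a data island and lets Internet Explorer 5 or 6 bind a table to it. The sketch below generates such a page. It is an illustration under stated assumptions: the island id, the record structure, and the field names are hypothetical, and the `datasrc`/`datafld` binding only works in those older versions of Internet Explorer.

```python
def build_data_island_page(xml_records: str, island_id: str, fields: list[str]) -> str:
    """Wrap XML records in a data island and bind an HTML table to it."""
    header = "".join(f"<th>{name}</th>" for name in fields)
    # One template row; the browser repeats it for every record in the island.
    cells = "".join(f'<td><span datafld="{name}"></span></td>' for name in fields)
    return "\n".join([
        "<html>",
        "<body>",
        f'<xml id="{island_id}">',
        xml_records,
        "</xml>",
        f'<table datasrc="#{island_id}" border="1">',
        f"<thead><tr>{header}</tr></thead>",
        f"<tr>{cells}</tr>",
        "</table>",
        "</body>",
        "</html>",
    ])


# Hypothetical record structure and field names.
records = "<Table1><Row><State>Nevada</State><Facilities>1</Facilities></Row></Table1>"
page = build_data_island_page(records, "dsoTri", ["State", "Facilities"])
print(page)
```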
48 4.3 Application of XML Step by Step, Second
Edition, Data Binding
49 5. Some Future Steps
- 5.1 Microsoft Excel 2002 lets you open or save
workbooks in XML format.
- 5.2 Access 2002 allows you to create a database
table by importing an XML document or to export a
database table or other object to an XML
document.
50 5.1 Microsoft Excel 2002
- Source: Chapter 15, Publishing Information on the
Web, Step by Step Microsoft Excel 2002
- Previously, Excel 2000 workbooks and worksheets
could be saved as Web files and queries could
bring Web data into workbooks.
- Excel 2002 extends those capabilities by
providing live links from Excel to Web files and
by providing import and export of XML and Smart
Tags (e.g., have Excel look for known stock
symbols and connect to a Web site that has
information related to that symbol).
51 5.1 Microsoft Excel 2002
- Working with Structured Data
- XML can identify rows and cells within the
spreadsheet and allow spreadsheet data to move
freely to other applications.
- Do File, Save As, Save as type, select XML
Spreadsheet (.xml), and click Save. Click Yes
when the message box appears.
- Open the XML file in XML Spy to examine its
structure and content (PivotXML.xml).
- Open the XML file in Excel 2002 to see it
re-displayed.
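A file saved as XML Spreadsheet can also be examined programmatically. The following minimal sketch reads worksheet rows from that format with Python's standard library. The sample workbook content is made up, and the sketch assumes the usual Workbook, Worksheet, Table, Row, Cell, Data nesting in the `urn:schemas-microsoft-com:office:spreadsheet` namespace; it ignores sparse cells and formatting.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}

# Made-up sample in the XML Spreadsheet layout.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Sheet1">
  <Table>
   <Row><Cell><Data ss:Type="String">State</Data></Cell>
        <Cell><Data ss:Type="String">Facilities</Data></Cell></Row>
   <Row><Cell><Data ss:Type="String">Nevada</Data></Cell>
        <Cell><Data ss:Type="Number">1</Data></Cell></Row>
  </Table>
 </Worksheet>
</Workbook>"""


def read_xml_spreadsheet(xml_text: str) -> dict[str, list[list[str]]]:
    """Return {worksheet name: rows of cell text} for an XML Spreadsheet."""
    root = ET.fromstring(xml_text)
    sheets = {}
    for worksheet in root.findall("ss:Worksheet", NS):
        name = worksheet.get(f"{{{SS}}}Name")
        rows = []
        for row in worksheet.findall("ss:Table/ss:Row", NS):
            rows.append([
                cell.findtext("ss:Data", default="", namespaces=NS)
                for cell in row.findall("ss:Cell", NS)
            ])
        sheets[name] = rows
    return sheets


sheets = read_xml_spreadsheet(SAMPLE)
print(sheets)
```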
52 5.1 Microsoft Excel 2002
53 5.2 Access 2002
- Source: Chapter 3, Getting Information Into and
Out of a Database, Step by Step Microsoft Access
2002
- Best practices
- Link to other databases rather than import, so you
can view and edit in both systems.
- Share databases by exporting to XML (universal
format).
- http://office.microsoft.com/assistance/2002/articles/acExOfScenariosUsingXML.aspx
- Import
- Open Access 2002 database.
- File, Get External Data, Import, Files of type,
XML Documents, Import both XML and XSD, select
file to be imported, Import, Import XML, Options,
Structure and Data, Okay.
- Open and view database tables to confirm data was
imported.
54 5.2 Access 2002
- Exporting to other applications
- Works for Table, Query, Form, and Report.
- Open Access 2002 database and select a table.
- File, Export, select XML Documents, Save as type,
Export, Export XML, select both Data (XML) and
Schema (XSD) of the data, Okay.
- See screen captures on next pages.
- See Advanced, Schema tab and select appropriate
option.
- Look at XML and XSD files (see examples below) in
XML Spy 4.2
- Orders.xml, Order Details.xml, and Order
Details.xsd.
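An exported table can be read back outside Access with a few lines of code. The sketch below is a minimal example that assumes the export wraps records in a `dataroot` element with one child element per row, named after the table. The sample values are made up, and real exports may differ in detail (for example, table names containing spaces are encoded differently in element names).

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed export layout.
SAMPLE = """<dataroot xmlns:od="urn:schemas-microsoft-com:officedata">
  <Orders><OrderID>10248</OrderID><ShipCountry>France</ShipCountry></Orders>
  <Orders><OrderID>10249</OrderID><ShipCountry>Germany</ShipCountry></Orders>
</dataroot>"""


def read_access_export(xml_text: str, table: str) -> list[dict[str, str]]:
    """Return one dict per exported row, keyed by field name."""
    root = ET.fromstring(xml_text)
    return [
        {field.tag: (field.text or "") for field in record}
        for record in root.findall(table)
    ]


orders = read_access_export(SAMPLE, "Orders")
print(orders)
```

All values come back as strings; the accompanying XSD file is where the column data types are recorded.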
55 5.2 Access 2002
56 5.2 Access 2002
57 5.2 Access 2002
58 6. Questions and Answers
- Brand Niemann, Ph.D.
- USEPA Headquarters, EPA West, Room 6143D
- Office of Environmental Information, MC 2822T
- 1200 Pennsylvania Avenue, NW, Washington, DC
20460
- 202-566-1657
- niemann.brand_at_epa.gov
- EPA: http://161.80.70.167
- Outside EPA: http://130.11.44.140