Title: Peer-to-Peer (P2P) XML Web Services for Digital Libraries
1Peer-to-Peer (P2P) XML Web Services for Digital
Libraries
- Tutorial for the
- Nevada Library Association Conference
- Reinventing the Roots of Tradition
- October 3 and 4, 2002, 12:30 p.m. to 2 p.m.
- Brand Niemann, Ph.D.
- US Environmental Protection Agency (MC 2822T)
- Washington, DC 20460
- niemann.brand_at_epa.gov
- 202-566-1657
2Abstract
- This tutorial will consist of two separate but
related 1.5-hour presentations (including hands-on
steps) focused on a collection of sites from
the Internet, organized by topic as follows:
- Part 1: Creating individual electronic card
catalogs of Web resources using low-cost,
easy-to-learn software that supports XML Web
Services on the Internet.
- Part 2: Integrating individual electronic card
catalogs of Web resources and the actual content
distributed across the Internet using
Peer-to-Peer XML Web Services.
3Overview
- 1. Introduction
- 1.1 Digital Libraries (excerpts from booklet).
- 1.2 XML Web Services (in 7 points).
- 1.3 Peer-to-Peer (the basics).
4Overview (continued)
- 2. Part 1 (creating card catalogs) (today)
- 2.1 Software installation.
- 2.2 Building a simple database of library Web
resources.
- 2.3 Export to XML and bind to an HTML Web page.
- 2.4 Share database on the Web.
- 2.5 More advanced applications.
5Overview (continued)
- 3. Part 2 (integrating card catalogs) (tomorrow)
- 3.1 Review of Part 1 (repeat key steps and add
more records).
- 3.2 Content Networks (tour of a distributed
digital library).
- 3.3 Building a NLA Digital Library Content
Network (brainstorming session).
- 3.4 Questions and Answers (your turn).
61.1 Digital Libraries
- Digital Libraries: Universal Access to Human
Knowledge, February 2001, 16 pp., Report to the
President from the President's Information
Technology Advisory Committee Panel on Digital
Libraries
- Digital libraries can and should be an essential
resource for human learning and development in
the new century. Digital libraries should provide
universal access.
- Provide Federal funding to make all public
Federal content persistently available in digital
form on the Internet.
71.2 XML Web Services
- The simple answer
- eXtensible Markup Language
- The more detailed answer
- a meta language
- text-based and easy to read
- ideal for structured documents
- presentation neutral
- multilingual
- helps integration of business
- open
- See http://www.softwareag.com/tamino/xml_reasons.htm
81.2 XML Web Services
- XML is a meta language.
- XML can define and describe any kind of
information (e.g. documents, databases, graphics,
etc.).
- XML is text-based and easy to read.
- Documents can be read by applications and humans
(plain ASCII or Unicode text).
91.2 XML Web Services
- XML is ideal for structured documents.
- XML documents are hierarchically structured
(document elements can be nested to build complex
information structures).
- XML is presentation neutral.
- XML separates document content from presentation
(documents can be formatted for output on a
variety of display or other devices).
10Parts of a Well-Formed XML Document
- XML Declaration
- Comment
- White Space
- Processing Instruction (href="Inventory01.css")
- End of Prolog
- White Space
- Document Element (Root Element), containing:
- The Adventures of Huckleberry Finn, Mark Twain,
mass market paperback, 298, 5.49
- The Turn of the Screw, Henry James
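The parts named on this slide can be sketched in Python as follows. The markup itself did not survive in this transcript, so the element names (INVENTORY, BOOK, TITLE, AUTHOR, BINDING, PAGES, PRICE), the comment text, and the declaration attributes are assumptions for illustration. Only the stylesheet name Inventory01.css and the book values come from the slide.

```python
import xml.etree.ElementTree as ET

# Element names and the comment text are assumptions, not the slide's markup.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<!-- Book inventory (this line is the comment) -->
<?xml-stylesheet type="text/css" href="Inventory01.css"?>
<INVENTORY>
  <BOOK>
    <TITLE>The Adventures of Huckleberry Finn</TITLE>
    <AUTHOR>Mark Twain</AUTHOR>
    <BINDING>mass market paperback</BINDING>
    <PAGES>298</PAGES>
    <PRICE>5.49</PRICE>
  </BOOK>
  <BOOK>
    <TITLE>The Turn of the Screw</TITLE>
    <AUTHOR>Henry James</AUTHOR>
  </BOOK>
</INVENTORY>
"""


def is_well_formed(xml_text):
    """Return True if a standard XML parser accepts the text."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def list_books(xml_text):
    """Return (title, author) for every BOOK element under the root."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [(b.findtext("TITLE"), b.findtext("AUTHOR")) for b in root.iter("BOOK")]
```

Everything before the root element is the prolog; the single root element holds the nested records.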
11A Simple Example Searching for Information
- Most services are invoked by entering data into
HTML forms and sending the data to the service,
embedded within a URL string, to match the given
text strings to catalogued HTML pages
- http://www.google.com/search?q=Skate+boots&btnG=Google+Search
- XML is a better way to send the data
- ice
- Skate
- boots
- size 7.5
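To make the contrast concrete, here is a small Python sketch that sends the same request both ways. The element names in the XML version (search, surface, keyword, size) are hypothetical, since the slide's tags were lost in this transcript.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def as_url(base, params):
    """Form-style request: all values are flattened into one query string."""
    return base + "?" + urlencode(params)


def as_xml(fields):
    """XML-style request: each value keeps its own label (names are hypothetical)."""
    root = ET.Element("search")
    for name, value in fields:
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


url = as_url("http://www.google.com/search",
             {"q": "Skate boots", "btnG": "Google Search"})
message = as_xml([("surface", "ice"), ("keyword", "Skate"),
                  ("keyword", "boots"), ("size", "7.5")])
```

In the URL form the service sees only a string of words; in the XML form it can tell that 7.5 is a size rather than another search term.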
121.2 XML Web Services
- XML is multilingual.
- XML is based on Unicode (represents characters in
almost all the world's languages).
- XML helps integration of business.
- XML lowers the barriers of worldwide e-business
networks (simpler and cheaper than current
Electronic Data Interchange solutions).
- XML is open.
- Standards are supported by all the major vendors
(increased interoperability).
131.2 XML Web Services
- Recommended Resource
- Introduction to XML Video
- http://www.synthbank.com/xmlvideo.htm
- Cost: $49.95 plus shipping and handling.
- See my Unit 2 for notes and slides at
http://130.11.44.140
141.3 Peer-to-Peer
- P2P computing has been around in one form or
another for almost thirty years.
- P2P is the direct connection of any two computers
over the Internet without the use of another
server as middleman to manage the interaction.
- It seriously challenges the old client/server
paradigm and promises to undermine the rule of
today's Internet and enterprise client/server
networks. Every computer is a client and a server
(equivalent capabilities and responsibilities).
- Probably the most important implementation of P2P
technology is file sharing (can you say
Napster!).
- P2P enables online collaboration (online gaming,
the Writable Web, etc.).
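The idea that every computer is both a client and a server can be sketched in a few lines of Python. This is an in-memory illustration only, with no networking; the class name, method names, and file names are all hypothetical.

```python
class Peer:
    """A node that both answers requests (server role) and issues them (client role)."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)  # file name -> content held by this peer

    def serve(self, filename):
        """Server role: return the requested file, or None if this peer lacks it."""
        return self.files.get(filename)

    def fetch(self, other, filename):
        """Client role: ask another peer directly, with no middleman server."""
        content = other.serve(filename)
        if content is not None:
            self.files[filename] = content  # keep a local copy
        return content


alice = Peer("alice", {"catalog.xml": "<catalog/>"})
bob = Peer("bob", {"notes.txt": "hello"})
alice.fetch(bob, "notes.txt")    # alice acts as client, bob as server
bob.fetch(alice, "catalog.xml")  # the roles reverse
```

Both objects are instances of the same class, which is the point: neither one is "the server".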
151.3 Peer-to-Peer
- Recommended Resources
- Discovering P2P, Michael Miller, Sybex Inc.,
2002, 462 pp.
- O'Reilly P2P and Web Services Conference, 2001,
http://conferences.oreilly.com/p2p/
- 2001 P2P Networking Overview,
http://www.oreilly.com/catalog/p2presearch/
- P2P: Harnessing the Power of Disruptive
Technologies, 2001,
http://www.oreilly.com/catalog/peertopeer/
162.1 Software installation
- Go to http://www.filemaker.com/downloads/index.html
- Click FileMaker Trial Software.
- Click FileMaker Pro 6 Trial Software.
- Complete the registration form.
- Unzip the file and complete the installation.
- Launch FileMaker Pro 6.
17Options Use Templates or not
18Template Information
19Photo Catalog Template
- About FileMaker Pro templates
- These sample files are provided to help you get
results quickly with FileMaker Pro. You can use
the files immediately by creating a new record.
Or, you can customize the files by adding or
changing fields or layouts. Click the buttons at
the top to view layouts and reports.
- Description
- Use this database file to keep track of your
rolls of film or home video and movies.
- How to use this template
- Add your own collection to this database file.
Number your rolls of film and videos and include
a brief description of what is on the tape or
video.
- Tips for customizing this template
- You can coordinate this database with the Photo
Album database by adding a Film ID field in the
Photo Album database to identify the roll of
film.
- In the same way, you can add a Film ID field to
the Video Library database to add your own home
videos to the library.
- Printing a report
- Click 'View Library Report'. Choose File menu >
Print. After printing, click the Continue button.
20Other Templates of Interest
- Film Library
- Inventory
- Music Database
- Photo Catalog
- Product Catalog
21Create an Empty New File
222.2 Building a simple database of library Web
resources
232.2 Building a simple database of library Web
resources
- Enter
- Category (Text)
- Title (Text)
- Date (Date)
- URL (Text)
- Comments (Text)
- Action (Text)
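For readers who think in code, the six fields above map onto a simple record type. This Python sketch is only an analogy for the FileMaker field definitions; the sample title and URL are placeholders, not entries from the tutorial database.

```python
import datetime
from dataclasses import dataclass, asdict


@dataclass
class CatalogRecord:
    """One card-catalog entry, mirroring the field names and types on the slide."""
    category: str
    title: str
    date: datetime.date
    url: str
    comments: str = ""
    action: str = ""


record = CatalogRecord(
    category="Family Literacy",
    title="Example resource",        # placeholder value
    date=datetime.date(2002, 10, 3),
    url="http://www.example.org/",   # placeholder value
)
fields = asdict(record)  # plain dict, convenient for later export
```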
242.2 Building a simple database of library Web
resources
- Categories (Susan Graf)
- Grant Funders for Literacy
- Health Literacy for Adults with Limited English
Proficiency
- Citizenship
- Family Literacy
302.3 Export to XML and bind to an HTML Web page
352.3 Export to XML and bind to an HTML Web page
- The best of both worlds
- Store data using XML.
- Display and work with the information using HTML.
- Two main steps (simple with many variations)
- Link an XML document to an HTML table.
- Bind standard HTML elements (SPANs or TABLEs) to
individual XML elements or attributes.
- Works best with an XML document that is
symmetrical like a typical database. Otherwise
use scripting techniques.
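The slide describes binding done in the Web page itself. The Python sketch below is not that mechanism; it is a stand-in that shows the same mapping (one table row per record, one cell per XML element) for a symmetrical document. The catalog and record element names and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical export: every record has the same child elements ("symmetrical").
XML_EXPORT = """<catalog>
  <record><Category>Citizenship</Category><Title>Example A</Title><URL>http://www.example.org/a</URL></record>
  <record><Category>Family Literacy</Category><Title>Example B</Title><URL>http://www.example.org/b</URL></record>
</catalog>"""


def xml_to_html_table(xml_text, columns):
    """Render one table row per record and one cell per named child element."""
    root = ET.fromstring(xml_text)
    header = "".join("<th>%s</th>" % escape(c) for c in columns)
    lines = ["<table>", "  <tr>" + header + "</tr>"]
    for record in root.findall("record"):
        cells = "".join(
            "<td>%s</td>" % escape(record.findtext(c, default="")) for c in columns
        )
        lines.append("  <tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


table = xml_to_html_table(XML_EXPORT, ["Category", "Title", "URL"])
```

A record that lacks one of the columns simply yields an empty cell, which is why a symmetrical document gives the cleanest table.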
362.3 Export to XML and bind to an HTML Web page
http://130.11.44.140/tri99table1.htm (do View
Source)
372.4 Share database on the Web
382.4 Share database on the Web
Port 591
392.4 Share database on the Web
See FMP 6 Web Publishing Security Guidelines!
402.4 Share database on the Web
http://xxx.xxx.xxx:591
http://localhost:591
432.5 More advanced applications
- 2.5.1 EPA Local Emergency Planning Committee
Database
- 2.5.2 USGS Photo Library
- 2.5.3 Digital Talking Books (see booklet with
CD-ROM in back)
442.5.1 EPA Local Emergency Planning Committee
Database-Web
- Enter your zip code to retrieve environmental
information about your community
- Local Emergency Planning Committees (LEPCs)
provide a forum for emergency management
agencies, responders, industry and the public to
work together to understand chemical hazards in
the community, develop emergency plans in case of
an accidental release, and always look for ways
to prevent chemical accidents. Local industries
must provide information to LEPCs about chemical
hazards. LEPCs are required by law to make this
information available to any citizen who requests
it. You can make a difference by attending an
LEPC meeting or joining your LEPC.
- Please Note: Currently we have over 3000 listings
in our LEPC Database. It is our goal to provide
the most current and accurate information. We
look to the LEPC community to help us
successfully meet this goal. Please forward any
changes or corrections to Dana Robinson. These
changes will be incorporated and updated monthly.
452.5.1 EPA Local Emergency Planning Committee
Database-Web
http://www.epa.gov/ceppo/lepclist.htm
462.5.1 EPA Local Emergency Planning Committee
Database-XML
http://130.11.53.73/lepc/FMPro?-db=LEPC.FP5&-format=-fmp_xml&zip_lepc::zip_code=22181&-find
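The URL on this slide asks the shared database for records matching a zip code and requests the answer as XML. The Python sketch below assembles a request of that shape. The punctuation in the slide's URL was lost in this transcript, so the separators (=, &, ::) and the split of the search field into zip_lepc::zip_code are assumptions, and the function name is hypothetical.

```python
from urllib.parse import quote


def fmpro_find_url(host, path, database, criteria):
    """Assemble a find request in the style of the URL on this slide.

    The parameter layout (-db, -format, field=value, -find) is assumed.
    """
    parts = ["-db=" + quote(database), "-format=-fmp_xml"]
    for field, value in criteria.items():
        # Keep ':' unescaped so a related-field name stays readable.
        parts.append(quote(field, safe=":") + "=" + quote(str(value)))
    parts.append("-find")
    return "http://" + host + path + "?" + "&".join(parts)


url = fmpro_find_url("130.11.53.73", "/lepc/FMPro", "LEPC.FP5",
                     {"zip_lepc::zip_code": "22181"})
```

Changing the zip code in the criteria is all it takes to query a different community.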
472.5.1 EPA Local Emergency Planning Committee
Database-VoiceXML
http://130.11.53.73/brand.vxml
48VoiceXML Development Tools
http://studio.tellme.com/
49EPA VoiceXML Application
- Welcome to the E. P. A. Local Emergency Planning
Committee finder.
- Please speak or touch-tone your 5 digit Zipcode.
- 84040
- Here are results for the Zipcode 84040.
- The L. E. P. C. nearest to you is listed in the
E. P. A. database as follows. Davis County. At
Davis County Sheriff's Department located in the
city of Farmington.
- Thank You for calling, goodbye.
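A dialog like the one above could be written as a short VoiceXML document. The sketch below is an illustration in VoiceXML 2.0 style, held in a Python string so it can be checked for well-formedness. It is not the content of the EPA application; the form id and field name are hypothetical, and the database lookup is left out.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Illustrative dialog only; form id and field name are assumptions.
DIALOG = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="lepc_finder">
    <block>
      <prompt>Welcome to the E. P. A. Local Emergency Planning Committee finder.</prompt>
    </block>
    <field name="zipcode" type="digits?length=5">
      <prompt>Please speak or touch-tone your 5 digit Zipcode.</prompt>
      <filled>
        <prompt>Here are results for the Zipcode <value expr="zipcode"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def spoken_prompts(vxml_text):
    """Return the text of every prompt element, in document order."""
    root = ET.fromstring(vxml_text.encode("utf-8"))
    return ["".join(p.itertext()).strip()
            for p in root.iter("{%s}prompt" % VXML_NS)]


prompts = spoken_prompts(DIALOG)
```

The field collects five digits by voice or keypad, and the value element is where the caller's zip code is read back.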
502.5.2 USGS Photo Library Interface
512.5.2 USGS Photo Library Single Record
522.5.2 USGS Photo Library Multiple Records
532.5.2 USGS Photo Library Search Records
542.5.3 Digital Talking Books
- American governments must communicate with all of
us. To reach America's large, diverse population,
all government must stay at the forefront of
communication technology. Standards from the
alphabet to XML increase the efficiency and
effectiveness of information transfer. This year,
using a proposed new standard, the American
Foundation for the Blind and TimeWarner Talking
Books released an audio e-Book on CD, an excerpt
of which can be downloaded
(http://www.afb.org/talking_books.asp). You will
see the familiar words as text on screen or in
Braille, synchronized with the narrator's voice.
You can navigate forward and backward in the
speech using computer keystrokes. We have moved
from standardizing the alphabet to standardizing
book formats.
- Extending Digital Dividends: Public Goods and
Services that Work for All, September 2001, GSA
Office of Governmentwide Policy, 36 pp.
- http://www.gsa.gov/attachments/GSA_PUBLICATIONS/extpub/11-STurnbull_1.htm
552.5.3 Digital Talking Books
- Also called DAISY or NISO Books for the DAISY
(Digital Audio-based Information SYstem)
Consortium and National Information Standards
Organization.
- Well-organized collections of computer files
produced according to specifications published by
DAISY and NISO
- Medium-independent information access based on
open standards (W3C's XML and SMIL)
- SMIL: Synchronized Multimedia Integration Language.
- Three Principal Types of Players
- Computers, personal digital assistants (e.g.
BrailleNote), and specialized stand-alone
hardware players (Victor by VisuAide and Plextalk
by Plextor). Also Victor Trekker: a GPS for the
blind.
- American Foundation for the Blind, Special Issue
in AccessWorld
- http://www.afb.org/aw/AW0203toc.asp,
http://www.loc.gov/nis/niso, http://www.daisy.org
562.5.3 Digital Talking Books
http://www.visuaide.com/victorpro.html
572.5.3 Digital Talking Books
http://130.11.44.140/afb/Daisy2-VXML/index.html
58Install CD-ROM LP Player
59See LpPlayer Documentation
60See and Listen to the Digital Talking Book
613.1 Review of Part 1
- Repeat key steps and add more records
- Open FileMaker and open your database.
- Do Records, New Record
- Search Google, etc. and add two new records.
- Do File, Export Records, XML.
- Do Edit, Preferences, Application, Plug-Ins, Web
Companion, Configure, Port 591.
- Do File, Sharing, Multi-User, Web Companion.
- Launch browser and open http://localhost:591.
- New! Exchange IP addresses with one another (this
is the start of P2P!).
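Once addresses have been exchanged, each neighbor's shared database is reachable at its IP address on port 591. A tiny Python sketch of that bookkeeping follows; the function name is hypothetical, and the addresses are placeholders from the 192.0.2.0/24 documentation range, not real classroom machines.

```python
def peer_urls(ip_addresses, port=591):
    """Turn exchanged IP addresses into URLs for each peer's shared database."""
    return ["http://%s:%d" % (ip, port) for ip in ip_addresses]


# Placeholder addresses; substitute the ones your neighbors give you.
neighbors = peer_urls(["192.0.2.10", "192.0.2.11"])
```

Opening each of these URLs in a browser is the manual version of what Part 2 automates.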
623.2 Content Networks
- NextPage NXT 3 P2P Platform
- Esther Dyson's Release 1.0, 1/22/2002
- NextPage is unique in the content-management
market in its distributed approach
- NextPage's platform, NXT 3, virtually connects
the distributed information sources and makes
them appear integrated to the user. Unlike
syndication, in which content is copied and
integrated with other content locally, NextPage
keeps objects where they are.
- NextPage uses the standard Simple Object Access
Protocol (SOAP) to exchange and normalize
information between local content directories,
assembling meta-indexes so that users can search
or manipulate content transparently, regardless
of physical location.
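For readers who have not seen SOAP, the Python sketch below builds a minimal SOAP 1.1 request envelope. It shows only the general envelope-and-body shape. The operation name, parameter names, and service namespace are hypothetical, and nothing here reflects NextPage's actual message formats, which the slides do not give.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def soap_request(operation, params, service_ns="urn:example:content-search"):
    """Wrap an operation call and its parameters in a SOAP Envelope/Body.

    service_ns is a made-up namespace used only for this illustration.
    """
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    call = ET.SubElement(body, "{%s}%s" % (service_ns, operation))
    for name, value in params.items():
        ET.SubElement(call, "{%s}%s" % (service_ns, name)).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


message = soap_request("Search", {"query": "water quality", "maxResults": 10})
```

Because the request is ordinary XML, any node that understands the envelope can read it, wherever the content physically lives.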
633.2 Content Networks
http://www.sdi.gov
http://fedgov.nextpage.com/default.htm
643.2 Content Networks
- Tour of a distributed digital library
- Please select the Java Tab for easier navigation.
- We have the NXT 3 software platform installed on
several Web servers where the content originates
and is maintained so that it can be made to look
and function as though it is only on one server
by XML Web Services.
- We have to tell you which content is on different
servers because there is no way of telling by just
looking at the interface.
653.2 Content Networks
- Tour of a distributed digital library
(continued)
- It is generally said that content is 90%
unstructured and 10% structured (databases) and
that XML (eXtensible Markup Language) is the
solution to bringing structure to unstructured
content to produce a number of significant
benefits.
- Those benefits can be demonstrated when good
content is repurposed to make it more structured
and functional with XML.
663.2 Content Networks
- Tour of a distributed digital library
(continued)
- The first example is the Statistical Abstract of
the US where 40 Acrobat and 1500 Excel files have
been converted to an XML content collection that
is highly structured, accessible, and searchable.
- The second example is the CIA Country Profiles
that have been extensively marked up with XML so
that custom search queries can produce sortable
data tables even when no data tables exist in the
original document.
673.2 Content Networks
- Tour of a distributed digital library
(continued)
- Structured content (relational databases) can be
readily converted to XML in real-time using the
NXT 3 database adapters and presented as either
raw or styled XML as shown in the examples on
the site. Links between databases can be made, as
is demonstrated in the USA Counties databases
linking to the same county in the Bear Facts
database.
683.2 Content Networks
- Tour of a distributed digital library
(continued)
- Recall that digital libraries need to provide
content persistently available in digital form on
the Internet. NXT 3 does this by an intelligent
Web Services agent that will crawl, index in XML,
and archive the contents of entire Web sites.
- 8.5 years of the Chesapeake Journal Newspaper
online has been preserved by NXT 3 so it can be
searched separately or jointly along with any or
all other content nodes, including other remote
Web sites!
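Searching "separately or jointly" can be pictured with a toy Python example: each content node keeps its own documents, and a search runs over one node, several, or all of them. This is a plain in-memory sketch, not how NXT 3 is implemented; the node names and document texts are made up.

```python
def search_nodes(nodes, term, selected=None):
    """Search the chosen content nodes (all of them by default) and merge the hits."""
    term = term.lower()
    hits = []
    for name, documents in nodes.items():
        if selected is not None and name not in selected:
            continue  # this node was not chosen for the search
        for title, text in documents.items():
            if term in text.lower():
                hits.append((name, title))
    return sorted(hits)


# Hypothetical nodes: one archived newspaper and one remote Web site.
NODES = {
    "journal": {"Issue 1": "Restoring oyster reefs in the bay",
                "Issue 2": "Budget news"},
    "remote-site": {"Fact sheet": "Oyster harvest statistics"},
}

joint = search_nodes(NODES, "oyster")
separate = search_nodes(NODES, "oyster", selected={"journal"})
```

Each hit records which node it came from, since the user cannot otherwise tell where the content lives.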
693.2 Content Networks
- Tour of a distributed digital library
(continued)
- Local files on the Web server in their native
(proprietary) formats can be indexed in XML and
searched separately or jointly along with any or
all other content nodes.
- Major collections of content on other servers can
be made to look as though they are centralized on
one server, as is the case with Environmental Web
Services (see the Digital Library of the State of
the Environment).
- Major collections of content can be built/hosted
on one server and then moved to another server, as
is the case with the Housing and Urban Development
(HUD) Node.
703.2 Content Networks
- Tour of a distributed digital library
(continued)
- NXT 3 is being evaluated for its ability to
create an uber portal or portal over portals by
using it to index on a regular schedule several
of the major portals in the Federal government.
- The Federal Blue Pages Pilot is an example of
how NXT 3 could be used to deliver and update
distributed content that changes frequently
(phone numbers across government agencies) and
that needs to be disseminated on the telephone
using VoiceXML as well as the Web.
713.2 Content Networks
- Tour of a distributed digital library
(continued)
- Finally, the NextPage NXT 3 Documentation is
maintained by NextPage on their own server, but
looks as though it is an integral part of this
portal server.
- Distributed content networks can also be fed and
maintained by content providers just uploading
their content through a Web browser without their
needing to have a full-fledged Web server
themselves. This NXT 3 feature is called Managed
Content (with a Web browser).
723.2 Content Networks
- Tour of a distributed digital library
(continued)
- Custom query forms using XML have also been
developed to provide more customized or
personalized access to the individual content
nodes for both databases and structured
documents.
- Finally, links to more information about NextPage
End-to-End Solutions have been provided (see next
slide).
733.2 Content Networks
http://www.nextpage.com
743.3 Building a NLA Digital Library Content
Network
- Brainstorming session
- Would you like to learn more about XML Web
Services?
- Would you like to author or repurpose some
content in XML?
- Would you like to do a Digital Talking Book?
- Would you like to learn more about FileMaker?
- Would you like to learn more about NextPage NXT 3?
753.3 Building a NLA Digital Library Content
Network
- FedWeb 2002 Fall, October 28-29, 2002, George
Mason University, 3401 North Fairfax Drive,
Arlington, VA (Virginia Square Metro)
- Turning Web Sites into Web Services: Solutions
for Government
- October 28th Tutorial
- Hands-On Training: XML Part I and II (Westlake)
- October 29th Program
- Web Business Management (W3C Web Services
Activity, etc.)
- Technical Foundations
- Emerging Technologies and Trends (XML, XML Web
Services, E-Forms and PKI, etc.)
- Content Management (XML)
- See http://www.fedweb.org
763.3 Building a NLA Digital Library Content
Network
- Recommended Resources
- XML Step by Step, Second Edition, Michael J.
Young, Microsoft Press, 2002, 488 pp.
- CD-ROM with 69 source files.
- Microsoft-centric (Internet Explorer 5)
- See my Unit 3 for notes and use of XML Spy 4
- XML By Example, Second Edition, Benoit Marchal,
Que, 2002, 495 pp.
- Source files downloadable from Web site.
- Lots of interesting and useful examples.
- See my Unit 13 notes and slides.
773.3 Building a NLA Digital Library Content
Network
- American Foundation for the Blind, Special Issue
in AccessWorld
- http://www.afb.org/aw/AW0203toc.asp
793.3 Building a NLA Digital Library Content
Network
http://www.nextpage.com/document.asp?section=Services&path=Services/education/online%20training/archive.xml
803.4 Questions and Answers
- Your turn to ask me some really hard questions.
- Thank you for your kind attention and goodbye for
now!