Title: Standards For Hybrid Libraries: Web Standards
1 Standards For Hybrid Libraries: Web Standards
- Brian Kelly UK Web Focus
- UKOLN
- University of Bath
- Bath, BA2 7AY
B.Kelly_at_ukoln.ac.uk http://www.ukoln.ac.uk/
UKOLN is funded by the Library and Information
Commission, the Joint Information Systems
Committee (JISC) of the Higher Education Funding
Councils, as well as by project funding from the
JISC and the European Union. UKOLN also
receives support from the University of Bath
where it is based.
2 Contents
- Introduction
- Background To The Web Architecture
- Addressing
- Data Format
- Transfer
- Metadata
- Conclusions
3 Standardisation
Z39.50, PNG, HTML, Java?
- Community
- Library groups
- Cultural Heritage
- Government
- Proprietary
- De facto standards
- Often initially appealing (cf. GIF, PowerPoint, PDF)
- May emerge as standards
- Formal
- Formal international / national standards processes
- ISO, CEN, NISO, ECMA, ANSI, BSI
- Can be slow-moving and bureaucratic
- Produce robust standards
- W3C
- Produces W3C Recommendations
- Managed approach
- Protocols initially developed by W3C members
- Decisions made by W3C, influenced by member and public review
Relevant Bodies
- IETF
- Produces Internet Drafts on Internet protocols
- Bottom-up approach to developments
- Protocols developed by interested individuals
- "Rough consensus and working code"
HTTP, URN, whois
PNG, HTML, HTTP
4 Background to the Web
- The web was initially very successful due to its
simplicity
[Figure: a client (Mosaic, Netscape, IE) sends the request "Give me foo.html from www.bath.ac.uk" to a server (CERN, Apache, IIS); the server replies "Here it is" and returns the HTML.]
The web is based on three key architectural components:
- Data Format: HTML (HyperText Markup Language)
- Addressing: URLs (Uniform Resource Locators)
- Transport: HTTP (Hypertext Transfer Protocol)
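The exchange between client and server can be sketched in a few lines. This is a minimal illustration, not a full client: it only shows how the URL (addressing) is split into a host and a path, and how those become the text of an HTTP/1.0 request (transport). The function name `build_request` is an assumption made for this example.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Turn a URL into the text of a minimal HTTP/1.0 GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the server's root document
    return (
        f"GET {path} HTTP/1.0\r\n"
        # The Host header is optional in HTTP/1.0 but commonly sent.
        f"Host: {parts.hostname}\r\n"
        "\r\n"  # a blank line ends the request headers
    )


# "Give me foo.html from www.bath.ac.uk"
print(build_request("http://www.bath.ac.uk/foo.html"))
```

The server's "Here it is" reply would be a status line, headers and the HTML document itself (the data format).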
5 Problems With the Web
- Although the web has been successful, there are problems:
- Performance - the web is too slow
- Resource discovery - lack of a metadata architecture
- HTML's lack of arbitrary structure
- Accessibility - difficulties in accessing information for visually impaired people, people using PDAs, etc.
- Functionality - difficult to deploy new applications on the web
- Addressing
- etc.
6 Solutions (Today)
- HTML 4.0 used in conjunction with CSS 2.0
(Cascading Style Sheets) and the DOM provides an
architecturally pure, yet functionally rich
environment
- HTML 4.0 - W3C-Rec
- Improved forms
- Hooks for stylesheets
- Hooks for scripting languages
- Table enhancements
- Better printing
- CSS 2.0 - W3C-Rec
- Support for all HTML formatting
- Positioning of HTML elements
- Multiple media support
- DOM - W3C-Rec
- Document Object Model
- Hooks for scripting languages
- Permits changes to HTML and CSS properties and content
- Problems
- Changes during CSS development
- Netscape and IE incompatibilities
- Continued use of browsers with known bugs
7 HTML's Limitations
- HTML 4.0 / CSS 2.0 have limitations
- Difficulties in introducing new elements
- Time-consuming standardisation process (<ABBREV>)
- Dictated by browser vendor (<BLINK>, <MARQUEE>)
- Area may be inappropriate for standardisation
- Covers specialist area (maths, music, ...)
- Application-specific (<STUD-NUM>)
- HTML is a display (output) format
- HTML's lack of arbitrary structure limits functionality
- Find all memos copied to John Smith
- How many unique tracks on Jackson Browne CDs
8 XML
- XML
- Extensible Markup Language
- A lightweight SGML designed for network use
- Addresses HTML's lack of evolvability
- Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc.)
- Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
- Forms the basis of B2B applications
- Support from industry (SGML vendors, Microsoft, etc.)
- Support in Netscape 5 and IE 5
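As a rough illustration of what arbitrary elements make possible, the sketch below answers the kind of question HTML cannot ("find all memos copied to John Smith"). The memo markup, its element names (`memo`, `to`, `cc`, `subject`) and the sample data are all hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup; the element names are illustrative only.
MEMOS = """
<memos>
  <memo><to>Jane Doe</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo><to>John Smith</to><subject>Rota</subject></memo>
  <memo><to>Ann Lee</to><cc>John Smith</cc><cc>Jane Doe</cc><subject>Web</subject></memo>
</memos>
"""


def memos_copied_to(xml_text, name):
    """Return the subjects of all memos that list `name` in a <cc> element."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("subject")
        for memo in root.iter("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


print(memos_copied_to(MEMOS, "John Smith"))  # ['Budget', 'Web']
```

The memo addressed to John Smith is excluded because the structure distinguishes a recipient from a copy; in HTML both would simply be displayed text.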
9 XML Deployment
- Ariadne issue 15 has an article on "What Is XML?"
- Describes how XML support can be provided
- Natively by new browsers
- Back end conversion of XML to HTML
- Client-side conversion of XML to HTML / CSS
- Java rendering of XML
- Examples of intermediaries
See http://www.ariadne.ac.uk/issue15/what-is/
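Back end conversion can be pictured as a server-side step that reads the XML and emits HTML for browsers without native XML support. The following is a minimal sketch under that assumption; the parts catalogue, its element names and the function `parts_to_html` are hypothetical and are not taken from the Ariadne article.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical catalogue data; element names are illustrative only.
PARTS = """
<parts>
  <part><number>A-17</number><name>Widget &amp; Co bracket</name></part>
  <part><number>B-2</number><name>Sprocket</name></part>
</parts>
"""


def parts_to_html(xml_text):
    """Convert the structured XML into an HTML list for display."""
    root = ET.fromstring(xml_text)
    items = [
        # escape() re-encodes characters such as & that the XML parser decoded
        f"<li>{escape(part.findtext('number', ''))}: "
        f"{escape(part.findtext('name', ''))}</li>"
        for part in root.findall("part")
    ]
    return "<ul>" + "".join(items) + "</ul>"


print(parts_to_html(PARTS))
```

The structure stays in the XML source; the HTML is only the display format generated from it.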
10 XHTML
- XHTML
- an XML representation of HTML
- Issues
- Documents must be well-formed
- Tags in lowercase
- Quote attributes: <img src="foo" height="10">
- <li>End tags required</li>
- Empty elements: <img src="foo" /> <br />
- Tidy utility: see <http://www.w3.org/People/Raggett/tidy/>
- See <URL: http://www.w3.org/TR/WD-html-in-xml/>
Question: Is it time to produce XHTML documents?
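The well-formedness rules listed above can be checked with any XML parser. The sketch below is one way to do so, assuming Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the two sample fragments are made up for this example. It tests well-formedness only (quoted attributes, end tags, closed empty elements), not the lowercase-tag rule, which a DTD-aware validator would be needed for.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical HTML habits: unquoted attributes, missing end tags, unclosed <br>.
HTML_STYLE = "<div><img src=foo height=10><br><ul><li>End tags required</ul></div>"

# The same content following the XHTML rules.
XHTML_STYLE = (
    '<div><img src="foo" height="10" /><br />'
    "<ul><li>End tags required</li></ul></div>"
)

print(is_well_formed(HTML_STYLE))   # False
print(is_well_formed(XHTML_STYLE))  # True
```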
11 Namespaces and Linking
- XML Namespaces
- What if an XML document contains a <TITLE> for the document and a <TITLE> for the name of a book?
- XML Namespaces enable such clashes to be resolved
- The naming conventions are defined at a URL
- XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents or create metadata from structured data)
- XLink and XPointer should provide richer hyperlinking mechanisms in the future
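The <TITLE> clash can be illustrated with a small document in which each <TITLE> belongs to a different namespace. The namespace URLs under example.org, the prefixes and the sample content are all hypothetical; they stand in for whatever URLs the communities concerned would actually define.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: one for document structure, one for book data.
DOC = """
<doc:report xmlns:doc="http://example.org/ns/document"
            xmlns:book="http://example.org/ns/book">
  <doc:TITLE>Reading List</doc:TITLE>
  <book:item><book:TITLE>An Example Book</book:TITLE></book:item>
</doc:report>
"""

# Prefix-to-URL mapping used when querying; the prefixes are local shorthand.
NS = {
    "doc": "http://example.org/ns/document",
    "book": "http://example.org/ns/book",
}

root = ET.fromstring(DOC)

# The two TITLE elements no longer clash: each is qualified by its namespace.
document_title = root.findtext("doc:TITLE", namespaces=NS)
book_titles = [t.text for t in root.iterfind(".//book:TITLE", NS)]

print(document_title)  # Reading List
print(book_titles)     # ['An Example Book']
```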
12 Addressing (Problems)
- URLs (e.g. http://www.bris-poly.ac.uk/depts/music/) have limitations
- Lack of long-term persistency
- Organisation changes name
- Department shut down or merged
- Directory structure reorganised
- Inability to support multiple versions of resources (mirroring)
- ISBN/ISSN also problematic
- Not tied to the work
- Nor to the item at hand
13 Addressing (Solutions)
- PURLs (Persistent URLs)
- Provide single level of redirection
- DOIs (Digital Object Identifiers)
- Proposed by publishing industry as a solution
- Aimed at supporting rights ownership
- Business model needed
- Do two copies of a digital object get separate
DOIs?
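The single level of redirection that PURLs provide can be sketched as a lookup table: the published identifier stays fixed while the target it points at is updated. This is an illustrative model only, not the behaviour of any real PURL server; the class `PurlResolver`, the path `/net/music` and the replacement address under example.ac.uk are hypothetical.

```python
class PurlResolver:
    """Toy model of a resolver that maps persistent paths to current URLs."""

    def __init__(self):
        self._targets = {}

    def register(self, purl_path, target_url):
        """Create or update the current location for a persistent path."""
        self._targets[purl_path] = target_url

    def resolve(self, purl_path):
        """Return (status, location), as an HTTP redirect response would."""
        target = self._targets.get(purl_path)
        if target is None:
            return 404, None
        return 302, target


resolver = PurlResolver()
resolver.register("/net/music", "http://www.bris-poly.ac.uk/depts/music/")
print(resolver.resolve("/net/music"))

# The organisation changes name: only the table entry changes,
# while links that cite /net/music keep working.
resolver.register("/net/music", "http://www.example.ac.uk/music/")
print(resolver.resolve("/net/music"))
```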
14 Transport
- HTTP/0.9 and HTTP/1.0
- Design flaws and implementation problems
- HTTP/1.1
- Addresses some of these problems
- 60% server support
- Performance benefits! (60% packet traffic reduction)
- Is acting as fire-fighter
- Not sufficiently flexible or extensible
- HTTP/NG
- Radical redesign using object-oriented technologies
- Undergoing trials
- Gradual transition (using proxies)
- Integration of application (distributed searching?)
15 Metadata
- Metadata - the missing architectural component
from the initial implementation of the web
- Metadata Needs
- Resource discovery
- Content filtering
- Authentication
- Improved navigation
- Multiple format support
- Rights management
16 RDF
RDF Data Model
- RDF - the metadata framework
- Based on a formal data model (directed labelled graphs)
- Syntax for interchange of data
- Schema model
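A directed labelled graph can be written down as a list of (subject, property, value) statements, each one an arrow from a resource to a value, labelled with the property name. The sketch below models that idea with plain tuples rather than any RDF library or the RDF interchange syntax; the example.org resources and the property names `creator`, `title` and `name` are assumptions chosen for illustration.

```python
from collections import defaultdict

# Each statement is one labelled edge: subject --property--> value.
# All resources and property names here are hypothetical.
TRIPLES = [
    ("http://example.org/report", "creator", "http://example.org/staff/1"),
    ("http://example.org/report", "title", "Annual Report"),
    ("http://example.org/staff/1", "name", "A. N. Author"),
]


def describe(triples, subject):
    """Collect the outgoing labelled edges of one node in the graph."""
    properties = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            properties[p].append(o)
    return dict(properties)


print(describe(TRIPLES, "http://example.org/report"))
```

Because the value of `creator` is itself a resource, the graph can be followed further to reach that resource's own properties.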
17 Conclusions
- To conclude
- Standards are important, especially for national initiatives and other large-scale services
- Proprietary solutions are often tempting because
- They are available
- They are often well-marketed and well-supported
- They may become standardised
- Solutions based on standards may not be properly supported by applications
- Metadata and structured data formats are big growth areas
- Deployment of new standards is an important question