Title: File Access Services, Directory Services, The World Wide Web (WWW)
2File Access Services
- Definition of File Access Services
- The ability to transparently access information stored in files on other systems across a network
- File Access vs. File Transfer
- File Transfer Characteristics
- File transfer makes independent copies of files on different hosts
- Copies the entire file; access is generally not available until the copy has completed
- File access methods and privileges are not the same for local and remote resources (so it is not transparent)
- Allows for storage of intermediate results, distributed management control, and data protection
3File Access Services
- File Access Characteristics
- User access is only to a selected portion of a file; the complete master copy of the file is stored on a single system
- Images of the file are not independent; there is only one real copy, stored on a single system
- File access methods and permissions are typically the same for local and remote resources
- Typically provides quick, up-to-date access to files and prevents different versions of a file from getting out of synchronization
- Multi-user access to files must be carefully managed!
4Examples of Major File Access Services
- The Network File System (NFS)
- The goal of NFS is to provide transparent file access for clients to files and filesystems on NFS servers
- From both a user and a technical perspective, it uses a client-server architecture
- Developed by Sun Microsystems to allow filesystems to be distributed across multiple UNIX/Linux systems
- Current specification (version 4) is in RFC 3530; there are still a number of version 3 implementations (RFC 1813)
- NFS clients are available for Windows and Mac operating systems
- Theoretically, any application on the client that works with local file access should also work with NFS file access
- Technical details of NFS will be examined later
5Examples of Major File Access Services
- Microsoft File Services
- File access and sharing has been around as a separate product for a decade (LAN Manager); support for networked file access is now an integral part of Windows 95/98/NT and later versions
- Originally proprietary, but Microsoft has transitioned to TCP/IP (actually UDP) and published the interface specifications in RFC 1001 and RFC 1002
- Major components of Microsoft file access services:
- The Server Message Block (SMB) API
- The NetBIOS API
- Browser service
- The Universal Naming Convention (UNC)
6Examples of Major File Access Services
- Microsoft File Services (Continued)
- The client typically maps a networked file or filesystem by binding the resource (via the UNC) to a local drive letter; at that point access is the same as for a local resource
- Resources can also be accessed directly with the UNC
- Clients are available for Windows, OS/2, Mac, and many desktop UNIX/Linux systems (via Samba, etc.)
- Microsoft has introduced a new component called DFS (Distributed File System)
- Provides a unified virtual namespace that makes distributed resources look like local resources
- Provides unified access across multiple real file services like SMB, NetWare, and NFS
- Still proprietary to Microsoft
7Examples of Major File Access Services
- Apple (AppleShare and the AppleTalk Filing Protocol)
- Created as part of a proprietary network architecture for Macs
- Created for Apple file sharing, either client-server or peer-to-peer
- Current versions run on TCP/IP
- Novell NetWare
- Created to support Novell file servers and clients
- Proprietary client-server architecture, though Novell does do third-party licensing
- Access to remote resources via drive mapping, where a remote resource is bound to a local drive letter
- Current versions run on TCP/IP
- Clients for DOS, Win3.1, Win95/98, NT, OS/2, and Mac
8The Remote Procedure Call (RPC)
- The foundation of NFS: the Remote Procedure Call (RPC)
- NFS is based on the Remote Procedure Call, a generic API that allows procedures called on one system to execute on another
- If a client is RPC-aware, nothing needs to be done by a user to access resources via NFS
- There are two different RPC varieties: Sun RPC and the OSF Distributed Computing Environment (DCE)
- Sun RPC specifications published in RFC 1057; version 2 open specifications published in RFC 1831
- Sun RPC, the foundation of most NFS implementations, uses the Sockets API to access TCP and UDP transport services
- The RPC specifications consist of two main pieces:
- The transfer protocol (message format and exchange)
- Data representation and encoding
9The Remote Procedure Call (RPC)
- General Diagram of RPC Operation
10The Remote Procedure Call (RPC)
- RPC Message format
- Two messages are used: a Request and a Reply
- Fields in an RPC Request
- Transaction ID (XID) field, 4 bytes: initialized by the client with a unique number used to match Replies with Requests
- Call field, 4 bytes: set to zero for a Request, one for a Reply
- RPC version field, 4 bytes: specifies what version of the RPC protocol is in use; this is two for Sun RPC version 2 (RFC 1057 and RFC 1831)
- Program number, version, and procedure fields, 4 bytes each: used by the server to determine what specific procedure the client wants to access remotely
11The Remote Procedure Call (RPC)
- RPC Message Format (Continued)
- Credentials field, up to 408 bytes
- Used to identify the client to the server
- Use is optional; typically the User ID and Group ID of the client process are included
- Can be used by the server to enforce access restrictions on the RPC
- Actual length of the field is encoded at the beginning of the field value
- Verifier field, up to 408 bytes: used with secure RPC to pass encryption-related information
- Parameters field, variable length: holds all parameter information necessary for the procedure call to be executed by the server
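The request layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses empty (null) credentials and verifier rather than the User ID / Group ID form, the helper names `build_rpc_call` and `parse_rpc_call` are hypothetical, and the program number 100003 is the one commonly assigned to NFS.

```python
import struct

CALL, REPLY = 0, 1   # values of the 4-byte call field
RPC_VERSION = 2      # Sun RPC version 2
AUTH_NONE = 0        # assumption: null credentials, zero-length body


def build_rpc_call(xid, program, version, procedure, params=b""):
    """Pack an RPC request: six 4-byte fields, credentials, verifier, parameters."""
    header = struct.pack(">6I", xid, CALL, RPC_VERSION, program, version, procedure)
    credentials = struct.pack(">II", AUTH_NONE, 0)  # flavor, then length of the body
    verifier = struct.pack(">II", AUTH_NONE, 0)
    return header + credentials + verifier + params


def parse_rpc_call(message):
    """Unpack the six fixed 4-byte fields at the start of a request."""
    xid, msg_type, rpc_version, program, version, procedure = struct.unpack(
        ">6I", message[:24]
    )
    return {
        "xid": xid,
        "msg_type": msg_type,
        "rpc_version": rpc_version,
        "program": program,
        "version": version,
        "procedure": procedure,
    }


# Example: a request for procedure 1 of program 100003, version 3.
request = build_rpc_call(xid=0x1234, program=100003, version=3, procedure=1)
fields = parse_rpc_call(request)
```

The server would echo the same XID in its Reply so the client can pair the two messages.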
12The Remote Procedure Call (RPC)
- RPC Data Representation
- RPC uses a special encoding system called External Data Representation (XDR) for encoding the data values in RPC fields
- Allows heterogeneous systems to communicate using a common data representation
- Field structure, including bit and byte order, is specified for each field in the RPC messages
- As an example, in XDR all integers are encoded as four-byte values
- XDR specifications published in RFC 4506
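As a rough illustration of the XDR rules mentioned above, the following sketch encodes an integer and a string. It covers only these two types; the function names are hypothetical, and the string rule shown (4-byte length, data, zero padding to a multiple of four bytes) is the XDR convention for variable-length data.

```python
import struct


def xdr_int(value):
    """Encode a signed integer as a 4-byte big-endian value."""
    return struct.pack(">i", value)


def xdr_string(text):
    """Encode a string: 4-byte length, the bytes, then zero padding to a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


encoded_one = xdr_int(1)
encoded_name = xdr_string("abcde")
```

Because every item occupies a multiple of four bytes, a receiver on any architecture can walk the message without knowing the sender's native byte order or alignment.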
13The Remote Procedure Call (RPC)
- The RPC Port Mapper (411 for RPC)
- RPC server programs use ephemeral ports instead of well-known ports, so some kind of registrar is necessary for client stub processes to determine how to communicate with server processes
- The port mapper is an RPC server process that all other RPC server processes on that particular server register with when they initialize
- The port mapper listens for client calls on well-known port 111 (both UDP and TCP)
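The registrar idea can be sketched as a small in-memory table. This is a toy model, not the real port mapper protocol: there is no networking, the class and method names are hypothetical, and the program number and port in the example are illustrative values only.

```python
PORTMAPPER_PORT = 111  # the one well-known port clients need to know in advance


class ToyPortMapper:
    """In-memory stand-in for the registrar role played by the port mapper."""

    def __init__(self):
        self._table = {}

    def register(self, program, version, protocol, port):
        """Called by an RPC server process when it initializes."""
        self._table[(program, version, protocol)] = port

    def get_port(self, program, version, protocol):
        """Called on behalf of a client; 0 means the program is not registered."""
        return self._table.get((program, version, protocol), 0)


mapper = ToyPortMapper()
mapper.register(program=100003, version=3, protocol="udp", port=2049)
nfs_port = mapper.get_port(100003, 3, "udp")
missing_port = mapper.get_port(100099, 1, "tcp")
```

A client would first ask the mapper (on port 111) for the port, then send its actual RPC request to the port returned.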
14The Technical Details of NFS
- General NFS Technical information
- NFS uses a subset of defined RPCs to provide distributed file access (RPC can do much more than that)
- Files and filesystems are referenced through the use of file handles: objects or structures used to represent the files or filesystems
- NFS servers are stateless (version 3 and earlier): servers do not keep track of which clients are accessing their resources or which resources are being accessed
- Version 4 introduces file locking, which introduces stateful operation and requires more complex operational procedures
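To make the stateless model concrete, here is a minimal sketch in which every read request carries everything the server needs (file handle, offset, count), so the server remembers nothing between calls. The class, the handle value, and the file contents are all hypothetical; a real file handle is an opaque value issued by the server.

```python
class ToyStatelessServer:
    """Serves reads keyed by file handle; keeps no per-client state."""

    def __init__(self, files):
        self._files = dict(files)  # file handle -> file contents

    def read(self, handle, offset, count):
        """Each request is self-describing: which file, where, and how much."""
        data = self._files[handle]
        return data[offset:offset + count]


def read_whole_file(server, handle, chunk=4):
    """The client, not the server, tracks the current position in the file."""
    offset = 0
    parts = []
    while True:
        block = server.read(handle, offset, chunk)
        if not block:
            break
        parts.append(block)
        offset += len(block)
    return b"".join(parts)


server = ToyStatelessServer({b"fh-1": b"testplan contents"})
contents = read_whole_file(server, b"fh-1")
```

Because no request depends on an earlier one, the server could restart between any two reads and the client would simply retry.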
15The Technical Details of NFS
- An Example of NFS Operation
16The Technical Details of NFS
- NFS Procedures
- NFS Version 4 has 39 procedures for file and directory manipulation and access, as well as an extensible security framework
- Built on RPCSEC_GSS (RFC 2203)
- Provides integrity, privacy, and authentication
- Allows for negotiation of security options
- NFS Version 3 has 22 procedures (numbered 0 through 21 in RFC 1813) for file/directory operations
- Important NFS procedures
- OPEN, CLOSE: compound commands for file manipulation (version 4)
- ACCESS: check file access permissions
- READ, WRITE: read from and write to files; options for either synchronous or asynchronous writes
- GETATTR, SETATTR: get and set attributes on a file
- MKDIR, RMDIR: make directory, delete directory (version 3; version 4 uses CREATE and REMOVE)
17The Technical Details of NFS
- Mounting Drives
- In order for a client to access remote files via NFS it must use the NFS mount procedure
- The mount procedure returns a handle which the client uses to reference the filesystem
- The client then uses the handle to integrate the remote filesystem into its local filesystem structure (analogous to mapping a networked drive)
- As with other NFS procedures, the port mapper must be called first to find the UDP or TCP port being used on the NFS server
- Note: this procedure works differently under the hood in v4, as no port mapper is used and the mount protocol has been incorporated into the core NFS specification
18The Technical Details of NFS
- NFS over TCP
- Originally NFS was strictly a LAN technology, rarely implemented over wide area connections, so UDP was used to provide efficient throughput
- With the globalization of companies, NFS is being used more often over wide geographic areas
- Using TCP provides NFS with more robust implementations over WAN connections
- The default transport for NFS version 4 is TCP
19The Technical Details of NFS
- An NFS Example Reading the file testplan.txt
from mountpoint/smiley/testplan.exe
20Directory Services
21Directory Services
- Definition
- A means of searching a database of information easily and rapidly, using keywords to match attributes stored in the database
- With the explosion of networking and the Internet, very important for finding people, information, and services
- Directory services must:
- Be easy to use
- Contain correct data
- Provide value
- Directory services are categorized as White or Yellow Pages
- White Pages: Directory Services oriented toward information associated with individuals
- Yellow Pages: Directory Services oriented toward finding resources (networked printers, servers, etc.) or services
22Directory Service Protocols
- There are 4 Major Directory Service Protocols used in the Internet:
- DNS,
- NIS,
- X.500, and
- LDAP
- And yes, NDS, WINS, and AD could be included too
- The Domain Name System (DNS)
- Yes, this can be considered a very limited Directory Service!
- Can be adapted for other uses such as spam blacklists
- Provides only name resolution services, but some information can be gleaned from the name structure
- Has almost global accessibility
- Already covered earlier in exhaustive detail
23Directory Service Protocols
- The Network Information Service (NIS)
- NIS was developed by Sun Microsystems as a companion to NFS to ease the burden on administrators of distributed computing systems
- Uses a client-server architecture for distribution of Yellow Pages information
- NIS allows the use of a single administrative repository for important network-wide information such as password files
- Enables the use of a single username and password across multiple systems
- Also includes information search and retrieval commands that can be used in the development of non-administrative distributed directories and databases
- Sun is trying to migrate from NIS to the LDAP-enabled Sun Directory Services (SDS); there's a large installed base of users, so this will likely take a while
24Directory Service Protocols
- Technical Details of NIS
- Like NFS, NIS is built upon the Remote Procedure Call (RPC), and NIS procedures operate very much like NFS procedures
- Built around map files, which are the data or directory information repositories, and maps, which are unique views or indexes for the data files
- Map files are stored in individual directories on the server (example: the password file is typically stored on a server at /var/yp/passwd)
- Servers come in two varieties: master and slave
- The master server is the single administration point for the map files, analogous to a primary DNS server
- There is only one master NIS server for a set of map files
- Slaves read all their map file information from the Master
- Clients can be serviced equally well by either a Master or a Slave
25Directory Service Protocols
- Technical Details of NIS (continued)
- Organizations that want multiple access policies for directory or distributed database information can segregate NIS servers into Domains
- A domain is defined as a set of servers that access and distribute information from a unique set of maps
- Each domain has a single Master server and as many slaves as necessary to adequately service client requests
- Domains can overlap (an NIS server could be a Master for multiple domains) and clients can easily switch between domains
- NIS Database Operations come in two flavors: distributed filesystems and explicit commands
- With the distributed filesystem mode of operation, NIS clients read data for administrative functions from the NIS server instead of a local file
- This mode is used for such functions as password or license files; it allows a single file administered from the Master to be used across multiple NIS client systems
26Directory Service Protocols
- Technical Details of NIS (continued)
- NIS clients can also be set up so they will check their local file first and, if a match is not found, do an NIS lookup
- NIS services (maps) may also be accessed using explicit commands
- This is very useful for building distributed database or directory applications that have nothing to do with system administration (e.g., a systemwide personnel directory)
- The most important NIS user command is ypmatch; this allows the search of a map using a keyword
- Three groups of internal NIS procedure calls perform the work for either mode of operation:
- Client lookups: key-driven procedures (match, get-first, get-next, get-all)
- Maintenance calls: checking server names and status (get-master, get-order)
- Internal NIS calls: commands between servers, such as a map transfer from Master to Slave
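The "local file first, then NIS" behaviour and a keyword lookup in the spirit of ypmatch can be modelled with plain dictionaries. This is a sketch only: no RPC is involved, the function names and sample entries are hypothetical, and `passwd.byname` is used here as an illustrative map name.

```python
def ypmatch(key, map_name, maps):
    """Look up a key in a named map; None when there is no match."""
    return maps.get(map_name, {}).get(key)


def lookup_user(name, local_passwd, nis_maps):
    """Check the local file first; fall back to the NIS map if nothing is found."""
    entry = local_passwd.get(name)
    if entry is not None:
        return entry, "local"
    entry = ypmatch(name, "passwd.byname", nis_maps)
    if entry is not None:
        return entry, "nis"
    return None, None


local_passwd = {"root": "root:x:0:0::/root:/bin/sh"}
nis_maps = {"passwd.byname": {"alice": "alice:x:1001:100::/home/alice:/bin/sh"}}

root_result = lookup_user("root", local_passwd, nis_maps)
alice_result = lookup_user("alice", local_passwd, nis_maps)
bob_result = lookup_user("bob", local_passwd, nis_maps)
```

The same `ypmatch`-style lookup works for maps that have nothing to do with administration, such as a personnel directory keyed by surname.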
27Directory Service Protocols
- The X.500 Directory Service (aka DAP)
- ITU standard for storing, accessing, and distributing directory information
- What X.500 provides:
- A standards-based directory
- A structured information framework
- A single global namespace
- Powerful search capabilities
- Decentralized maintenance
- Encompasses seven recommendations (X.501, X.509, X.511, X.518, X.519, X.520, and X.521) besides X.500; the standards cover:
- Directory Service Quality: access rules, authentication, filters, etc.
- Directory Queries: read, compare, list, search operations, etc.
- Directory Modification: add, rename, modify operations, etc.
- Error Reporting: definition of error conditions and responses
- Referrals: relationships between Directory servers and other external objects
- Originally designed for OSI protocols but later modified for TCP/IP
28Directory Service Protocols
- X.500 Technology
- X.500 Directory Structure and Terminology
- Information is held in a Directory Information Base (DIB)
- A DIB consists of individual entries organized in a tree structure called the Directory Information Tree (DIT)
- Can also have relational characteristics
- Each entry is composed of attributes and has a Distinguished Name (DN) that uniquely identifies it (user-friendly Distinguished Names are very important to X.500)
- Each attribute is composed of a type and one or more associated values
- The type specifies a particular syntax and data type for the value (boolean, integer, etc.)
- DIT entries have some object-oriented characteristics like inheritance
- The DIT also has a schema; in X.500 the schema is a rule-set that ensures the DIT maintains its logical structure during modifications
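The entry, attribute, and Distinguished Name terminology above can be illustrated with a small data structure. This is a simplified sketch: the comma-separated `type=value` string form of the DN is an assumption borrowed from common practice, escaping rules are ignored, and all names and sample entries are hypothetical.

```python
def parse_dn(dn):
    """Split a DN into (attribute type, value) pairs, most specific first."""
    pairs = []
    for part in dn.split(","):
        attr_type, _, value = part.strip().partition("=")
        pairs.append((attr_type.lower(), value))
    return pairs


def parent_dn(dn):
    """The parent entry in the tree is named by dropping the leftmost component."""
    return ",".join(part.strip() for part in dn.split(",")[1:])


# A tiny DIB: each entry is keyed by its DN and holds attributes,
# and each attribute type maps to one or more values.
dib = {
    "o=Example": {"o": ["Example"]},
    "ou=Engineering,o=Example": {"ou": ["Engineering"]},
    "cn=Jane Doe,ou=Engineering,o=Example": {
        "cn": ["Jane Doe"],
        "mail": ["jane@example.com", "jdoe@example.com"],
    },
}

components = parse_dn("cn=Jane Doe, ou=Engineering, o=Example")
parent = parent_dn("cn=Jane Doe, ou=Engineering, o=Example")
```

Walking `parent_dn` repeatedly from any entry leads back to the root, which is what makes the DIB a tree.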
29Directory Service Protocols
30Directory Service Protocols
- X.500 Technology
- The X.500 communication protocols
- An X.500 client (Directory User Agent or DUA) communicates with an X.500 server (Directory Server Agent or DSA) using the Directory Access Protocol (DAP)
- The X.500 DAP uses the standard OSI presentation layer Remote Operations Service Element (ROSE) and Association Control Service Element (ACSE) for communications
- RFC 1249 defines a standard for interfacing the X.500 service to the UDP and TCP transport layers for use over the Internet
- OSI networking is ridiculously complex and X.500 over TCP/IP isn't much better, so that's all we're going to cover on this subject
31Directory Service Protocols
- The Lightweight Directory Access Protocol (LDAP)
- Even with the adaptation of X.500 to TCP/IP, very few organizations actually adopted it
- Required too much computing and OSI engineering expertise
- Most implementations were kludgey and complex
- Engineers and researchers at the University of Michigan decided a completely new protocol for accessing X.500 services from Internet clients would hasten deployment of global directory services
- The Lightweight Directory Access Protocol (LDAP) was originally designed to be a gateway service; an X.500 back-end was still necessary
- Eventually the designers saw the value of LDAP as a completely independent directory services access protocol
- LDAP version 3 specifications published as RFC 2251 (protocol) with companion documents including RFC 2252
- Recent update in RFC 4510 provides new features and protocol extensibility
32Directory Service Protocols
- The Lightweight Directory Access Protocol (LDAP)
- What the current version of LDAP provides:
- Centralized administrative tasks within an organization
- Storage of sensitive information in a central, secure repository
- Authentication and security services
- Centralized procedure for determining the status and role of an individual in an organization
- Standardizes the search and retrieval of white and yellow page data
- Multi-platform, multi-vendor open standard with published APIs for software development
- URL format specifications for accessing LDAP data via browsers
- Gateway services to other directory services like NIS and NDS
- Vendors with current LDAP-enabled products: Red Hat, Innosoft, OpenLDAP, Sun, Microsoft, and Novell (list not exhaustive)
33Directory Service Protocols
- The Lightweight Directory Access Protocol (LDAP)
- LDAP Technology
- The LDAP standard addresses 3 areas: API, data format, and access protocol
- There are several APIs published and in use
- The base API developed by the U. of Michigan is published as RFC 1823
- Other C and Perl based APIs are readily available
- LDAP data format
- LDAP has the same object format and definitions as X.500
- Directory Information Base (DIB) and Directory Information Tree (DIT)
- Entries and Attributes
- LDAP DIB structure includes both hierarchical and relational structure
- Four LDAP object types are currently defined: Person, Organizational Unit, Group, and Domain
- Example LDAP
- directory structure
35Directory Service Protocols
- The Lightweight Directory Access Protocol (LDAP)
- LDAP access protocol
- Conforms to a client-server architecture (clients performing operations against servers)
- Does not define how data is stored in the server, just how to access it in a standard manner
- Uses TCP; a client can issue multiple queries over a TCP connection
- Basic LDAP Operations
- Binding an LDAP client to a server
- Required operation before a client can search information on an LDAP server
- Authentication and access controls are parts of this operation
- Unbinding and rebinding can occur over a single TCP connection, allowing a client to change access privileges
36Directory Service Protocols
- The Lightweight Directory Access Protocol (LDAP)
- Basic LDAP Operations
- Server search
- Single or multiple keyword
- Wildcards are allowed
- Compare entries
- Add entry or entries
- Modify existing entry or entries
- Delete existing entry or entries
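A keyword search with wildcards, as listed above, can be approximated over an in-memory set of entries. This sketch does not speak the LDAP protocol and does not use LDAP filter syntax; it borrows Python's `fnmatch` for `*` matching (which also treats `?` and `[...]` specially), and the entries and function name are hypothetical.

```python
from fnmatch import fnmatchcase


def search(entries, attribute, pattern):
    """Return the names of entries having an attribute value that matches the pattern."""
    wanted = pattern.lower()
    matches = []
    for name, attributes in entries.items():
        values = attributes.get(attribute, [])
        if any(fnmatchcase(value.lower(), wanted) for value in values):
            matches.append(name)
    return sorted(matches)


entries = {
    "cn=Jane Doe,ou=Engineering,o=Example": {"cn": ["Jane Doe"], "title": ["Engineer"]},
    "cn=John Smith,ou=Sales,o=Example": {"cn": ["John Smith"], "title": ["Manager"]},
}

starts_with_j = search(entries, "cn", "j*")
ends_with_doe = search(entries, "cn", "*doe")
```

A multiple-keyword search could be built by intersecting the results of several single-attribute searches.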
37Directory Service Protocols
- The Lightweight Directory Access Protocol (LDAP)
- Other LDAP Operations
- Referral: allows one server to pass searches to another server
- Replication: allows multiple servers to handle requests for a DIT; enhances availability via load balancing and redundancy
- Encryption and security: supports Kerberos, SASL, and Secure Sockets Layer (SSL)
- DIT discovery: allows a client to query a server for its structure and entity relationships
38Introduction to the World Wide Web
39Introduction to the World Wide Web
- Goals
- Developed to provide an easy way to distribute info to members of a geographically dispersed group
- Especially well-suited for multimedia information
- Meant to allow software-, hardware-, and system-independent information sharing!
- Built on the concept of hypertext and hypermedia: cross-linked groups of documents or objects
- Plenty of good Web-related information available at http://www.w3c.org
40Introduction to the World Wide Web
- History
- The WWW concept was developed in 1989 by Tim Berners-Lee at CERN
- He wanted a way to easily distribute high energy physics data to researchers located around the world
- The concept was developed further, culminating in the release of the first widely used graphical WWW browser in 1993 (Mosaic)
- With the incorporation of Netscape, commercial development of the WWW gained momentum
- The WWW had a text-based, menu-driven contemporary called Gopher, released by the University of Minnesota in 1991
41Introduction to the World Wide Web
- General Operation Concepts
- WWW has a client-server architecture, with browsers (clients) pulling data from Web servers
- Browsers request information (web pages) using a unique 'address' for each page called a Uniform Resource Locator (URL)
- Servers deliver either an error indication or the requested web page information to the browser
- The WWW uses TCP for transport (i.e. one-to-one communications)
- WWW is mostly a 'pull' technology: only client-requested info is sent
- Web pages are linked together in a hierarchical 'mesh' by embedding URLs in web pages (hyperlinks)
- Caching is an integral part of the WWW; it can take place in many locations:
- Server-side, in web farms (load balancers and caching appliances)
- In the network, at firewalls and proxy servers
- Client-side, in web browsers
42World Wide Web Addressing
- There are three basic parts to the WWW: Addressing, Transport, Presentation
- Addressing (URLs)
- One of the most important aspects of the WWW is its global addressing scheme, called the Uniform Resource Locator (URL)
- URLs provide a unique address allowing direct access to web pages
- URLs consist of three parts: protocol, host name, and document name
- The URL Protocol part
- Web browsers can handle multiple protocols to provide backwards compatibility and flexibility
- Depending on the browser, these additional protocols can be handled internally in the browser or through the use of 'helper' applications
- Additional protocols handled: telnet, ftp, gopher, news (NNTP), mail (SMTP), and local file access
43World Wide Web Addressing
- The URL Host Name part
- Second part is the DNS name or IP address of the server where the web page resides
- Use of IP addresses is valid and works but is discouraged
- If no TCP port number is specified with the name or address, port 80 is implied
- If you wish to run web services on another port, the port is specified after the DNS name (example: http://www.noname.org:7777/admin/system.html)
- The URL document name
- Final part of the URL specifies the location of the web page on the server
- Usually consists of a directory path followed by the web page filename
- Path is typically relative to a default directory on the web server (operating system dependent)
- Path structure and filenames are operating system specific
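The three URL parts and the implied port can be checked with Python's standard `urllib.parse` module. A minimal sketch follows; the helper name `url_parts` is hypothetical, and the default of 80 is applied here on the assumption that the protocol part is `http`.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return (protocol, host name, port, document name) for an http URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else 80  # port 80 is implied
    return parts.scheme, parts.hostname, port, parts.path


explicit = url_parts("http://www.noname.org:7777/admin/system.html")
implied = url_parts("http://www.noname.org/admin/system.html")
```

The first call uses the slides' example with its port written out; the second shows the same page addressed without one.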
44Transport The Hypertext Transfer Protocol (HTTP)
- Introduction
- The Hypertext Transfer Protocol is the application layer protocol responsible for delivery of WWW data between browsers and servers
- Current HTTP release is version 1.1 (RFC 2616), though a significant number of systems run the first standard 'production' release (v1.0)
- HTTP is a very simple text-based request-response protocol like SMTP; MIME-like header fields and encoding conventions are used
- RFC 822/2822 format headers and MIME extensions are used to define the web page content, negotiate page parameters, and support conditional requests
- HTTP uses TCP for reliable transport layer services
- HTTP v1.0 works in a stop-and-wait fashion: only one HTTP command can be outstanding, and a separate TCP connection is required for each object
- For each object (file) on a page a separate TCP connection is established
- HTTP v1.1 allows persistent connections and pipelining
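The text-based request-response format can be shown without opening a socket. The sketch below builds a GET request and parses the status line and headers of a response; the function names and the sample response are hypothetical, and only the basic line structure is handled (no folded headers or repeated header names).

```python
def build_get_request(host, path):
    """Build an HTTP/1.1 GET request: request line, headers, blank line."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw):
    """Split a response into version, status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


request = build_get_request("www.noname.org", "/index.html")
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
version, code, reason, headers, body = parse_response_head(sample_response)
```

The header lines follow the same `Name: value` layout as mail headers, which is why the parsing resembles reading an e-mail message.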
45Transport The Hypertext Transfer Protocol (HTTP)
- Important HTTP Characteristics
- Application Level
- Request/Response
- Stateless
- Bi-directional Transfer
- Capability Negotiation
- Support for Caching
- Support for Intermediaries/Proxies
46Transport The Hypertext Transfer Protocol (HTTP)
- Important HTTP commands
- GET: transfers a file from server to browser
- HEAD: transfers only the HTTP response headers for a web page (no body) from server to browser
- PUT: uploads a file from browser to server (used a lot with web-based mailers)
- POST: sends data to the resource named by a URL for processing (used with forms)
- LINK: create a link between files on a web server
- UNLINK: delete a link between files on a web server
- DELETE: delete a file on the server
- ALL of these commands are subject to access restrictions enforced by both the web server software and the server operating system!
47Transport The Hypertext Transfer Protocol (HTTP)
48Transport The Hypertext Transfer Protocol (HTTP)
- HTTP v1.1 Technical Details
- Persistent TCP Connections
- Allow multiple HTTP commands (and responses) to be sent over a single TCP connection; can greatly reduce response times and improve interactivity
- Reduces the load of opening and closing TCP connections
- Falls back to v1.0 if necessary
- Application-layer compression
- Content can be compressed using content codings (such as gzip) negotiated between browser and server; HTTP headers are still sent with every request
- HTTP Command Pipelining
- Multiple HTTP commands can be issued without waiting for a response
- The server must respond to each request in the order it was received
- Expanded and improved directives for more intelligent Web caching
- Enhanced support for Web content lifetimes and expiration dates
- Validation algorithms: commands to allow a cache to check on web page expiration
- Better support for transfer of dynamically generated web pages
- Chunked transfers: files sent as a sequential stream of chunks, each with a defined size
- HTTP 1.1 has an explicit header field (Transfer-Encoding) for declaring chunk mode transfers; the size of each chunk is sent at the start of that chunk
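Chunked transfer can be sketched as an encoder and decoder pair. This is a simplified illustration of the HTTP/1.1 chunked format (hexadecimal size line, data, CRLF, and a final zero-length chunk); the function names are hypothetical and trailers are not handled.

```python
def encode_chunked(data, chunk_size):
    """Encode a body as a sequence of size-prefixed chunks ending with a zero chunk."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # zero-length chunk marks the end of the body
    return bytes(out)


def decode_chunked(stream):
    """Reassemble the body by reading each size line and then that many bytes."""
    body = bytearray()
    position = 0
    while True:
        line_end = stream.index(b"\r\n", position)
        size = int(stream[position:line_end].split(b";")[0], 16)
        if size == 0:
            break
        start = line_end + 2
        body += stream[start:start + size]
        position = start + size + 2  # skip the CRLF that follows the chunk data
    return bytes(body)


encoded = encode_chunked(b"hello world", 4)
decoded = decode_chunked(encoded)
```

Because each chunk announces its own size, a server can start sending a dynamically generated page before it knows the total length.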
49Presentation The Hypertext Markup Language
(HTML)
- Introduction
- HTML is concerned with the structure and formatting of web pages
- HTML was developed from an ISO standard called SGML, a metalanguage developed in 1974 (you may ask why SGML isn't used for the WWW instead of writing a new language; it's because SGML is very general and terribly complex)
- Current version is 4.01 (December 1999)
- HTML is designed to be backwards compatible with previous versions
- It was the original objective of HTML to allow the browser to control formatting and presentation of the page
- This was instituted to allow better data sharing in a heterogeneous environment
- Content providers want more control over the presentation of their information, so the original objective of browser control over presentation is being gradually eroded in each new release of HTML
50Presentation The Hypertext Markup Language
(HTML)
- HTML page structure
- An HTML page consists of ASCII text and is divided into two main parts: a header and a body
- The header contains information about the document
- The body contains the actual information to display
- HTML depends on embedded commands in the web page called TAGS
- Tags define both a command action and a scope
- Tags are enclosed in start and end symbols (< and >) to distinguish them from normal text that is for client display
- Tags, depending on their specific actions, can have optional attributes (size, align, action, etc.)
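How a client separates tags (and their attributes) from display text can be seen with Python's standard `html.parser` module. A small sketch, with a hypothetical `TagCollector` class and sample markup; note that this parser reports tag and attribute names in lower case.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record start tags with attributes, end tags, and display text in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


def collect(markup):
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.events


events = collect('<A HREF="http://www.w3c.org">W3C</A>')
```

The start tag and its matching end tag bound the scope of the command; everything between them is the text the tag applies to.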
51Presentation The Hypertext Markup Language
(HTML)
- Basic tags
- Page tags: <HTML> and </HTML> tags delimit the complete page
- Header and Body tags: <HEAD>, </HEAD>, <BODY>, and </BODY>
- Hyperlinks: <A HREF=URL> and </A>
- Images: <IMG SRC=URL> (no end tag)
- JPEG format (.jpg)
- GIF format (.gif)
- Formatting
- Physical formatting
- Specifies an exact display parameter for rendering web text
- Examples are Bold (<B> and </B>) and Font Size
- Logical formatting
- Specifies special treatment of the enclosed text, but the exact physical rendering is left up to the browser or defined in a style sheet
52Presentation The Hypertext Markup Language
(HTML)
- Style Sheets
- Provide a way to uniformly define how logical styles are assigned physical display attributes (without changing the configuration of your browser)
- Allow the web page author to apply typographic styles and spacing instructions for elements on a web page
- The advantages of style sheets are that they remove presentation information from the HTML document (which should be concerned only with structure) and that one style sheet can be reused by linking it to multiple HTML documents
- The disadvantage of style sheets is that they take presentation control away from the browser; if the content creator is not careful, their web pages may not display properly
- Style sheets can be embedded into a web page or linked externally through a link or import statement in the header
- Generic Style Sheet entry syntax (example later): selector { property: value }
53Presentation The Hypertext Markup Language
(HTML)
- More Tags
- Menus, lists, and tables
- Menus
- Creates a compact list of choices with no numbering or bullets
- Relevant tags: <MENU>, </MENU>, and <LI>
- Lists
- Ordered Lists
- Ordered lists sequentially list items
- Relevant tags: <OL>, </OL>, and <LI>
- Unordered Lists
- Items are listed with bullets
- Relevant tags: <UL>, </UL>, and <LI>
54Presentation The Hypertext Markup Language
(HTML)
- More Tags
- Tables
- Creates a 2-D collection of data organized in rows and columns
- The <TABLE> and </TABLE> tags bound the table data
- The rows are specified by <TR> tags
- Captions can be added using the <CAPTION> tag
- Cell data is delimited by the <TD> or <TH> tags
- Headings
- HTML Headings provide a set of logical styles for use in documents
- These are logical styles: actual display properties are set by the browser or a style sheet!
- Relevant tags: <H1> through <H6> with end tags </H1> through </H6>
- Paragraphs, Line Breaks, Horizontal Rules
- These are the most commonly used tags that do not come in pairs
- Relevant tags: Paragraph <P>, Line Break <BR>, Horizontal Rule <HR>
55Presentation The Hypertext Markup Language
(HTML)
- Example HTML Page
- <!doctype html public "-//w3c//dtd html 4.0 transitional//en">
- <html><head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <meta name="Author" content="John Romano">
- <meta name="GENERATOR" content="Mozilla/4.5 [en] (Win95; U) [Netscape]">
- <title>Sample Web Page</title>
- </head><body>
- <h4>This is a sample Web Page for my Internet Class</h4>
- <h5>I'll start off with a list of what toys I want for Christmas</h5>
- <ol>
- <li>A trip to Tahiti</li>
- <li>A fast expensive convertible</li>
- </ol>
- <hr WIDTH="100%"><img SRC="bix2b.JPG" height=86 width=144>
- <h5>Here's a table for comparison</h5>&nbsp;
- <center><table BORDER COLS=3 WIDTH="100%" ><tr>
56Presentation The Hypertext Markup Language
(HTML)
- Advanced HTML
- Forms
- Basic HTML is one-way information transfer between server and browser; interactivity is very important and was added in version 2.0
- Forms allow data to be pushed back to the server for processing (via some other specification such as CGI)
- Forms are delimited using the <FORM> and </FORM> tags
- The Method and Action attributes of the FORM tag are used to specify how and where to upload the form data
- Example: <FORM ACTION="http://www.xyz.com/cgi-bin/order" METHOD=POST>
- Individual fields in the form are specified using the <INPUT> tag
- The <INPUT> tag has many attributes to control labeling and data input:
- The Name attribute
- The Size attribute
- The Type attribute
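What a browser sends when such a form is submitted with METHOD=POST can be sketched with `urllib.parse`. This assumes the default URL-encoded form encoding; the field names (`item`, `qty`) and function names are hypothetical, and the host and path are taken from the slide's example.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(fields):
    """Encode Name/value pairs from INPUT fields as a URL-encoded body."""
    return urlencode(fields).encode("ascii")


def build_post_request(host, path, fields):
    """Build the POST request a browser might send for the form."""
    body = encode_form(fields)
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


form_body = encode_form({"item": "red wagon", "qty": "2"})
post_request = build_post_request("www.xyz.com", "/cgi-bin/order", {"item": "red wagon", "qty": "2"})
decoded_fields = parse_qs(form_body.decode("ascii"))
```

The program named by the Action attribute would decode the body, much as `parse_qs` does here, before processing the order.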
57Presentation The Hypertext Markup Language
(HTML)
- Cascading Style Sheets
- A new feature that gives style sheets object-oriented functionality
- Style sheets can be linked together in a tree structure
- Style sheets can inherit properties from parent to child
- Cascading style sheets sometimes behave differently in different browsers!
58A Look Ahead Addressing
- Addressing (URIs)
- A Uniform Resource Identifier is a formatted string which identifies, via name, location, or any other unique property, a resource
- URLs are a subset of URIs that specify a location; this restriction can be a problem
- URIs will in some sense allow anycasting of information across the WWW
- Will allow requests for information without worrying about where it comes from
- Will help eliminate problems with network and server outages
- URI composition
- A URI is composed of four major parts: scheme, authority, path, and query
- The URI syntax is <scheme>://<authority><path>?<query>
59A Look Ahead Addressing
- URI composition (Continued)
- Scheme
- Only the scheme part of the URI is mandatory; other parts are optional
- Could identify a protocol or some other defined or standardized namespace
- The remainder of the URI syntax depends on the scheme!
- Authority
- May be defined as an Internet-based server or by a scheme-specific naming authority
- Internet-based servers usually have the form <userinfo>@<host>:<port>
- Userinfo is optional but must be composed of legal URI characters
- Host would typically be a valid DNS name or IPv4 address
- Port would be a valid TCP port number
- A scheme-specific naming authority could be anything valid in the syntax of the scheme as long as it is composed of legal URI characters
60A Look Ahead Addressing
- URI composition (Continued)
- Path
- The path component contains data, specific to the scheme and authority, that identifies the resource within the scope of the scheme and authority
- Typical paths in an Internet server-based URI (like a URL) navigate the directory structure of the server
- Query
- The query component is a string of information to be interpreted by the resource
- Typically used to provide keywords or parameters for formatting output
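To make the four-part composition concrete, here is a minimal sketch that splits a URI into scheme, authority (userinfo, host, port), path, and query using Python's standard urllib.parse module. Both sample URIs and the describe_uri helper are invented for illustration; the second sample shows a URI whose only generic part is the scheme, with the rest left to scheme-specific rules.

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Split a URI into the parts named on the slides."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        # urlsplit exposes the user name portion of userinfo as .username
        "userinfo": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
    }


# Hypothetical server-based URI: <scheme>://<userinfo>@<host>:<port><path>?<query>
server_based = describe_uri(
    "http://guest@www.example.com:8080/docs/index.html?keyword=xml"
)

# Hypothetical URI with no authority: what follows the scheme is scheme-specific.
no_authority = describe_uri("mailto:someone@example.com")

for label, parsed in (("server-based", server_based), ("no authority", no_authority)):
    print(label)
    for name, value in parsed.items():
        print("  %-8s %r" % (name, value))
```

For the mailto example the parser reports no host or port at all; everything after the scheme lands in the path, which matches the point that the remainder of the syntax depends on the scheme.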
61A Look Ahead - Presentation
- The Extensible Markup Language (XML)
- A new way to structure and format data on the Internet
- Goal: make information transmitted across the Internet self-describing
- Consists of a ruleset that describes how to define your own markup language
- Like HTML and other markup languages, tags are used; however, they are there more to describe what the information is than how it should look
- XML History
- Development of XML began in 1996 and culminated in early 1998 with the adoption of the version 1.0 standard by the World Wide Web Consortium (W3C)
- Like HTML, XML is based on the Standard Generalized Markup Language (SGML), which grew out of 1970s markup work and was standardized by the ISO in 1986 (ISO 8879)
- SGML is far too general and complex (the standard runs over 500 pages) for easy and widespread deployment; XML is a stripped-down version suitable for use on a wide range of networked systems
- HTML has been rewritten in XML as XHTML (versions 1.0 and 1.1)
62A Look Ahead - XML
- HTML shortcomings
- Even with the vast bandwidth in the Internet, the WWW is often plagued by lack of interactivity and poor response times
- Explosive growth of the WWW has made it difficult to find what you're looking for (metatags and web crawlers don't cut it anymore)
- HTML is inflexible: it takes months to years for new standards with additional tags and functionality to be published
- XML solutions to HTML problems
- Speed: with tags specifying what the delivered content means and what should be done to it, a significant amount of processing can be done on the client
- Saves network bandwidth
- Offloads overloaded servers
- Searching: XML makes searches more intelligent by allowing the structure and meaning of data to be included as keywords or search parameters
- Flexibility: XML allows a content creator to define tags in a standard way, so new tags can be created on the fly without rewriting standards
63A Look Ahead - XML
- XML Implementation
- The XML standard is, by comparison to SGML or HTML, a very concise document
- XML uses Unicode: a standard set of characters that encompasses all of the world's major languages
- Like HTML, XML is built on tags
- Tags must be in start/end pairs that enclose the text to which they apply (an element with no content may be written as a single empty-element tag such as <tag/>)
- Tags cannot overlap, but they can be nested
- Nested tags create a tree structure for the XML document
- Information typically required in a new XML document
- What tags are allowed
- How they can and cannot be nested
- How tags should be processed
- The first two of the preceding three items are usually covered in the Document Type Definition (DTD)
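The nesting rule can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree parser to show that properly nested tags produce a tree while overlapping tags are rejected. The sample strings, the "city" value, and the helper names are invented for illustration. Note that this parser only checks well-formedness; it does not validate a document against a DTD.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return the element tree as a list of indented tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


def is_well_formed(text):
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: <street> and <city> open and close inside <address>.
nested = "<address><street>139 N Charles Street</street><city>Sample City</city></address>"

# Overlapping: <street> is still open when </address> appears.
overlapping = "<address><street>139 N Charles Street</address></street>"

print("\n".join(outline(ET.fromstring(nested))))
print("nested is well-formed:", is_well_formed(nested))
print("overlapping is well-formed:", is_well_formed(overlapping))
```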
64A Look Ahead - XML
- A Real World XML Document (A RealServer license key)
- <LicenseKeyDefinition>
- <!-- Warning: Do not Edit this file! Editing will invalidate your Real Server License Key. -->
- <List Name="License">
- <List Name="Definition">
- <Var Name="Evaluation" Value="True"/>
- <Var Name="Manufacturer" Value="RealNetworks"/>
- <Var Name="LicenseID" Value="01-0102-0031-57074"/>
- <Var Name="ProductID" Value="0"/>
- <Var Name="MajorVersion" Value="6"/>
- <Var Name="MinorVersion" Value="0"/>
- <Var Name="StartDate" Value="01/01/1997"/>
- <Var Name="EndDate" Value="01/01/2030"/>
- </List>
- </List>
- <List Name="General">
- <Var Name="ClientConnections" Value="10"/>
- <Var Name="Live" Value="True"/>
- </List>
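Because the tags describe what each value is, a program can pull settings out of a document like this without any knowledge of how it was laid out on screen. The sketch below is one way this could look in Python, using a shortened copy of the license key with the closing </LicenseKeyDefinition> tag supplied so that it parses. The read_vars helper and the "List/List/Var" key naming are assumptions made for this example, not part of any RealServer tooling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's document; the final closing tag is added
# here so the text is well-formed.
LICENSE_XML = """
<LicenseKeyDefinition>
  <List Name="License">
    <List Name="Definition">
      <Var Name="Evaluation" Value="True"/>
      <Var Name="Manufacturer" Value="RealNetworks"/>
      <Var Name="MajorVersion" Value="6"/>
    </List>
  </List>
  <List Name="General">
    <Var Name="ClientConnections" Value="10"/>
    <Var Name="Live" Value="True"/>
  </List>
</LicenseKeyDefinition>
""".strip()


def read_vars(element, prefix=""):
    """Flatten nested List/Var elements into a {"List/.../Var": value} dict."""
    result = {}
    for child in element:
        if child.tag == "Var":
            result[prefix + child.get("Name")] = child.get("Value")
        elif child.tag == "List":
            result.update(read_vars(child, prefix + child.get("Name") + "/"))
    return result


settings = read_vars(ET.fromstring(LICENSE_XML))
for key in sorted(settings):
    print(key, "=", settings[key])
```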
65A Look Ahead - XSL
- Other important parts of XML
- The Extensible Stylesheet Language (XSL)
- Defines a set of rules for presentation of XML data, allowing the development of a "write once, publish everywhere" information infrastructure
- XSL stylesheets can be defined for various display media: when an XML document is downloaded, the appropriate XSL stylesheet is used to reformat the data for display on the local system
- XSL stylesheets can be defined for presentation of data in non-visual formats: using XSL, an XML web page could be listened to or converted to braille
- XSL/XML document use
[Diagram: XML Document (Structure) and XSL StyleSheet (Presentation) feed the user's display at the browser; a second pair, XML Document 2 (Structure) and XSL StyleSheet 2 (Presentation), is also shown]
66A Look Ahead XML XSL
- XML & XSL Document Example
- <?xml version="1.0" ?>
- <?xml-stylesheet type="text/xsl" href="editors.xsl" ?>
- <!DOCTYPE editor_contacts [
- <!-- This is a partial DTD - I haven't defined everything necessary for this example! -->
- <!ELEMENT street (#PCDATA)>
- <!ELEMENT city (#PCDATA)>
- <!ELEMENT state (#PCDATA)>
- <!ELEMENT zip (#PCDATA)>
- <!ELEMENT address (street, city, state, zip)>
- ]>
- <editor_contacts>
- <editor>
- <first_name>Jonathan</first_name>
- <last_name>Smith</last_name>
- <title>Senior Engineer</title>
- <organization>Johns Hopkins</organization>
- <address>
- <street>139 N Charles Street</street>
67A Look Ahead XML XSL
- XML & XSL Document Example
- <?xml version="1.0" ?>
- <!-- This stylesheet takes a simple XML doc and displays it in HTML -->
- <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
- <!-- this defines the xsl namespace this stylesheet belongs to -->
- <xsl:template match="/">
- <!-- Apply template to everything in the XML document -->
- <HTML>
- <BODY>
- <H1>Editor Contacts</H1>
- <xsl:for-each select="editor_contacts/editor">
- <H2>Name: <xsl:value-of select="first_name"/> <xsl:value-of select="last_name"/> </H2>
- <P> Title: <xsl:value-of select="title"/></P>
- <P> Organization: <xsl:value-of select="organization"/></P>
- <P> Street: <xsl:value-of select="address/street"/></P>
- <P> City: <xsl:value-of select="address/city"/></P>
- <P> State: <xsl:value-of select="address/state"/></P>
- <P> Zip: <xsl:value-of select="address/zip"/></P>
- <P> E-mail: <xsl:value-of select="e_mail"/></P>
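Python's standard library has no XSLT processor, so the sketch below is not XSL. It is a hand-written analogue of what the stylesheet's for-each and value-of instructions do, written with xml.etree.ElementTree. It uses only the fields that appear in the editor example, with closing tags supplied so the input parses. The value_of and render helpers are names chosen for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# The editor example, limited to the fields shown on the slide and closed
# off so that it is well-formed.
EDITORS_XML = """<editor_contacts>
  <editor>
    <first_name>Jonathan</first_name>
    <last_name>Smith</last_name>
    <title>Senior Engineer</title>
    <organization>Johns Hopkins</organization>
    <address>
      <street>139 N Charles Street</street>
    </address>
  </editor>
</editor_contacts>"""


def value_of(context, path):
    """Rough analogue of xsl:value-of: text of the first match, or ''."""
    return context.findtext(path, default="")


def render(xml_text):
    """Walk the XML structure and emit an HTML presentation of it."""
    root = ET.fromstring(xml_text)  # root element is <editor_contacts>
    out = ["<HTML>", "<BODY>", "<H1>Editor Contacts</H1>"]
    # Analogue of xsl:for-each select="editor_contacts/editor"
    for editor in root.findall("editor"):
        name = value_of(editor, "first_name") + " " + value_of(editor, "last_name")
        out.append("<H2>Name: " + escape(name) + "</H2>")
        for label, path in (
            ("Title", "title"),
            ("Organization", "organization"),
            ("Street", "address/street"),
        ):
            out.append("<P>" + label + ": " + escape(value_of(editor, path)) + "</P>")
    out += ["</BODY>", "</HTML>"]
    return "\n".join(out)


page = render(EDITORS_XML)
print(page)
```

The XML input carries only structure; all decisions about headings and paragraphs live in render, which is the separation the structure/presentation split is meant to achieve.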
68A Look Ahead - XLink
- Improved Hyperlinks (XLink)
- Another W3C standard to provide more intelligent hyperlinks
- Version 1.0 approved June 2001
- Unlike current hyperlinks, which can only link to a single physical location, XLinks will be able to do much more
- XLinks could allow a choice of multiple actions or destinations
- Other XLinks will allow the information to be embedded directly in the page (maybe a floating dialog box) instead of forcing the viewer to leave that page
- XLinks could be indirect links, allowing changes to links to be made at a single database record instead of wherever the link is referenced
69Reading Homework
- Reading
- Chapter 25: 25.3, 25.13, and 25.14
- Chapter 27: HTTP
- There's a lot more to the WWW & XML than presented here; see http://www.w3c.org for more!
- Next week: advanced web applications