Title: Web Technology
1 Chapter 3
2 Web Publishing
- Static documents
- HTML, ASCII text, Postscript, PDF
- GIF, JPEG, MOV, Quicktime, AVI
- AU, WAV, MP3, RealAudio
- Dynamic documents
- executable content
- Java, JavaScript, ActiveX, Dynamic HTML
- Variable documents
- Dynamically-generated (on-the-fly)
- CGI, FastCGI, JSP, PHP
3 HyperText Markup Language
- Document structure description
- (sub-)sections
- headings
- tables
- No (?) layout information
- style sheets
- font mapping
- Defined in SGML / XML (XHTML)
- Document Type Definition (DTD)
4 HTML Basic Elements
- <HTML>
- <HEAD>
- <TITLE>Hello World</TITLE>
- </HEAD>
- <BODY>
- Oh, what a beautiful morning....
- </BODY>
- </HTML>
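The nesting of the basic elements can be made visible with a small parser. The following is a minimal sketch using Python's standard html.parser module; the class name OutlineParser and the path notation are assumptions made for illustration only.

```python
from html.parser import HTMLParser

# The minimal page from the slide, with the TITLE element closed properly.
PAGE = """<HTML>
<HEAD>
<TITLE>Hello World</TITLE>
</HEAD>
<BODY>
Oh, what a beautiful morning....
</BODY>
</HTML>"""


class OutlineParser(HTMLParser):
    """Records the nesting of start tags and the text of the TITLE element."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.paths = []   # one "a/b/c" path per start tag seen
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        self.stack.append(tag)
        self.paths.append("/".join(self.stack))

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] == "title":
            self.title += data


parser = OutlineParser()
parser.feed(PAGE)
parser.close()
print(parser.paths)
print(parser.title)
```

The printed paths show that TITLE sits inside HEAD, and that HEAD and BODY are siblings inside HTML.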
5 HTML Elements
- physical text styles
- logical text styles
- text segmentation
- tables
- inline pictures
- anchors/links
- forms
- specials
- image maps (client/server)
- background pic/audio, marquees
6 HTML Development
- Browser war (NS Navigator vs. MSIE)
- World Wide Web Consortium (W3C)
- rendering
- internationalization
- forms
- active content
- object models
7 HTML History
- HTML 2.0
- 1st version conforming to SGML
- Core HTML (lists, forms, headings, fonts, image maps, ...)
- HTML 3.2 (no HTML 3.0 / HTML 3.1)
- many elements get additional attributes
- font sizes, font faces, colors
- form-based file upload
- client-side image maps
- tables, APPLET tag
- STYLE and SCRIPT tags (placeholders, no exact mechanisms)
8 HTML 3.2
- Script Tags
- JavaScript, VBScript, JScript
- eventless
- APPLET tag
- Styles (CSS)
- ALIGNs generalized
- client-side image-maps
- FONT
- formulas
- FORM based file upload
9 HTML 4.0
- accessibility
- labels, legend, clusters
- iterators, access keys
- character encodings, language codes (lang)
- bidirectional text (dir)
- incremental rendering
- tables, images
- objects (nested)
- scripts and events
- frames
10 HTML 4.0 (Frames)
- TARGETs
- <A href="slide2.html" target="dynamic">slide 2.</A>
- IFRAMEs
- <IFRAME src="foo.html" width="400" height="500" scrolling="auto" frameborder="1">
- Your user agent does not support frames
- </IFRAME>
11 HTML 4.0 (Scripts)
- client-side execution at
- loading
- user input (focus, pointer movement)
- language selection
- <META http-equiv="Content-Script-Type" content="type">
- <SCRIPT type="text/vbscript" src="http://someplace.com/progs/vbcalc">
- </SCRIPT>
- evaluation order, object modification (DOM)
12 HTML 4.0 (Events)
- Loading
- onload, onunload
- Pointer
- onmousemove, onmouseover, ...
- Keyboard
- onkeypress, onkeydown, ...
- Form handling
- onsubmit, onreset, onselect, onchange
- <INPUT NAME="userName" onblur="validUserName(this.value)">
13 HTML 4.01
- Common attributes for many elements: Id, Title, Style, Class
- Improved tables (Thead, Tbody, Tfoot, column handling, formatting), incremental rendering of tables
- Improved forms
- STYLE element now encloses style sheet instructions (CSS)
- Style sheets: separation of content and styles (CSS)
14 HTML 4.01
- Frames, embedded documents (IFRAME)
- APPLET element deprecated (OBJECT)
- Intrinsic events (onclick, ondblclick, onmouseover, onmouseout, ...)
- OBJECT element (IMAGE, APPLET, IFRAME)
- SCRIPT element (client-side scripting: JavaScript, VBScript, ...)
15 OBJECT Element Examples
16 More OBJECT examples
- <OBJECT data="canyon.png" type="image/png">
- This is a <EM>closeup</EM> of the Grand Canyon.
- </OBJECT>
- <OBJECT classid="http://www.miamachina.it/clock.py">
- An animated clock.
- </OBJECT>
- <OBJECT data="embed_me.html">
- Warning: embed_me.html could not be included.
- </OBJECT>
17 Client-side Scripting 1
- Languages: JavaScript, VBScript, Python, Tcl, ...
- Security?
18 Client-side Scripting 2
19 Client-side Scripting 3
20 Cascading Style Sheets
- Separation of content (HTML, XML documents) and presentation style (CSS)
- simplified Web authoring
- easier Web site maintenance
- CSS vs. XSL
- CSS was defined earlier
- XSL is still a draft while CSS is already supported by browsers
- XSL is more powerful => too complex for many users/applications
21 Style and HTML
22 External CSS and Cascading
23 Without Style
24 Better do it stylish!
25 CSS by Example (1/3)
26 CSS by Example (2/3)
27 CSS by Example (3/3)
28 Extensible Markup Language
- XML (1998) is an application of SGML
- Standard Generalized Markup Language (1986), ISO 8879
- influenced by HTML (SGML Document Type Definition)
- Structure description language
- Meta-language: a language to describe other languages
- Tags enclose identifiable parts of a document
- markup (type-setting systems)
29 XML Example
- <warning>
- <para>This substance is <emph>hazardous</emph> to health</para>
- <para>See procedure 12A.7 for information on protective clothing.</para>
- <image .../>
- </warning>
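That tags enclose identifiable parts of a document can be checked by parsing the example. This is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name outline is hypothetical, and the elided image element is left out because its attributes are not given.

```python
import xml.etree.ElementTree as ET

# The warning from the slide, without the elided <image .../> element.
DOC = (
    "<warning>"
    "<para>This substance is <emph>hazardous</emph> to health</para>"
    "<para>See procedure 12A.7 for information on protective clothing.</para>"
    "</warning>"
)

root = ET.fromstring(DOC)


def outline(elem, depth=0):
    """Return one indented line per element, showing the logical structure."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


structure = outline(root)
# itertext() joins the text of an element with the text of its sub-elements.
first_para = "".join(root[0].itertext())

print("\n".join(structure))
print(first_para)
```

The outline lists warning, its two para units and the emph sub-unit, independent of how the text will later be displayed.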
30 XML Documents
- logical structure: division of documents into named units and sub-units (unit = element)
- physical structure: components of the document can be named and stored separately (component = entity)
Diagram labels: Document, Unit, Sub-unit
31 Document Type Definition
- DTD defines the elements allowed
- A parser compares the DTD rules against a given XML document => validation
- XML DTDs can be applied for data-type definitions (XML-RPC), data exchange (EDI, push, RDBMS), etc.
32 XML Document Presentation
- Style sheets specify output format
- 1 XML document, n alternative style sheets depending on audience, media, etc.
WARNING
This substance is hazardous to health
See procedure 12A.7 for information on protective clothing.
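The idea of one document with several alternative presentations can be sketched without a real style sheet language. The two render functions below are hypothetical stand-ins for two style sheets; the mapping of emph to EM and para to P is an assumption, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET
from html import escape

DOC = (
    "<warning>"
    "<para>This substance is <emph>hazardous</emph> to health</para>"
    "<para>See procedure 12A.7 for information on protective clothing.</para>"
    "</warning>"
)


def render_text(warning):
    """Plain-text presentation: a WARNING label, one line per paragraph."""
    paragraphs = ["".join(p.itertext()) for p in warning.findall("para")]
    return "\n".join(["WARNING"] + paragraphs)


def render_html(warning):
    """HTML presentation; assumes para contains only text and emph children."""
    parts = ["<H1>Warning</H1>"]
    for para in warning.findall("para"):
        html = escape(para.text or "")
        for child in para:
            html += "<EM>" + escape("".join(child.itertext())) + "</EM>"
            html += escape(child.tail or "")
        parts.append("<P>" + html + "</P>")
    return "\n".join(parts)


root = ET.fromstring(DOC)
text_output = render_text(root)
html_output = render_html(root)
print(text_output)
print(html_output)
```

Both outputs are produced from the same source document; only the presentation rules differ.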
33 XML by Example (1/2)
34 XML by Example (2/2)
35 Extensible Stylesheet Language
- XSL is a language for expressing stylesheets and consists of
- a language for transforming XML documents (XSLT)
- an XML vocabulary for specifying formatting semantics (Formatting Objects DTD - FO DTD)
- CSS < XSL < DSSSL (SGML's Document Style Semantics and Specification Language)
- Style sheets
- target specific elements => closely related with DTDs
36 XSL and XSLT Processing
Diagram labels: XSL processor, XSLT stylesheet, XSLT processor, document (source DTD), new document (FO DTD)
37 Document Reengineering
- analysis
- data sources, responsibilities and update dynamics
- data model
- EER (extended entity-relationship model)
- mapping logical -> physical
- reverse association: how to link back
- forward association: how to link forward
38 HTML Code Creation
- Editors
- Tags, Syntax
- Validators
- Halsoft, htmllint, ....
- Converters
- which HTML? DTD as parameter
- how to map the document structure into HTML?
- special symbols? mathematical formulas?
- what happens with hyperlinks in original document?
39 HTML Creation (cont'd)
- Development Environments
- versioning
- staging
- TODO database
- link consistency
- upload to server
- integration of functionality
- CGIs
- backend applications, databases
- client-side scripting
40 Webserver State Maintenance
- HTTP interactions are "isolated", i.e., HTTP does not include means to hand over state information between interactions -> difficult
- Advanced web applications, e.g. shopping basket, require that state can be shared between interactions (between web client and web server)
- External apps have their own state space
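41 WWW Gateways
Diagram: Web browser <-> Web server via HTTP; Web server <-> CGI gateway via CGI (HTTP); CGI gateway <-> Application via a gateway-specific protocol. Browser, server and gateway are on the WWW side; the application is non-WWW.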
42 Stateful Gateways
- A permanently running gateway process keeps up a connection with the external application and serves successive HTTP requests, i.e. the gateway maintains the session's state.
- Problem: state bookkeeping
- client caches
- back button
- interrupted requests (recover?)
- time-out for follow-up requests (bound resources?)
- Example: DBs (expensive login)
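The bookkeeping problem, including the time-out for follow-up requests, can be illustrated with a small in-memory session table. This is only a sketch of what a long-running gateway might keep; the class SessionStore, its methods and the 30-second timeout are assumptions for illustration.

```python
import time
import uuid


class SessionStore:
    """In-memory session table of a hypothetical long-running gateway."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.sessions = {}  # session id -> (last access time, state dict)

    def create(self):
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = (self.clock(), {})
        return session_id

    def get(self, session_id):
        """Return the state dict, or None if the session is unknown or timed out."""
        self.expire()
        entry = self.sessions.get(session_id)
        if entry is None:
            return None
        _, state = entry
        self.sessions[session_id] = (self.clock(), state)  # refresh the timer
        return state

    def expire(self):
        """Drop sessions idle longer than the timeout so resources are released."""
        now = self.clock()
        stale = [sid for sid, (seen, _) in self.sessions.items()
                 if now - seen > self.timeout]
        for sid in stale:
            del self.sessions[sid]


# Usage with a fake clock so the time-out can be shown without waiting.
fake_now = [0.0]
store = SessionStore(timeout_seconds=30, clock=lambda: fake_now[0])
sid = store.create()
store.get(sid)["basket"] = ["item-1"]
fake_now[0] = 20.0
alive = store.get(sid)      # follow-up request within the time-out
fake_now[0] = 60.0
expired = store.get(sid)    # follow-up request arrives too late
print(alive, expired)
```

A real gateway would also have to release the external connection (for example the database login) when a session expires; that part is omitted here.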
43 Stateless Gateways
- Gateway or external application generates state information which is stored at the client and sent with every request.
- State can be stored in
- URLs
- hidden fields
- Cookies
- Solve state consistency problem?
44 State in URLs / Hidden Fields
- State information can become large
- User can change state information (reservation?)
- Sessions may have to be replayed until the state for the next step is reached
- Unreadable URLs are no solution
- Passwords?
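One common way to detect state that the user has changed is to sign it on the server. The slides do not prescribe this; the following is a sketch of that technique using an HMAC, with a hypothetical secret key and hypothetical parameter names. It makes tampering detectable but does not hide the state, so it is no answer to the password question.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode

SECRET = b"server-side-secret"  # hypothetical key, never sent to the client


def sign(payload):
    """Return a hex HMAC-SHA256 signature of the payload string."""
    return hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def encode_state(state):
    """Serialize state into a query string and append its signature."""
    payload = urlencode(sorted(state.items()))
    return payload + "&sig=" + sign(payload)


def decode_state(query):
    """Return the state dict, or None if the signature does not match."""
    payload, separator, signature = query.rpartition("&sig=")
    if not separator:
        return None
    expected = sign(payload).encode("utf-8")
    if not hmac.compare_digest(expected, signature.encode("utf-8")):
        return None
    return {key: values[0] for key, values in parse_qs(payload).items()}


query = encode_state({"item": "book", "qty": "2"})
tampered = query.replace("qty=2", "qty=9")
print(decode_state(query))
print(decode_state(tampered))
```

The same string could be placed in a hidden form field instead of the URL; the size problem noted above remains either way.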
45 Cookies
- A cookie is a small data structure holding name/value pairs, which is sent back and forth between web client and web server for certain URLs
- Several incompatible "standards"
- original standard by Netscape (Set-Cookie)
- RFC 2109 (Set-Cookie)
- New Internet Draft (Set-Cookie2)
46 Cookie example
User shops around and gathers 2 items in his shopping basket (server -> client)
User decides to buy the 2 items and selects http://www.supershop.com/cgi-bin/order/buy.pl (client -> server)
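The two directions of the exchange can be sketched with Python's standard http.cookies module. The cookie name basket, its value and the path are assumptions chosen to match the shop scenario; the original slide's headers are not reproduced here.

```python
from http.cookies import SimpleCookie

# Server -> client: a response header that stores the basket at the client.
response_cookie = SimpleCookie()
response_cookie["basket"] = "item1+item2"          # hypothetical name and value
response_cookie["basket"]["path"] = "/cgi-bin/order"
set_cookie_header = response_cookie.output()

# Client -> server: the browser returns the cookie with the request for buy.pl.
request_cookie = SimpleCookie()
request_cookie.load("basket=item1+item2")          # content of the Cookie header
basket_items = request_cookie["basket"].value.split("+")

print(set_cookie_header)
print(basket_items)
```

Because the path is /cgi-bin/order, a browser would send the cookie only for URLs below that path, which matches the "for certain URLs" restriction.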
47 More about Cookies
- Cookies can enhance or break privacy
- tracking vs. no user database
- Cookies are kept in memory
- Persistent cookies: cookies.txt file
Fields of a cookies.txt entry:
creator (domain)
path
expiration
name
value
access for all hosts in domain?
require secure connection?
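A line of a Netscape-style cookies.txt file holds these fields separated by tabs, in the order domain, all-hosts flag, path, secure flag, expiration, name, value. The parser below is a minimal sketch; the record type CookieRecord and the sample line are made up for illustration.

```python
from typing import NamedTuple, Optional


class CookieRecord(NamedTuple):
    domain: str
    all_hosts_in_domain: bool
    path: str
    secure_only: bool
    expires: int          # seconds since the Unix epoch
    name: str
    value: str


def parse_cookie_line(line: str) -> Optional[CookieRecord]:
    """Parse one tab-separated cookies.txt line; comments and blanks give None."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    domain, all_hosts, path, secure, expires, name, value = line.split("\t")
    return CookieRecord(domain, all_hosts == "TRUE", path, secure == "TRUE",
                        int(expires), name, value)


# Hypothetical entry for the shop example.
SAMPLE = ".supershop.com\tTRUE\t/cgi-bin/order\tFALSE\t946684799\tbasket\titem1+item2"
record = parse_cookie_line(SAMPLE)
print(record)
```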
48 Server-side Includes
- .shtml files are parsed for special commands
- executed before the file is sent to the client
49 SSI Syntax
Main elements
exec: execute a shell or CGI script
fsize: print the size of a given file/URL
flastmod: print last modification date
include: insert contents of another document
set: set the value of a variable
Variables (in addition to CGI environment variables): DATE_GMT, DOCUMENT_NAME, DOCUMENT_URI, LAST_MODIFIED
Flow control
<!--#if expr="test condition" -->
<!--#elif expr="test condition" -->
<!--#else -->
<!--#endif -->
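How a server expands such directives before sending the file can be sketched with a toy processor. This is not how any particular web server implements SSI; it handles only set, include (from an in-memory dictionary instead of the file system) and the echo element, which prints a variable and is not listed above. The function name expand_ssi and the file names are hypothetical.

```python
import re

# <!--#command attr="value" ... -->
DIRECTIVE = re.compile(r'<!--#(\w+)((?:\s+\w+="[^"]*")*)\s*-->')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def expand_ssi(text, files, variables=None):
    """Expand set, echo and include directives; other directives stay as-is."""
    variables = {} if variables is None else variables

    def replace(match):
        command = match.group(1)
        attrs = dict(ATTRIBUTE.findall(match.group(2)))
        if command == "set":
            variables[attrs["var"]] = attrs["value"]
            return ""
        if command == "echo":
            return variables.get(attrs["var"], "(none)")
        if command == "include":
            # Included documents are expanded too and share the variables.
            return expand_ssi(files[attrs["file"]], files, variables)
        return match.group(0)

    return DIRECTIVE.sub(replace, text)


FILES = {"footer.html": 'Maintained by <!--#echo var="owner" -->'}
PAGE = ('<!--#set var="owner" value="webmaster" -->'
        '<P>Hello</P>'
        '<!--#include file="footer.html" -->')
result = expand_ssi(PAGE, FILES)
print(result)
```

Every request for the page repeats this parsing step, which is the performance cost of server-side includes.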
50 SSI problems
- Performance
- parsing
- command execution
- Security
- exec command, etc.
- IncludesNoExec
- Unreadable -> maintenance?
51 SSI example
52 PHP
- HTML-embedded scripting language
- Syntax similar to C/Perl/Java
- Powerful features
- PHP scripts can replace CGIs
- Powerful support for many databases
- Support for HTTP, IMAP, SNMP, NNTP, POP3
- Access to raw sockets
- Security?
53 PHP DB access example
54 Simple PHP example
55 JavaServer Pages
- JSP: "Java-based ASP"
- Dynamic scripting
- JSP pages can contain
- HTML/XML
- Java scriptlets
- special tags
- JSP engine required
56 JSP example
57 Common Gateway Interface
- Standard for interfacing external applications with web servers (databases, etc.)
- CGI programs execute at the web server and produce output dynamically ("HTML document does not exist in advance")
- Language-independent: CGI programs can be written in any programming language (Perl, C, Java, Tcl, ...)
58 Simple CGI Example (1/3)
- GET
- CGI script is started by the web server
- gets the parameters as environment variable QUERY_STRING (URL encoded)
- QUERY_STRING="par1=hello&par2=world"
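Decoding the GET parameters can be sketched in a few lines of Python. The function name get_parameters is hypothetical; it takes the environment as an argument so it can be tried without a web server, whereas a real CGI script would pass os.environ.

```python
from urllib.parse import parse_qs


def get_parameters(environ):
    """Decode the URL-encoded QUERY_STRING into a name -> value dictionary."""
    query = environ.get("QUERY_STRING", "")
    # parse_qs returns a list per name; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(query).items()}


params = get_parameters({"QUERY_STRING": "par1=hello&par2=world"})
decoded = get_parameters({"QUERY_STRING": "msg=hello%20world%21"})
print(params)
print(decoded)
```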
59 Simple CGI Example (2/3)
- POST
- CGI script is started by the web server and gets the parameters via standard input (URL encoded)
- CGI script does not get EOF but must check the environment variable CONTENT_LENGTH to know how much to read from stdin
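The CONTENT_LENGTH rule can be sketched as follows. The function name read_post_parameters is hypothetical; an in-memory stream stands in for standard input so the example runs without a web server, and a real script would pass os.environ and sys.stdin.buffer.

```python
import io
from urllib.parse import parse_qs


def read_post_parameters(environ, stdin):
    """Read exactly CONTENT_LENGTH bytes from stdin and decode them."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length)   # do not wait for EOF, it may never arrive
    return parse_qs(body.decode("ascii"))


# The stream holds more bytes than the request body to show that only
# CONTENT_LENGTH bytes are consumed.
stream = io.BytesIO(b"par1=hello&par2=worldEXTRA")
post_params = read_post_parameters({"CONTENT_LENGTH": "21"}, stream)
print(post_params)
```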
60 Simple CGI Example (3/3)
- MyScript.pl: print all set environment variables
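The Perl script itself is not reproduced in this text. A comparable script can be sketched in Python as below; the function name environment_page and the HTML layout are assumptions, not a translation of MyScript.pl.

```python
import os
from html import escape


def environment_page(environ):
    """Build a complete CGI response: header line, blank line, HTML body."""
    rows = "\n".join(
        "<LI>%s = %s</LI>" % (escape(name), escape(value))
        for name, value in sorted(environ.items())
    )
    body = "<HTML><BODY><UL>\n%s\n</UL></BODY></HTML>\n" % rows
    # The blank line after the header separates it from the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When started by the web server, print every environment variable.
    print(environment_page(os.environ), end="")
```

Values are escaped because variables such as QUERY_STRING contain characters like & that have a special meaning in HTML.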
61 FastCGI
- Migrating CGI programs is very easy
- Performance
- persistent (multi-threaded) processes
- after finishing a request the processes continue to run and wait for new requests
- Support for distributed computing
- FastCGI programs can run remotely
- load distribution
62 CGI vs. APIs
- CGI
- simple
- language independent
- isolated processes
- open standard
- architecture independent
- single/multi-threaded
- single/multi-tier
63 Servlets
- Modules (pieces of code) which run on Java-enabled web servers
- Servlet engine: execution environment
- Java Servlet API Specification (SUN)
- Java package javax.servlet
Diagram (three stages): the servlet engine loads the servlet code; clients are served by the servlet code inside the servlet engine; the servlet engine unloads the servlet code.
64 Servlets vs. CGI
- Portable code
- Useful features (sessions, cookies, ...)
- Multi-threading
- Faster
- does not run as a separate process
- stays in memory between requests
- a single servlet instance answers requests concurrently
- Sandbox for servlets possible
65 Apache JServ Servlet Engine
- Java Servlet API 2.0 compliant
- Multi-threaded, separated from web server
- Integrated load balancing features
- Smart redirection of requests
- AJP protocol allows creation of complex distributed applications
- MD5-based connection authentication and IP filtering