Learning outcomes
  • Explain in general terms how web documents are
    transferred across the Internet
  • Explain what processes are triggered when you
    click on a hyperlink
  • Code web pages using HTML and XHTML with style
    sheets
  • Explain why it is advisable to use XHTML rather
    than HTML
  • Describe some technologies available for dynamic
    web pages

Essential Reading
  • Joe Casad, Teach yourself TCP/IP, Ch. 17
  • William Buchanan, Mastering The Internet, Ch. 6-8
  • Introductory materials on HTML and XHTML: either
    a textbook such as
  • John Shelly, HTML and CSS Explained, or
  • http/
  • http//

Additional reading
  • William Buchanan, Mastering The Internet, Ch.
  • Andrew Tanenbaum, Computer Networks, Ch. 7.3
  • Douglas Comer, Computer Networks and Networking,
    Ch. 32-33
  • Chuck Musciano and Bill Kennedy, HTML and XHTML:
    The Definitive Guide, for reference
  • http//
  • Mike Lewis, Understanding JavaScript,
    June-July 2000

How the web works
  • The client-server model
  • Client and server operate on machines which are
    able to communicate through a network
  • The server waits for requests from clients
  • The server receives a request from a client
  • Performs the requested work
  • Or looks up the requested data
  • And sends a response to the client
  • Servers: file servers, web servers, name servers
  • Clients: browsers, email clients

URL format
  • <scheme>://<server-domain-name>/<pathname>
  • <scheme>: which protocol to use
  • http: in general
  • file: tells the client that the document is on the
    local machine
  • ftp: file transfer protocol
  • <server-domain-name>: identifies the server system
  • i.e.
  • <pathname>: tells the server where to find the
    document
  • http//
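
As a rough illustration of the three URL parts listed above, the following Python sketch splits a URL with the standard library. The URL itself is hypothetical and chosen only for the example.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only to illustrate the three parts.
url = "http://www.example.com/courses/web/index.html"

parts = urlsplit(url)
print(parts.scheme)    # which protocol to use
print(parts.hostname)  # identifies the server system
print(parts.path)      # tells the server where to find the document

# A file URL has no server part: the document is on the local machine.
local = urlsplit("file:///home/user/page.html")
print(local.scheme, local.hostname, local.path)
```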

Web browsers and servers
  • A browser is a program that can retrieve files
    from the world wide web and render text, images,
    or sounds encoded in the files.
  • e.g. Internet Explorer, Netscape, Mozilla
  • A web server is an application which waits for
    client requests, fetches requested documents from
    disk and transmits them to the client.
  • e.g. Apache

What happens when you click on a hyperlink?
  • Determine the URL and extract the domain name.
  • Use the name server to get the IP address (DNS).
  • Make a TCP connection to port 80.
  • Send a request for a web page once the server
    has accepted the connection.
  • The server sends the file and releases the TCP
    connection.
  • The client displays the document.
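
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not how a real browser is implemented: the function names plan_request and fetch and the example URL are assumptions, query strings are ignored, and fetch needs network access, so it is defined but not called.

```python
import socket
from urllib.parse import urlsplit


def plan_request(url):
    """Extract the domain name from the URL and build the HTTP request."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80      # web servers listen on port 80 by default
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return host, port, request.encode("ascii")


def fetch(url):
    """Carry out all the steps; requires a network, so it is not run here."""
    host, port, request = plan_request(url)
    ip_address = socket.gethostbyname(host)                     # DNS lookup
    with socket.create_connection((ip_address, port)) as conn:  # TCP connection
        conn.sendall(request)                                   # send the request
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:             # the server has released the connection
                break
            chunks.append(data)
    return b"".join(chunks)          # response headers followed by the document


host, port, request = plan_request("http://www.example.com/index.html")
print(host, port)
print(request.decode("ascii"))
```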

Other possibilities
  • The steps in the previous slide are for
    displaying a static web page from a remote server
  • Other possibilities are:
  • The page is loaded from the local system
  • No TCP connection
  • The URL begins with file://...
  • The page is dynamically generated by a
    client-side script
  • No TCP connection
  • The page is dynamically generated by a
    server-side script
  • The server may carry out other functions
  • Secure server
  • Check the user's identity and whether they are
    authorised to access a particular resource

Stateless connection
  • Both client and server release the TCP connection
    after a page has been transferred.
  • HTTP 1.0 is stateless
  • Connections are not persistent
  • There is no indication to the server whether new
    transactions involve the same client
  • HTTP 1.1 keeps connections persistent by default,
    but the protocol itself is still stateless
  • A server can keep track of client IP addresses
  • However, an IP address gives no reliable way of
    identifying repeated visits to the site by the
    same user.
  • Furthermore, ISPs reallocate IP addresses to
    dial-up customers as new users dial in.

  • The server can ask the browser to store a small
    data file (a cookie) on the user's hard disk.
  • The cookie serves only to identify the user.
  • For instance, it could contain a key into a
    database on the server machine.
  • Most browsers nowadays allow you to decide
    whether or not you want cookies on your machine.
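
A minimal Python sketch of the idea that a cookie holds only a key into a server-side database. The cookie name "session", the key value and the sessions dictionary are all hypothetical stand-ins.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side "database": cookie key -> user record.
sessions = {"a1b2c3": {"user": "morgan", "visits": 4}}

# Server side: build the Set-Cookie header sent with a response.
outgoing = SimpleCookie()
outgoing["session"] = "a1b2c3"
set_cookie_header = outgoing.output()
print(set_cookie_header)

# Later request: the browser returns the cookie in its Cookie header,
# and the server uses the value to look the user up.
incoming = SimpleCookie()
incoming.load("session=a1b2c3")
key = incoming["session"].value
record = sessions.get(key)
print(record)
```

The cookie itself carries no personal data here; everything the server knows about the user stays in its own database.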

Introduction to HTML
What is an HTML File?
  • HTML stands for HyperText Markup Language
  • An HTML file is a text file containing small
    markup tags
  • The markup tags tell the Web browser how to
    display the page
  • An HTML file must have an htm or html file
    extension
  • An HTML file can be created using a simple text
    editor

Internet - Services
  • Email: MIME (Multipurpose Internet Mail
    Extensions) for text (text/html), image, video, etc.
  • Telnet, ssh
  • FTP: File Transfer Protocol
  • Gopher
  • IRC: Internet Relay Chat
  • Newsgroups
  • WWW: World Wide Web
  • HTTP (Hypertext Transfer Protocol) uses a
    request-response scheme, i.e. a browser sends a
    request and gets a response from a server. Note
    that the server does not send out anything
    without a request.

Markup languages
  • Suppose we have a document containing only plain
    text
  • We tag certain parts of the document to indicate
    what they are and how they should be formatted
  • This procedure is called marking-up the document
  • Tags are usually paired
  • e.g. <title>My Memoirs</title>
  • A pair of tags plus their content constitute an
    element
  • Un-paired tags are called empty tags

Markup languages
  • Physical vs Semantic markup
  • physical refers to appearance (style) on the page
  • semantic refers to structure and meaning
  • HTML is the HyperText Markup Language
  • HTML is based on SGML (Standard Generalised
    Markup Language) which is more complex
  • HTML has a fixed set of tags; it is constantly
    evolving, but newer versions are backward
    compatible

Markup languages
  • HTML places primary emphasis on structure
  • paragraphs, headings, lists, images, links, ...
  • HTML places secondary emphasis on style (CSS)
  • fonts, colours, ...
  • HTML does not label the meaning of the text (XML)

A basic document
  • Every document should start with the following

  • There are three required elements, defined by the
    tags <html>, <head> and <body>

<html>
<head>
<title>My Home Page</title>
</head>
<body>
<h1>Welcome</h1>
</body>
</html>

Basic structure elements
  • <html> and </html> are the first and last tags
  • The HEAD section
  • must come before the BODY section
  • contains generic information about the document
  • Elements specified in the HEAD section can
    include title, link, script, style
  • The BODY section
  • contains the content of the document (text,
    images etc.)
  • this content is structured by other tags

Block elements
  • Block elements define sections of text, usually
    preceded by a blank line
  • <p></p> - paragraph
  • <h1></h1>...<h6></h6> - headings
  • <pre></pre> - preformatted text (preserves the
    original format)
  • <blockquote></blockquote> - indented text
  • <div></div> - division
  • used to identify a section of the document that
    may be subject to special formatting (for
    example, using stylesheets).

  • Paragraphs
  • force a break between the enclosed text and the
    text surrounding it
  • the tagged region of text may be subject to
    special formatting
  • <p align="center">Here is another paragraph</p>
  • align is an attribute of the paragraph tag
  • center is the value of the align attribute

<p>Here is a piece of text that has been placed
inside a paragraph</p>
<p align="center">Here is another paragraph</p>
  • Six levels of importance: <h1>...<h6>
  • Use headings to divide the document into sections

<html>
<head>
<title>Headings</title>
</head>
<body>
<h2>Chapter 1</h2>
<h3>1. Introduction</h3>
This is the introduction
<h3>2. Next section</h3>
This is the next section
<h4>2.1 A subsection</h4>
This is a subsection
</body>
</html>

Element relationships
  • The elements marked by tags form a hierarchy
  • The root element is html (marked by
    <html>...</html>)
  • It usually has two children: head and body
  • each of these is further subdivided
  • There are rules for which elements can contain
    other elements
  • e.g. headers cannot contain headers
  • see http// for a full list of rules
  • Elements must not overlap each other
  • an element opened inside another element must be
    closed before the outer element is closed
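
The no-overlap rule can be checked with a stack of open elements, as in this Python sketch. It assumes XHTML-style markup in which every non-empty element is closed explicitly; the class name, the function name and the short list of empty tags are illustrative choices, not a complete validator.

```python
from html.parser import HTMLParser

# Tags written without a closing tag (a partial, illustrative list).
EMPTY_TAGS = {"br", "img", "hr", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Keeps open elements on a stack and records closing tags that overlap."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in EMPTY_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()      # closes the most recently opened element
        else:
            self.errors.append(tag)   # closes something that is not innermost


def is_properly_nested(markup):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return not checker.errors and not checker.open_tags


print(is_properly_nested("<p><em>text</em></p>"))   # nested
print(is_properly_nested("<p><em>text</p></em>"))   # overlapping
```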

Inline descriptive elements
  • Descriptive elements affect the appearance of
    text depending on how the text is described
  • <em></em> emphasis, usually with italics
  • <strong></strong> strong, usually with bold
  • <cite></cite> citation, usually in italics
  • <code></code> usually results in a monospaced font

<body>
A <em>fascinating</em> subject that I
<strong>must</strong> understand
</body>

Inline explicit style elements
  • <b></b> boldface
  • <big></big> bigger font than surrounding text
  • <small></small> smaller font than surrounding
    text
  • <i></i> italics
  • <s></s> strikethrough
  • <sub></sub> subscripts
  • <sup></sup> superscripts
  • <span></span> delimits text for stylesheet
    control
  • <div></div> delimits blocks of text for
    stylesheet control

Inline explicit style elements
  • <font> attributes
  • face - name of font (must be installed)
  • "arial", "times", "verdana", "helvetica"
  • size - absolute size (1-7), or relative to
    previous text
  • "2", "5", "7", "+1", "-2"...
  • color - hexadecimal RGB, or a named color
  • "#3399dd", "blue", "red"
  • weight - boldness from 100, 200, ..., 900
  • "100", "300", "900"
  • e.g.

<font face="arial" size="1" color="pink">...</font>

Ordered and Unordered Lists
some normal text
<ol>
<li>apples</li>
<li>oranges</li>
<li>pears</li>
<li>bananas</li>
</ol>
some normal text
<ul>
<li>apples</li>
<li>oranges</li>
<li>pears</li>
<li>bananas</li>
</ul>
  • Comments are delimited by <!-- and -->
  • <!-- this is a comment -->
  • Comments may span multiple lines

<body>
<!-- this is a comment -->
</body>

Special characters
<body>
A <em> &lt; fascinating &gt; </em> subject
that I <strong>m&nbsp;u&nbsp;s&nbsp;t</strong>
understand
</body>
  • Some characters such as <, >, " and & have
    special meanings.
  • To prevent them being interpreted as HTML code,
    they must be written as follows: &lt; &gt; &quot;
    &amp;
  • Extra blank space is normally collapsed in HTML.
    To force a space in your document use &nbsp;
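
When page content is produced by a program, the same replacements can be made automatically. This Python sketch uses the standard html.escape function; the sample text is made up for the example.

```python
from html import escape

text = 'A < fascinating > subject: "HTML" & more'
escaped = escape(text)
print(escaped)

# Non-breaking spaces keep the letters of "must" spaced out on one line.
spaced = "&nbsp;".join("must")
print(spaced)
```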

Links and Images
<body>
The Department of <a href="http://www.doc."> Computing </a> is a very ....
</body>

<img src="mypicture.gif" alt="my picture">
  • The src attribute specifies the file containing
    the image
  • The alt attribute specifies the text to be
    displayed if the image is not viewed

Colour RGB Model
  • #ff0000 (red)
  • #00ff00 (green)
  • #0000ff (blue)
  • #ffff00 (yellow)
  • ...
  • #3395ab (a pastel blue)

<body bgcolor="#994422">
<body text="#994422">
<body background="tileimage.gif">
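
Each pair of hexadecimal digits in a colour value gives the amount of red, green or blue on a scale of 0 to 255. A small Python sketch of the conversion (the function name hex_to_rgb is an illustrative choice):

```python
def hex_to_rgb(colour):
    """Convert 'rrggbb' or '#rrggbb' to a (red, green, blue) tuple of 0-255 values."""
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Take the digits two at a time: red, then green, then blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("ff0000"))   # red
print(hex_to_rgb("ffff00"))   # yellow: full red plus full green
print(hex_to_rgb("#3395ab"))  # the pastel blue from the list
```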
  • Server-based programs may return data to the
    client as a web page
  • Client-side scripts can read input data
  • To validate the data before sending it to the
    server
  • To use in local processing, which may output web
    page content that is displayed on the client

Example applications
  • Questionnaires to provide feedback on a web site
  • E-commerce, to enter name, address, details of
    purchase and credit-card number
  • Request brochures from a company
  • Make a booking for a holiday, cinema etc.
  • Buy a book, CD, etc.
  • Obtain a map giving directions to a shop
  • Run a database query and receive results (an
    important part of e-commerce)

Input types
  • text
  • checkbox
  • radio (buttons)
  • select (options)
  • textarea
  • password
  • button
  • submit
  • reset
  • hidden
  • file
  • image

The method and action attributes
  • The method attribute specifies the way that form
    data is sent to the server program
  • GET appends the data to the URL
  • POST sends the data separately
  • The action attribute specifies a server program
    that processes the form data (often as a URL)

<body>
<form method="POST" action="comments.php">
<h2>Tell us what you think</h2>
<!-- etc -->
</form>
</body>
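
The difference between GET and POST can be seen by encoding some form data by hand. In this Python sketch the field names and values are hypothetical; only comments.php comes from the form above.

```python
from urllib.parse import urlencode

# Hypothetical form data, as name/value pairs.
form_data = {"name": "A B Morgan", "rating": "Good"}
encoded = urlencode(form_data)

# GET: the encoded data is appended to the URL after a question mark.
get_url = "comments.php?" + encoded
print(get_url)

# POST: the URL is unchanged and the same encoded data is sent
# separately, in the body of the request.
post_url = "comments.php"
post_body = encoded.encode("ascii")
print(post_url, post_body)
```

Because GET data ends up in the URL, it is visible in the address bar and in bookmarks, which is one reason forms that carry passwords or long text normally use POST.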
Text, checkbox and radio button
  • The type attribute specifies the type of user
    input
  • The name attribute gives an identifier to the
    input data

<form method="POST" action="comments.php">
<h2>Tell us what you think</h2>
Name <input name="name" type="text" size="20"><br>
Address <input name="address" type="text" size="30">
</form>

How did you hear about this web site?<br>
A friend <input type="checkbox" name="name" value="friend"><br>
Search engine <input type="checkbox" name="name" value="engine"><br>

How did you hear about this web site?<br>
A friend <input type="radio" name="name" value="friend"><br>
Search engine <input type="radio" name="name" value="engine"><br>
<!-- etc -->
The input element (type="submit" and type="reset") and the select element
Thank you<br>
<input type="submit" name="send" value="Send">
<input type="reset" name="clear">

How do you rate this site?<br>
<select name="rating">
<option>Good</option>
<option selected>Bad</option>
<option>Ugly</option>
</select>

<table border="1">
<tr> <th>Name</th> <td>A B Morgan</td> <td>D P Jones</td> </tr>
<tr> <th>Course</th> <td>Fishing</td> <td>Sailing</td> </tr>
<tr> <th>Year</th> <td>8</td> <td>5</td> </tr>
</table>
  • <table> main element
  • <tr> table row
  • <th> table header
  • <td> table data

The align and width attributes
  • The align attribute determines the position of
    the text within a cell
  • The width attribute determines the width of a
    cell relative to the table

<table border="1" align="center">
<tr> <th colspan="2" width="60%">Name</th> <th rowspan="2">Course</th> <th rowspan="2">Year</th> </tr>
<tr> <th>Last</th> </tr>
<tr> <td>Morgan</td> <td>AB</td> <td>Fishing</td> <td align="center">5</td> </tr>
<!-- etc -->
Table attributes
  • align - alignment relative to the page
  • width - in pixels or percentage of page width
  • border - width of border (pixels)
  • cellspacing - separation between cells (pixels)
  • cellpadding - space around data inside cell
  • bgcolor - background colour (inside cells)
  • Furthermore
  • The <caption> element puts a title above the table

Table attributes
<table border="3" align="center" cellspacing="6" cellpadding="6" bgcolor="cyan">
<caption> <h2>Course Data</h2> </caption>
<tr> <th>Name</th> <th>Course</th> <th>Year</th> </tr>
<tr> <td>A B Morgan</td> <td>Fishing</td> <td>5</td> </tr>
<!-- etc -->

Frames and Framesets
  • A frameset partitions a web browser window so
    that multiple web documents can be displayed
  • Example application To maintain a permanently
    visible directory of links within your site,
    while also displaying one or more selected
    documents from the site.

<html>
<head><title>Frames 1</title></head>
<frameset cols="140,*">
<frame name="navF" src="navigation.html">
<frame name="mainF" src="intro.html">
</frameset>
</html>
  • The frameset element replaces the body element
  • frameset has attributes cols or rows, defined in
    terms of pixels, percentage (%) or unspecified (*)
  • this splits the window into two or more columns
    or rows

  • Some browsers cannot process frames. Alternative
    content should be provided using the noframes
    element
<html>
<head><title>Frames 1</title></head>
<frameset cols="140,*">
<frame name="navF" src="navigation.html">
<frame name="mainF" src="intro.html">
<noframes>
<body>
Something here for browsers not supporting frames
</body>
</noframes>
</frameset>
</html>
  • Styles can be defined
  • Inline styles
  • Global styles
  • Stylesheets (Cascading stylesheets)

<h1 style="color: #2255ff; border: ridge">Inline ...</h1>
<head>
<title>Styles</title>
<style>
<!--
h1 { color: red; border: thin groove; text-align: center }
-->
</style>
</head>
<link rel="StyleSheet" type="text/css" href="URL">
  • Simple style rules change the appearance of all
    instances of the associated element
  • A class is a style definition that may be applied
    as and when we choose
  • if we don't want the styles, we don't have to use
    them
  • Simple classes are applied to a single type of
    element
  • Anonymous classes can be applied to any type of
    element
Simple classes
<head>
<style>
<!--
h1.fred { color: #eeebd2; background-color: #d8a29b; border: thin groove #9baab2 }
-->
</style>
</head>
<body>
<h1 class="fred">A Simple Heading</h1>
<p>some text . . . some text</p>
</body>

Anonymous classes
<head>
<style>
<!--
.fred { color: #eeebd2; background-color: #d8a29b; border: thin groove #9baab2 }
-->
</style>
</head>
<body>
<h1 class="fred">A Simple Heading</h1>
<p class="fred">some text . . . some text</p>
</body>

Divisions and spans
  • Rather than applying styles to an element itself,
    we wrap the element in
  • a div element (usually for block elements), or
  • a span element (usually for inline elements)
  • Any required formatting can then be applied to
    the <div> or <span> element.
  • Div and span elements become part of the document
  • In particular, each can have class and id
    attributes

<head>
<style>
<!--
.myclass { color: blue; background: cyan; text-decoration: underline; border: thin groove red }
-->
</style>
</head>
<body>
<div class="myclass">
<h2>A Simple Heading</h2>
<p>some text . . . </p>
</div>
</body>
  • Styles can be applied to blocks of HTML code
    using div

  • spans are similar to divisions

<head>
<style>
<!--
.myclass { color: red; background: cyan; text-decoration: none }
-->
</style>
</head>
<body>
<span class="myclass">
<h2>A Simple Heading</h2>
<p>some text . . . </p>
</span>
</body>
  • By now you should be able to use
  • Tables
  • Frames
  • Stylesheets (CSS)
  • Inline style
  • Embedded style
  • External style

Typical exam question
  • Explain why it is important to separate the
    content from the style.
  • What is CSS?
  • State three ways in which styles can be used, and
    explain the advantages and disadvantages of each

  • Look at the disadvantages of HTML
  • XML
  • Well-formed vs valid XML documents

