Title: The world wide web
1 The world wide web
Chapter 4
2 Learning outcomes
- Explain in general terms how web documents are transferred across the Internet
- Explain what processes are triggered when you click on a hyperlink
- Code web pages using HTML and XHTML with style sheets
- Explain why it is advisable to use XHTML rather than HTML
- Describe some technologies available for dynamic web pages
3 Essential reading
- Joe Casad, Teach Yourself TCP/IP, Ch. 17
- William Buchanan, Mastering the Internet, Ch. 6-8
- Introductory materials on HTML and XHTML, either a textbook such as
- John Shelly, HTML and CSS Explained, or
- http://www.webMonkey.com
- http://www.w3schools.com
4 Additional reading
- William Buchanan, Mastering the Internet, Ch. 9-10
- Andrew Tanenbaum, Computer Networks, Ch. 7.3
- Douglas Comer, Computer Networks and Networking, Ch. 32-33
- Chuck Musciano and Bill Kennedy, HTML and XHTML: The Definitive Guide, for reference
- http://www.pcnetworkadvisor.com
- Mike Lewis, Understanding Javascript, June-July 2000
5 How the web works
- The client-server model
- Client and server operate on machines which are able to communicate through a network
- The server waits for requests from clients
- The server receives a request from a client
- Performs the requested work
- Or looks up the requested data
- And sends a response to the client
- Servers: file servers, web servers, name servers
- Clients: browsers, email clients
6 URL format
- <scheme>://<server-domain-name>/<pathname>
- <scheme>: which protocol to use
- http: in general
- file: tells the client that the document is on the local machine
- ftp: File Transfer Protocol
- <server-domain-name>: identifies the server system
- e.g. www.doc.gold.ac.uk
- <pathname>: tells the server where to find the file
- e.g. http://doc.gold.ac.uk/username/index.html
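The three parts of a URL can be pulled apart with a few lines of code. The sketch below uses Python's standard urllib.parse module; the helper name split_url is only an illustration, not part of the course material.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the (scheme, server domain name, pathname) parts of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path

scheme, server, pathname = split_url("http://doc.gold.ac.uk/username/index.html")
print(scheme)    # http
print(server)    # doc.gold.ac.uk
print(pathname)  # /username/index.html
```

A file URL has the same shape but no server part, which is why no network connection is needed for it.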
7 Web browsers and servers
- A browser is a program that can retrieve files from the World Wide Web and render text, images, or sounds encoded in the files.
- e.g. IE, Netscape, Mozilla
- A web server is an application which waits for client requests, fetches requested documents from disk and transmits them to the client.
- e.g. Apache
8 What happens when you click on a hyperlink?
- The browser determines the URL and extracts the domain name.
- It uses the name server (DNS) to get the IP address.
- It makes a TCP connection to port 80.
- It sends a request for the web page once the server has accepted the connection.
- The server sends the file and releases the TCP connection.
- The client displays the document.
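The steps above can be sketched in code. The example below is a rough illustration rather than how any particular browser works: the function name plan_request is made up, and the network part is left as comments so that the sketch runs without an Internet connection.

```python
from urllib.parse import urlsplit

def plan_request(url):
    """Work out where to connect and what to send for a static page."""
    parts = urlsplit(url)
    host = parts.hostname          # extract the domain name from the URL
    port = parts.port or 80        # web servers listen on port 80 by default
    path = parts.path or "/"
    request = (                    # the text sent once the connection is accepted
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    )
    return host, port, request

host, port, request = plan_request("http://www.doc.gold.ac.uk/index.html")
print(host, port)
print(request)

# With network access, the remaining steps could look like this:
#   import socket
#   ip = socket.gethostbyname(host)              # DNS lookup
#   conn = socket.create_connection((ip, port))  # TCP connection
#   conn.sendall(request.encode("ascii"))        # send the request
#   page = conn.makefile("rb").read()            # read until the server closes
#   conn.close()
```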
9 Other possibilities
- The steps in the previous slide are for displaying a static web page from a remote machine.
- Other possibilities are:
- The page is loaded from the local system
- No TCP connection
- The URL begins with file://...
- The page is dynamically generated by a client-side script
- No TCP connection
- The page is dynamically generated by a server-side script
- The server may carry out other functions
- Secure server
- Check users' identity, to see whether they are authorised to access a particular resource
10 Stateless connection
- Both client and server release the TCP connection after a page has been transferred.
- HTTP 1.0 is stateless
- Connections are not persistent
- There is no indication to the server whether new transactions involve the same client
- HTTP 1.1 supports persistent connections
- One TCP connection can be reused for several requests, but the protocol itself is still stateless
- A server can try to recognise clients by keeping track of their IP addresses
- However, this gives no reliable way of identifying repeated visits to the site by the same user.
- Furthermore, ISPs reallocate IP addresses to dial-up customers as new users dial in.
11 Cookies
- The server requests the browser to store a small data file (a cookie) on the user's hard disk.
- The cookie serves only to identify the user.
- For instance, it could contain a key into a database on the server machine.
- Most browsers nowadays allow you to decide whether or not you want cookies on your machine.
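As a rough illustration of a cookie holding a key into a server-side database, the sketch below uses Python's standard http.cookies module. The cookie name visitor, the key k42 and the USERS table are all invented for the example.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side database, keyed by the value stored in the cookie.
USERS = {"k42": "A B Morgan"}

def make_set_cookie_header(key):
    """Header line a server could send to ask the browser to store a cookie."""
    cookie = SimpleCookie()
    cookie["visitor"] = key
    return cookie.output()

def identify_user(cookie_header):
    """Look up the user from the Cookie header value the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "visitor" not in cookie:
        return None
    return USERS.get(cookie["visitor"].value)

print(make_set_cookie_header("k42"))   # Set-Cookie: visitor=k42
print(identify_user("visitor=k42"))    # A B Morgan
```

Only the key travels between browser and server; the data about the user stays in the server's database.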
12 Introduction to HTML
13 What is an HTML file?
- HTML stands for HyperText Markup Language
- An HTML file is a text file containing small markup tags
- The markup tags tell the web browser how to display the page
- An HTML file must have an htm or html file extension
- An HTML file can be created using a simple text editor
14 Internet - Services
- Email: MIME (Multipurpose Internet Mail Extensions): text (text/html), image, video, etc.
- Telnet, ssh
- FTP: File Transfer Protocol
- Gopher
- IRC: Internet Relay Chat
- Newsgroups
- WWW: World Wide Web
- HTTP (Hypertext Transfer Protocol) uses a question-answer scheme, i.e. a browser sends a request and gets a response from a server. Note that the server does not send out anything without a request.
15 Markup languages
- Suppose we have a document containing only plain text
- We tag certain parts of the document to indicate what they are and how they should be formatted
- This procedure is called marking up the document
- Tags are usually paired
- e.g. <title>My Memoirs</title>
- A pair of tags plus their content constitutes an element
- Unpaired tags are called empty tags
16 Markup languages
- Physical vs semantic markup
- physical refers to appearance (style) on the page
- semantic refers to structure and meaning
- HTML is the HyperText Markup Language
- HTML is based on SGML (Standard Generalised Markup Language), which is more complex
- HTML has a fixed set of tags but is constantly evolving; newer versions are backward compatible
17 Markup languages
- HTML places primary emphasis on structure
- paragraphs, headings, lists, images, links, ...
- HTML places secondary emphasis on style (CSS)
- fonts, colours, ...
- HTML does not label the meaning of the text (XML)
18 A basic document
- Every document should start with the following line
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
- There are three required elements, defined by the tags <html>, <head> and <body>
<html>
  <head>
    <title>My Home Page</title>
  </head>
  <body>
    <h1>Welcome</h1>
  </body>
</html>
19 Basic structure elements
- <html> and </html> are the first and last tags
- The HEAD section
- must come before the BODY section
- contains generic information about the document
- Elements specified in the HEAD section can include
- title, link, script, style
- The BODY section
- contains the content of the document (text, images etc.)
- this content is structured by other tags
20 Block elements
- Block elements define sections of text, usually preceded by a blank line
- <p></p> - paragraph
- <h1></h1>...<h6></h6> - headings
- <pre></pre> - preserve (original format)
- <blockquote></blockquote> - indented text
- <div></div> - division
- used to identify a section of the document that may be subject to special formatting (for example, using stylesheets).
21 Paragraphs
- Paragraphs <p>...</p>
- force a break between the enclosed text and the text surrounding it
- the tagged region of text may be subject to special formatting
- <p align="center">Here is another paragraph</p>
- align is an attribute of the paragraph tag
- center is the value of the align attribute
<p>here is a piece of text that has been placed inside a paragraph</p>
<p align="center">Here is another paragraph</p>
22 Headings
- Six levels of importance <h1>...<h6>
- Use headings to divide the document into sections
<html>
  <head>
    <title>Headings</title>
  </head>
  <body>
    <h2>Chapter 1</h2>
    <h3>1. Introduction</h3>
    This is the introduction
    <h3>2. Next section</h3>
    This is the next section
    <h4>2.1 A subsection</h4>
    This is a subsection
  </body>
</html>
23 Element relationships
- The elements marked by tags form a hierarchy
- The root element is html (marked by <html>...</html>)
- It usually has two children: head and body
- each of these is further subdivided
- There are rules for which elements can contain other elements
- e.g. headings cannot contain headings
- see http://www.w3.org/ for a full list of rules
- Elements must not overlap each other
- we cannot have <h1>...<a ...> ... </h1>...</a>
- we can have <h1>...<a ...> ... </a>...</h1>
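The no-overlap rule can be checked mechanically with a stack: every closing tag must match the most recently opened element. The sketch below uses Python's standard html.parser module. The class and function names are invented, the list of empty tags is deliberately short, and it checks only overlap, not the full containment rules.

```python
from html.parser import HTMLParser

# Empty (unpaired) tags never need a closing tag, so they are not tracked.
EMPTY_TAGS = {"br", "hr", "img", "input", "link", "meta"}

class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of elements that are still open
        self.ok = True

    def handle_starttag(self, tag, attrs):
        if tag not in EMPTY_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY_TAGS:
            return
        # A closing tag must match the most recently opened element.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.ok = False

def is_properly_nested(markup):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.ok and not checker.open_tags

print(is_properly_nested('<h1>x <a href="y">z</a> w</h1>'))  # True
print(is_properly_nested('<h1>x <a href="y">z</h1> w</a>'))  # False
```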
24 Inline descriptive elements
- Descriptive elements affect the appearance of text depending on how the text is described
- <em></em> emphasis, usually with italics
- <strong></strong> strong, usually with bold
- <cite></cite> citation, usually in italics
- <code></code> usually results in monospaced type
<body>
  A <em>fascinating</em> subject that I <strong>must</strong> understand
</body>
25 Inline explicit style elements
- <b></b> boldface
- <big></big> bigger font than surrounding text
- <small></small> smaller font than surrounding text
- <i></i> italics
- <s></s> strikethrough
- <sub></sub> subscripts
- <sup></sup> superscripts
- <span></span> delimits text for stylesheet control
- <div></div> delimits blocks of text for stylesheet control
26 Inline explicit style elements
- <font> attributes
- face - name of font (must be installed)
- "arial", "times", "verdana", "helvetica"
- size - absolute size (1-7), or relative to previous text
- "2", "5", "7", "+1", "-2"...
- color - hexadecimal RGB, or a named color
- "#3399dd", "blue", "red"
- weight - boldness from 100, 200, ..., 900
- "100", "300", "900"
- e.g.
<font face="arial" size="+1" color="pink" weight="300">
27 Ordered and unordered lists
some normal text
<ol>
  <li>apples</li>
  <li>oranges</li>
  <li>pears</li>
  <li>bananas</li>
</ol>
some normal text
<ul>
  <li>apples</li>
  <li>oranges</li>
  <li>pears</li>
  <li>bananas</li>
</ul>
28 Comments
- Comments are delimited by <!-- and -->
- <!-- this is a comment -->
- Comments may span multiple lines
<body>
  <!-- this is a comment -->
</body>
29 Special characters
<body>
  A <em> &lt; fascinating &gt; </em> subject that I <strong>m&nbsp;u&nbsp;s&nbsp;t</strong> understand
</body>
- Some characters such as <, >, " and & have special meanings.
- To prevent them being interpreted as HTML code, they must be written as follows: &lt; &gt; &quot; &amp;
- Blank space is normally ignored in HTML. To include a space in your document use &nbsp;
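When page content is generated by a program, the replacement of special characters is usually automated. The short sketch below uses Python's standard html module; the sample text is made up.

```python
import html

text = 'if a < b & b > c then print "ok"'
escaped = html.escape(text)
print(escaped)
# if a &lt; b &amp; b &gt; c then print &quot;ok&quot;

# Unescaping turns the entities back into the original characters.
print(html.unescape(escaped) == text)  # True
```

The & character is replaced first, so that the ampersands introduced by the other replacements are not escaped a second time.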
30 Links and images
Link
<body>
  The Department of <a href="http://www.doc.gold.ac.uk/index.html">Computing</a> is a very ....
</body>
Images
<img src="mypicture.gif" alt="my picture">
- The src attribute specifies the file containing the image
- The alt attribute specifies the text to be displayed if the image is not viewed
31 Colour: RGB model
- #ff0000 (red)
- #00ff00 (green)
- #0000ff (blue)
- #ffff00 (yellow)
- ...
- #3395ab (a pastel blue)
<body bgcolor="#994422">
<body text="#994422">
<body background="tileimage.gif">
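Each pair of hexadecimal digits in a colour gives the amount of red, green and blue on a scale from 0 to 255. A minimal sketch of the conversion, with an invented helper name hex_to_rgb:

```python
def hex_to_rgb(colour):
    """Convert a colour such as '#ff0000' or 'ff0000' to (red, green, blue)."""
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("#ff0000"))  # (255, 0, 0)
print(hex_to_rgb("#ffff00"))  # (255, 255, 0)
print(hex_to_rgb("#3395ab"))  # (51, 149, 171)
```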
32 Forms
- Server-based programs may return data to the client as a web page
- Client-side scripts can read input data
- To validate the data, prior to sending to the server
- To use in local processing which may output web page content that is displayed on the client
33 Example applications
- Questionnaires to provide feedback on a web site
- E-commerce: to enter name, address, details of purchase and credit-card number
- Request brochures from a company
- Make a booking for a holiday, cinema etc.
- Buy a book, CD, etc.
- Obtain a map giving directions to a shop
- Run a database query and receive results (an important part of e-commerce)
34 Forms
Input types
- text
- checkbox
- radio (buttons)
- select (options)
- textarea
- password
- button
- submit
- reset
- hidden
- file
- image
35 The method and action attributes
- The method attribute specifies the way that form data is sent to the server program
- GET appends the data to the URL
- POST sends the data separately
- The action attribute specifies a server program that processes the form data (often as a URL)
<body>
  <form method="POST" action="comments.php">
    <h2>Tell us what you think</h2>
    <!-- etc -->
  </form>
</body>
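The difference between GET and POST is where the encoded form data travels, not how it is encoded. The sketch below uses Python's standard urllib.parse module; the field names and values are invented for illustration.

```python
from urllib.parse import urlencode

# Hypothetical values typed into a form by the user.
fields = {"name": "A B Morgan", "rating": "Good"}
encoded = urlencode(fields)

get_url = "comments.php?" + encoded   # GET: data appended to the URL
post_body = encoded                   # POST: same data, sent in the request body

print(get_url)    # comments.php?name=A+B+Morgan&rating=Good
print(post_body)  # name=A+B+Morgan&rating=Good
```

Because GET data ends up in the URL, it is visible in the address bar and in bookmarks, which is one reason POST is commonly preferred for longer or more sensitive input.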
36 Text, checkbox and radio button
- The type attribute specifies the type of user input
- The name attribute gives an identifier to the input data
<form method="POST" action="comments.php">
  <h2>Tell us what you think</h2>
  Name <input name="name" type="text" size="20"><br>
  Address <input name="address" type="text" size="30">
</form>
How did you hear about this web site?<br>
A friend <input type="checkbox" name="name" value="friend"><br>
Search engine <input type="checkbox" name="name" value="engine"><br>
How did you hear about this web site?<br>
A friend <input type="radio" name="name" value="friend"><br>
Search engine <input type="radio" name="name" value="engine"><br>
<!-- etc -->
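Checkboxes that share a name can all be ticked at once, so the server may receive several values for one name, whereas a group of radio buttons sends at most one. The sketch below shows how such data might look on the server side, using Python's standard urllib.parse module; the submitted strings are assumed examples of what a browser would send.

```python
from urllib.parse import parse_qs

# Both checkboxes ticked: the same name appears twice in the submitted data.
checkbox_data = parse_qs("name=friend&name=engine")
print(checkbox_data)   # {'name': ['friend', 'engine']}

# Radio buttons: only the selected value is sent.
radio_data = parse_qs("name=engine")
print(radio_data)      # {'name': ['engine']}
```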
37 The input element (type="submit"/"reset") and the select element
Thank you<br>
<input type="submit" name="send" value="Send">
<input type="reset" name="clear" value="Clear"><br>
How do you rate this site?<br>
<select name="rating">
  <option>Good
  <option selected>Bad
  <option>Ugly
</select>
38 Tables
<table border="1">
  <tr> <th>Name</th> <td>A B Morgan</td> <td>D P Jones</td> </tr>
  <tr> <th>Course</th> <td>Fishing</td> <td>Sailing</td> </tr>
  <tr> <th>Year</th> <td>8</td> <td>5</td> </tr>
</table>
- <table> main element
- <tr> table row
- <th> table header
- <td> table data
39 The align and width attributes
- The align attribute determines the position of the text within a cell
- The width attribute determines the width of the cell relative to the table
<table border="1" align="center">
  <tr> <th colspan="2" width="60%">Name</th> <th rowspan="2">Course</th> <th rowspan="2">Year</th> </tr>
  <tr> <th>Last</th> <th>Init.</th> </tr>
  <tr> <td>Morgan</td> <td>AB</td> <td>Fishing</td> <td align="center">5</td> </tr>
  <!-- etc -->
40 Table attributes
- Table attributes
- align - alignment relative to the page
- width - in pixels or percentage of page width
- border - width of border (pixels)
- cellspacing - separation between cells (pixels)
- cellpadding - space around data inside cell (pixels)
- bgcolor - background colour (inside cells)
- Furthermore
- The <caption> element puts a title above the table
41 Table attributes
<table border="3" align="center" cellspacing="6" cellpadding="6" bgcolor="cyan">
  <caption> <h2>Course Data</h2> </caption>
  <tr> <th>Name</th> <th>Course</th> <th>Year</th> </tr>
  <tr> <td>A B Morgan</td> <td>Fishing</td> <td>5</td> </tr>
  <!-- etc -->
42 Frames and framesets
- A frameset partitions a web browser window so that multiple web documents can be displayed simultaneously.
- Example application: to maintain a permanently visible directory of links within your site, while also displaying one or more selected documents from the site.
43 Framesets
<html>
  <head><title>Frames 1</title></head>
  <frameset cols="140,*">
    <frame name="navF" src="navigation.html">
    <frame name="mainF" src="intro.html">
  </frameset>
</html>
- The frameset element replaces the body element
- frameset has attributes cols or rows, defined in terms of pixels, percentage (%) or unspecified (*)
- this splits the window into two or more columns or rows
44 Noframes
- Some browsers cannot process frames. Alternative content should be provided using the noframes element
<html>
  <head><title>Frames 1</title></head>
  <frameset cols="140,*">
    <frame name="navF" src="navigation.html">
    <frame name="mainF" src="intro.html">
  </frameset>
  <noframes>
    <body>
      Something here for browsers not supporting frames
    </body>
  </noframes>
</html>
45 Styles
- Styles can be defined as
- Inline styles
- Global styles
- Stylesheets (Cascading stylesheets)
<h1 style="color:#2255ff; border:ridge">Inline styles</h1>
<head>
  <title>Styles</title>
  <style>
  <!--
  h1 {
    color: red;
    border: thin groove;
    text-align: center;
  }
  -->
  </style>
</head>
<link rel="StyleSheet" type="text/css" href="URL">
46 Classes
- Simple style rules change the appearance of all instances of the associated element
- A class is a style definition that may be applied as and when we choose
- if we don't want the styles, we don't have to use them
- Simple classes are applied to a single type of element
- Anonymous classes can be applied to any type of element
47 Simple classes
<head>
  <style>
  <!--
  h1.fred {
    color: #eeebd2;
    background-color: #d8a29b;
    border: thin groove #9baab2;
  }
  -->
  </style>
</head>
<body>
  <h1 class="fred">A Simple Heading</h1>
  <p>some text . . . some text</p>
</body>
48 Anonymous classes
<head>
  <style>
  <!--
  .fred {
    color: #eeebd2;
    background-color: #d8a29b;
    border: thin groove #9baab2;
  }
  -->
  </style>
</head>
<body>
  <h1 class="fred">A Simple Heading</h1>
  <p class="fred">some text . . . some text</p>
</body>
49 Divisions and spans
- Rather than applying styles to an element itself, we wrap the element in
- a div element (usually for block elements), or
- a span element (usually for inline elements)
- Any required formatting can then be applied to the <div> or <span> element.
- Div and span elements become part of the document
- In particular, each can have class and id attributes
50 Divisions
<head>
  <style>
  <!--
  .myclass {
    color: blue;
    background: cyan;
    text-decoration: underline;
    border: thin groove red;
  }
  -->
  </style>
</head>
<body>
  <div class="myclass">
    <h2>A Simple Heading</h2>
    <p>some text . . . </p>
  </div>
</body>
- Styles can be applied to blocks of HTML code using div
51 Spans
- Spans are similar to divisions
<head>
  <style>
  <!--
  .myclass {
    color: red;
    background: cyan;
    text-decoration: none;
  }
  -->
  </style>
</head>
<body>
  <span class="myclass">
    <h2>A Simple Heading</h2>
    <p>some text . . . </p>
  </span>
</body>
52 Summary
- By now you should be able to use
- Tables
- Frames
- Stylesheets (CSS)
- Inline style
- Embedded style
- External style
53 Typical exam question
- Explain why it is important to separate the content from the style.
- What is CSS?
- State three ways in which styles can be used, and explain the advantages and disadvantages of each one.
54 Next
- Look at the disadvantages of HTML
- XML
- Well-formed vs valid XML documents
- XHTML vs HTML
- DHTML
55 Useful sites
- http://www.w3schools.com/
- http://www.w3schools.com/html
- http://www.w3schools.com/css