Title: CS-422 Dr. Mark L. Hornick
1. Markup languages
- HTML is one type of markup language
- Created in the early 1990s by Tim Berners-Lee as a foundational element of the World Wide Web
- The motivation was to create a means of distributing documents within the research community
- Markup languages were created in the 1960s within the publishing industry as a means of describing page layouts
2. HTML: Hypertext Markup Language
- HTML tags (markup) indicate what kind of structure your HTML document contains
- Valid HTML documents incorporate a well-defined structure
- The rules of HTML specify the order in which the markup tags appear in a valid HTML document
- Check the validity of a website at http://validator.w3.org
3. Basic HTML document structure
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
    <title>CS422</title>
    </head>
    <body>
    <h1>HTML syntax summary</h1>
    <h2>or, all you need to know about HTML</h2>
    <p>This is how you write
    an HTML document.</p>
    <p>The end.</p>
    </body>
    </html>
- <html>: tells the browser the content of the file is HTML
- <head>: tells the browser about your web page
- <meta>: tells the browser, in this case, that we're using the Latin-1 character set
- <title>: tells the browser the title of the page; it appears in the browser title bar
- <body>: encloses everything that appears within the browser window
- <h1>: tells the browser that this is a level-1 header
- <p>: tells the browser that this is a paragraph, or normal block of text
4. There are two main types of markup language
- Procedural markup focuses on the presentation of a document
- Font face and size specification, centering, highlighting, etc.
- Nroff, troff, TeX, and PostScript are procedural markup languages
- Nroff variants were used on early Apple and IBM PCs
- Descriptive markup focuses on the structure of a document
- Title, headings, sections, body, etc.
- SGML and XML are systems for defining descriptive markup languages (more on XML later)
5. So, what kind of markup language is HTML?
6. History of HTML
- Version 1.0 (1993)
- Not officially a standard at this point
- No support for images
- Version 2.0 (1995)
- Graphics support introduced
- Version 3.x (1995-1997)
- Days of Browser Wars and proprietary extensions
- Mixed procedural and descriptive markup
7. History of HTML
- Version 4.0 (12/1997)
- W3C tries to bring order to the chaos
- Separates descriptive structure (HTML) from procedural presentation (CSS)
- Version 4.01 (1999)
- W3C proposals adopted by major browser authors
- XHTML Version 1 (2000)
- HTML is a grammar defined as an SGML application
- XHTML is a grammar defined as an XML application
- Same vocabulary as HTML 4.01
- But stricter syntactical rules
- XHTML Version 1.1, Second Edition (2007)
- Still a working draft (the first edition of XHTML 1.1 became a W3C Recommendation in 2001)
8. HTML 4.01 and XHTML 1.0 are structural markup
- They describe only the structure and content of a document
- Presentation is left to Cascading Style Sheets (CSS)
- The web community is not yet fully on board
9. The impact of XML
- XML grammars are very strict
- Eliminates ambiguity
- No room for sloppiness
- No unclosed tags
- Tag names are case-sensitive, and XHTML requires them to be lowercase
- What does this mean about how a modern browser displays legacy HTML code?
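As a rough illustration of the strictness rules above, the sketch below contrasts a legacy-style fragment with a well-formed XHTML equivalent. The paragraph text is invented for the example, and the `align` attribute is used only because it was common in legacy pages; it is deprecated and accepted only by the Transitional flavors.

```html
<!-- Legacy HTML: uppercase tag names, unclosed <P> and <BR>,
     unquoted attribute value. This is not well-formed XML. -->
<P ALIGN=center>First paragraph
<P>Second paragraph<BR>

<!-- XHTML equivalent: lowercase names, every element closed,
     attribute value quoted. -->
<p align="center">First paragraph</p>
<p>Second paragraph<br /></p>
```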
10. Flavors of HTML and XHTML
- Flavor: the level of adherence to standards that a document announces to a browser
- Strict, in which deprecated elements are forbidden
- Transitional, in which deprecated elements are allowed
- Frameset, like Transitional but where frame-related elements are also allowed
- Flavors are indicated to the browser via a DOCTYPE directive
11. The DOCTYPE directive should be the first line of any (X)HTML document
- It tells the browser the flavor of the (X)HTML that follows in the remainder of the document
- HTML 4.01 Transitional: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
- HTML 4.01 Strict: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
- HTML 4.01 Frameset: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
- XHTML 1.0 Transitional: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- XHTML 1.0 Strict: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- XHTML 1.0 Frameset: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
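To show where the DOCTYPE sits relative to the rest of a page, here is a minimal sketch of a complete document, assuming the XHTML 1.0 Strict flavor. The title and body text are placeholders; any other flavor would change only the first two lines.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- Character encoding declared before the title -->
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
    <title>CS422</title>
  </head>
  <body>
    <p>This document announces itself as XHTML 1.0 Strict.</p>
  </body>
</html>
```

Pasting a page like this into the validator at http://validator.w3.org is one way to check that the declared flavor and the markup agree.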
12. <meta> directive
- In addition to a DOCTYPE directive, valid (X)HTML should also contain a <meta> directive to indicate the character encoding
- <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
- This directive should be nested in the <head> element before the <title> element
13. Tags specify the structural elements of an HTML document
- Syntax: <tag>element content</tag>
- The opening and closing tag names match
- The closing tag name is preceded by a /
- Tags must be lowercase
- Case didn't matter in early HTML
- Other HTML tags can often be nested in the content
- Nested tags should be indented for clarity
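The syntax rules above might look like this in practice. This is only a small sketch with made-up content; the two-space indentation is one common convention, not a requirement of the language.

```html
<body>
  <h1>HTML syntax summary</h1>
  <p>
    This paragraph contains <em>emphasized</em> text
    nested inside the paragraph element.
  </p>
</body>
```

Each closing tag repeats the name of its opening tag with a leading slash, and the inner `em` element is closed before the `p` element that contains it.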
14. Some tags and their usage
- Block element syntax: <tag>element content</tag>
- <html>, <head>, <title>
- <body>, <h1>, <h2>, <h3>, <p>
- <address> - used to provide contact info for the HTML document's author
- <pre> - preserves whitespace (spacing and line breaks)
- <blockquote> - lengthy quotations spanning several paragraphs; <p> can be inserted inside a <blockquote> element
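A short sketch of the block elements listed above, using placeholder text throughout. The e-mail address is hypothetical.

```html
<h2>Block elements</h2>

<pre>
Spacing   and
    line breaks
        are kept exactly as typed.
</pre>

<blockquote>
  <p>First paragraph of a long quotation.</p>
  <p>Second paragraph of the same quotation.</p>
</blockquote>

<address>Contact the author at author@example.com</address>
```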
15. Inline vs. block elements
- Block elements are always displayed as if they had a line break before and after them
- <p> used to designate a paragraph
- <blockquote> used for long quotations
- Inline elements appear within the flow of text on a page
- <q> used for short inline quotations
- <em> indicates emphasized text
- <strong> indicates strongly emphasized text
16. Inline elements
- abbr - abbreviation
- acronym - TLAs and such
- cite - citation
- code - computer code
- dfn - definition
- em - emphasized text
- q - short quote
- strong - strongly emphasized text
- samp - sample (computer) output
- kbd - keyboard entry
- var - variable name
- sub - subscript
- sup - superscript
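To see several of these inline elements in the flow of text, consider the sketch below. The sentences are invented for illustration; only the element names come from the list above.

```html
<p>
  <dfn>Markup</dfn> is text added to a document to describe its structure.
  The <abbr title="World Wide Web Consortium">W3C</abbr> maintains the standards.
  Type <kbd>ls</kbd> and the shell might print <samp>index.html</samp>.
  In <code>x = y + 1</code>, <var>x</var> and <var>y</var> are variables.
  Water is H<sub>2</sub>O, and the area of a square is s<sup>2</sup>.
  This is <em>important</em>, and this is <strong>very important</strong>.
</p>
```

None of these elements forces a line break; a browser renders them as part of the surrounding paragraph.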
17. Element attributes
- Elements may often contain attributes, which provide additional information about the element's structure
- <tag attribute="value"> content </tag>
- An attribute's value must be enclosed in quotes (double quotes by convention)
- <h1 id="title1">A Computer Haiku</h1>
18. Hyperlink element
- <a href="url">link text</a>
- The href attribute indicates the URL of the hyperlink
- More on attributes later
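Attributes and hyperlinks can be combined as in the following sketch. The link text is illustrative; the URL is the validator address mentioned earlier in these notes.

```html
<h1 id="title1">A Computer Haiku</h1>
<p>
  Check this page with the
  <a href="http://validator.w3.org">W3C validator</a>.
</p>
```

The text between the opening and closing `a` tags is what the reader clicks; the `href` value is where the browser goes.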
19. Entity references
- Since some characters, like < and >, are part of the markup, use entity references when you need to include them in your content
- &lt; for <
- &gt; for >
- &quot; for "
- &apos; for '
- &amp; for &
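A brief sketch of the entity references in use, with invented sentences. Note that `&apos;` is defined by XML and XHTML but is not among the named entities of HTML 4.01, so the last line assumes an XHTML document.

```html
<p>Write &lt;p&gt; to show a paragraph tag as literal text.</p>
<p>Research &amp; development needs an escaped ampersand.</p>
<p title="She said &quot;hello&quot;">Quotes inside an attribute value are escaped.</p>
<p>It&apos;s also possible to escape an apostrophe.</p>
```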
20. Empty elements
- Various elements contain no content but instead act like directives
- <br/> - to explicitly break a line
- <img src="url" alt="description"/> - to insert an image
- BTW: src and alt are attributes, not content
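The empty elements above could appear in a page as sketched here. The image path and alternative text are hypothetical.

```html
<p>
  First line<br />
  Second line
</p>
<p>
  <!-- src and alt are attributes; the element has no content or closing tag -->
  <img src="images/logo.png" alt="Course logo" />
</p>
```

The space before the `/>` is optional in XHTML; it is often included so that older HTML browsers handle the tag gracefully.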
21. Leftover presentational elements
- b - bold
- big - larger text
- i - italic
- s or strike - strikethrough
- tt - teletype
- u - underlined
- There is a better way to indicate presentation: CSS
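As a hedged sketch of the CSS alternative, the fragment below shows the same emphasis expressed first with presentational tags and then with structural markup plus a style rule. The class name `deadline` is made up for this example, and in a full page the `style` element would sit inside `head`.

```html
<!-- Presentational markup: the tags themselves dictate appearance -->
<p>Deadline: <u><b>Friday</b></u></p>

<!-- Structural markup plus CSS: appearance is defined in one place -->
<style type="text/css">
  .deadline { font-weight: bold; text-decoration: underline; }
</style>
<p>Deadline: <span class="deadline">Friday</span></p>
```

Changing the look of every deadline on the page then means editing one rule rather than every tag.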
22. HTML comments
- <!-- this is a comment -->
- Don't place a comment
- Inside a tag
- Inside another comment
- Leave a space after the opening <!-- and before the closing -->
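A minimal sketch of comments that follow the guidelines above; the comment text is illustrative.

```html
<!-- A single-line comment, with a space inside each delimiter -->
<p>Visible text.</p> <!-- A comment may follow an element -->

<!--
  A comment may also span several lines,
  as long as it does not contain another comment.
-->
```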